Accurate prediction of amino acid side-chain conformation is an important aspect of protein structure prediction and has application in comparative modeling of proteins, amino acid mutation studies and studies of side-chain re-arrangement in protein-ligand complexes. Contemporary methodologies such as those of Bower, Cohen, and Dunbrack [Bower 1997], Xiang and Honig [Xiang 2001], or Liang and Grishin [Liang 2002] employ a backbone-dependent rotamer library that serves as the set of candidate conformations for side-chains. For given phi-psi angles of the backbone of the residue, the rotamer library contains the rotamers which have been observed in some collection of crystal structures, along with the frequency at which they are observed. To predict the side-chain conformation of a residue, each rotamer is scored based on various characteristics (such as contact surface and volume, electrostatics, desolvation energy) as well as the a priori probability of seeing each rotamer as determined by its frequency in the library.
The central assumption for using such rotamer libraries is that there is enough data in the PDB for each side-chain conformation to attempt statistical determination of conformation preferences lessen the need for detailed energy-based methods (such as force fields). However,
These observations tend to undermine the fundamental assumptions for the use of rotamer libraries. With an infinite training set, every sterically reasonable conformation would be represented and it would be the particular environment of the side-chain that would determine its conformation (not a priori probabilities based upon the backbone alone). In this article we present a methodology for predicting side-chain conformation based upon systematic conformational search of side-chains followed by the application of an environment-based scoring function to rank the conformers. We outline the basic methodology and present some examples of its use.
Rotamers are generated by a systematic conformational search. An estimate is obtained for the change in free energy of solvation between the state where the protein (without rotamer) and the rotamer are in isolation, and the state where they form a complex. This is done by a scoring function that takes into account the changes in the types of surface area contacts. (For example, hydrogen-bonding surfaces inside a hydrophobic area would be one type.) This change in the free energy term is added to the torsional strain energy of the rotamer to compute the deltaG term.
Generation of Rotamers. A systematic conformational search is used to generate side-chain conformations of a particular amino-acid. The systematic conformational search generates side-chain conformations by systematically rotating bonds in a side-chain by discrete increments. The purpose of the search is to generate a collection of reasonable side-chain conformations, which may or may not be at local minima. A generated conformation is rejected if it contains an atom that is involved in a steric clash with another atom of the side-chain or with any atom in its environment (the rest of the protein including solvent and complexed ligands). Two atoms clash if the distance between them is less than 0.6 times the sum of their contact radii (derived from analysis of non-bonded contacts in crystallographic data). Conformations generated by this method may be strained due to bonded interactions as well as some non-bonded strain.
Each rotatable bond is sampled at multiple dihedral angles, and the bond type determines the angles at which the bond is sampled. This is illustrated in figure 1:
| Bond Type | Sampled Dihedral Angles |
| O/S -../td> | {90, -90}, plus or minus 5 degrees |
| sp3 - sp3 | {60, 180, -60}, plus or minus 5 degrees |
| all others | All multiples of 15 degrees |
Figure 1. Sampled dihedral angles
The intent is to sample the main a priori minima along with some internally strained conformations. The sampled dihedral angles were determined by inspection of various rotamer libraries and side-chain conformations derived from crystal structures. Experimentation revealed that these angles (and the 0.6 multiplier on contact radii) produced conformations close to crystal structures in all cases examined.
The strain energy, U, of a conformation is modeled as the torsional strain of the sp3-sp3 rotatable bonds (only) using the AMBER force field parameters. For amino acid side chains this reduces to only two parameters C-C rotatable bonds and C-S rotatable bonds. It is this torsional strain term that is the only model of the a priori rotamer probability; i.e., we take this probability to be proportional to exp(-U/kT).
Classification of Atoms and Surfaces. The rotamers will ultimately be scored by deltaG, the change in free energy of solvation, between the states where the sidechain is in isolation, and when it is in a complex with the environment. We use the term environment to mean the surrounding atoms which interact with the sidechain. Often this will simply be the rest of the protein, but also may include nearby ligands and such.
Each atom in the rotamer and the environment is classified as either: hydrogen-bonding (donor), hydrogen-bonding (acceptor), hydrogen-bonding (both), hydrophobic, or polar. Our method is novel in the sense that our typing does not end at the atom level, but rather, we type each point inside and on the surface of the atom. Each point is said to have three parameters, Chb, Cphob, and Cpol, which may take on values between 0 and 1:
For all points, these three characteristics sum to one, thus a point can be said to be, for example, 0% hydrogen-bonding, 20% hydrophobic, and 80% polar. The method of typing points depends on the atom type to which it belongs and whether or not the atom belongs to a pi-system.
| Hydrogen-bonding | Hydrophobic | |
| hydrogen-bonding characteristic (Chb) | 1 | 0 |
| hydrophobic characteristic (Cphob) | 0 | 1 |
| polar characteristic (Cpol) | 0 | 0 |
Figure 2. Characteristics of points within non-pi-system atoms.
Figure 3. Directional preferences of sp2 atoms
![]() The pi-system of a tyrosine residue. The out-of-plane region prefers hydrophobic (green) contacts, whereas the in-plane region prefers hydrogen-bonding (red) contacts. |
![]() The beta carbon of alanine (non-pi-system atom). The green region indicates the fairly symmetric preference for hydrophobes around the atom. |
Figure 4. Examples of directional preferences
It is possible to give a quantitative description of the distributional preferences of these atoms by recording the distributions of hydrophobic contacts, and that of hydrogen-bonding contacts, with respect to elevation from the sp2 plane. (The acquisition of such statistics is described by Labute [Labute 2001].) These distributions can be well-fit by the following simple distributions:
Figure 5. The white line is the distribution of hydrogen-bonding contacts. The green line is the fit Gamma distribution.
This can be fit with a Gamma distribution with mean mu = 60 degrees and variance sigma = 21 degrees, leading to the following distribution dphob(r):
Figure 6. The white line is the distribution of hydrogen-bonding contacts. The green line is the fit exponential distribution.
This can be fit with an exponential distribution with a mean of 25 degrees, leading to the distribution dhb(r) = e-r/25 / 25.
For any point, the two functions are compared and a preference ratio is computed for the two environments. Specifically, we assign preferences for points at elevation r of dhb(r) / [dhb(r) + dphob(r)] for hydrogen-bonding contacts, and dphob(r) / [dhb(r) + dphob(r)] for hydrophobic contacts. Note that the sum of the two preferences is one.
We then assign characteristics to the points based on the type of the atom to which it belongs.
|
|
Figure 7. Characteristics of points within pi-system atoms.
|
dphob(r) /
[dhb(r) +
dphob(r)]
|
elevation in degrees (r) |
Figure 8. Plot of hydrophobic preference versus elevation
We have now classified all points inside atoms and are able to analyze the contacts between them.
Analyzing Surface Contacts. Two atoms are in contact when their water-accessible surfaces intersect. That is, the contact radii of each atom is expanded by 1.4 angstroms, and any intersection thereof is considered to be a contact, with the exception of 1-2 and 1-3 interactions. Each contact is analyzed by considering each atom surface point and the environment in which it is buried. For each such surface point, its characteristics are multiplied by the characteristics of the atom point with which it forms the contact. These contacts are classified into one of the following seven types based on the two characteristics involved:
This is most easily illustrated by examples. First let us consider the case of two non-pi-system atoms, one hydrogen-bonding and one hydrophobic, shown in figure 9.

With respect to the hydrogen-bonding atom, the point s has
characteristics
As expected, the contact is classified as having characteristic of 1 for
hydrogen-bonding/hydrophobe contact. Since neither atom is polar, we would
expect that there is no polar/hydrophobe contact in figure 9.
To illustrate how the system behaves with pi-system atoms, consider
the contact at the point
s in figure 10, where s is a point on the surface
of a pi-system polar atom which is in contact with a non-pi-system
hydrophobic atom.
The point s, with respect to the polar atom to which it belongs,
has a characteristics
The surface contact types are computed for each point s
on the pi-system atom
surface inside the spherical atom. The point s has
a characteristic based on
its latitude with respect to the center
of the pi-system atom.
Since the hydrophobic atom in our example
is spherical, all points inside it have characteristic
Chb = 0, Cphobic = 1, Cpolar = 0.
The weighted surface area for each type of contact is computed by
integrating the contact characteristics over the surfaces. For example,
the total amount of polar/polar contact is computed by
Note that because the sum of all characteristics
sum to 1, the total contact surface area is the sum of
the amounts of all contact types.
Thus, this scheme is simply a weighting of the
various types of contacts.
Two special cases require additional comment. Firstly, for the case of two
hydrogen-bonding characters interacting, the resulting area is classified
based on whether the interaction is attractive or repulsive as decided by
the atom types.
Secondly, any exposed surface, i.e., not in contact with
another atom, is assumed to be in
contact with solvent, which has fully spherical hydrogen-bonding (both
donor and acceptor) character.
Estimating deltaG.
The Rotamer Explorer estimates deltaG,
the change in free energy of solvation
between two states: protein without sidechain (receptor) and sidechain each
in isolation, and the complex of the two.
For each state, the seven contact types for all contacts between atoms,
and between atoms and solvent, are computed.
This provides seven descriptors, each one being the
change in contact surface area between protein and sidechain each in
isolation, and the complex.
These descriptors were used to model deltaG with a regression fit using
a database of small molecules and their solvation energies. Not all
of the descriptors are well-represented by the data therein; some coefficients
were determined by hand and trial-and-error.
The fit (in kcal/mol per square angstrom of weighted surface area)
was found to be the sum of the following terms:
Although the Rotamer Explorer also
reports the change in hydrophobic surface area (the integral of hydrophobic
character over all exposed surface) and explicit hydrogen bonds, these do
not factor into the deltaG scoring function. They are provided for information
purposes only.
The following images are examples of this method of predicting
conformations of a few long sidechains of PDB protein 1IC6.A, chosen
for its size and high resolution.
In each of the following figures, the native conformation is shown colored
by element. In the left image, the predicted rotamer (the rotamer with the
lowest deltaG) is shown in white. In the right image, all other rotamers
generated by the conformational search are shown.
We have presented the basic methodology of the MOE Rotamer Explorer
along with examples of its use on PDB protein structures.
In general, the Rotamer Explorer
ranks a rotamer close to the native conformation (when ranked by
deltaG) in the top few.
Not only is the generation of rotamers unbiased by a library, but
the scoring method employed by the MOE Rotamer Explorer is also
intuitive. Our scoring function produces a deltaG term from a
regression fit of few descriptors, computed from the surface contacts
and the torsional strain of the rotamer.
Its advantage over several other methods is that instead of classifying each
atom as being either hydrophobic or hydrophilic, it also considers
the directional preferences.
Even with the simplicity of the method, the preliminary results
indicate that our system performs well for the case of a single residue.
Our next step for research will be predicting several side-chain
conformations
simultaneously.
The MOE Rotamer Explorer was written in the
SVL programming language and is available in MOE 2002.03.
(1)(1) + (0)(0) = 1
(0)(0) + (0)(1) = 0

Figure 10. Pi-system polar atom in contact with a
non-pi-system hydrophobic atom
(0.3)(1) + (0.7)(0) = 0.3
![]()
(Cpol (s) w.r.t. atom whose surface
contains s)
(Cpol (s) w.r.t. atom in which s is buried)
ds.
In addition to the change in free energy of solvation, the term for the
torsional strain energy of the rotamer discussed above in the
Generation of Rotamers section was added.
Results
Figure 11. ASP_187
Figure 12. GLN_149
Figure 13. HIS_229
Figure 14. ILE_252
Figure 15. LYS_94
Summary
References
[Bower 2001]
Bower, M., Cohen, F., and Dunbrack, R. Prediction of protein side-chain
rotamers from a backbone-dependent rotamer library: a new homology
modeling tool. Journal of Molecular Biology, 267:1268-1282, 1997.
[Labute 2001]
Labute, P. Receptor Contact Preferences. Journal of the Chemical Computing
Group,
http://www.chemcomp.com/feature/cstat.htm.
[Liang 2002]
Liang, S. and Grishin, N. Side-chain modeling with an optimized
scoring function. Protein Science, 11:322-331, 2002.
[Xiang 2001]
Xiang, Z. and Honig, B. Extending the accuracy limits of prediction
for sidechain conformations. Journal of Molecular Biology, 311:421-430, 2001.