Nearly all biological processes involve the specialized functions of one or more protein molecules. Proteins function to produce other proteins, control all aspects of cellular metabolism, regulate the movement of various molecular and ionic species across membranes, convert and store cellular energy, and carry out many other activities. Essentially all of the information required to initiate, conduct, and regulate each of these functions must be contained in the structure of the protein itself. The previous chapter described the details of primary protein structure. However, proteins do not normally exist as fully extended polypeptide chains but rather as compact, folded structures, and the function of a given protein is rarely if ever dependent only on the amino acid sequence. Instead, the ability of a particular protein to carry out its function in nature is normally determined by its overall three-dimensional shape or conformation. This native, folded structure of the protein is dictated by several factors: (1) interactions with solvent molecules (normally water), (2) the pH and ionic composition of the solvent, and most important, (3) the sequence of the protein. The first two of these effects are intuitively reasonable, but the third, the role of the amino acid sequence, may not be. In ways that are just now beginning to be understood, the primary structure facilitates the development of short-range interactions among adjacent parts of the sequence and also long-range interactions among distant parts of the sequence. Although the resulting overall structure of the complete protein molecule may at first look like a disorganized and random arrangement, it is in nearly all cases a delicate and sophisticated balance of numerous forces that combine to determine the protein’s unique conformation. This chapter considers the details of protein structure and the forces that maintain these structures.
6.1 Forces Influencing Protein Structure
Several different kinds of noncovalent interactions are of vital importance in protein structure. Hydrogen bonds, hydrophobic interactions, electrostatic bonds, and van der Waals forces are all noncovalent in nature, yet are extremely important influences on protein conformations. The stabilization free energies afforded by each of these interactions may be highly dependent on the local environment within the protein, but certain generalizations can still be made.
Hydrogen Bonds
Hydrogen bonds are generally made wherever possible within a given protein structure. In most protein structures that have been examined to date, component atoms of the peptide backbone tend to form hydrogen bonds with one another. Furthermore, side chains capable of forming H bonds are usually located on the protein surface and form such bonds primarily with the water solvent. Although each hydrogen bond may contribute an average of only about 12 kJ/mol in stabilization energy for the protein structure, the number of H bonds formed in the typical protein is very large. For example, in α-helices, the C=O and N-H groups of every residue participate in H bonds. The importance of H bonds in protein structure cannot be overstated.
Hydrophobic Interactions
Hydrophobic “bonds,” or, more accurately, interactions, form because nonpolar side chains of amino acids and other nonpolar solutes prefer to cluster in a nonpolar environment rather than to intercalate in a polar solvent such as water. The forming of hydrophobic bonds minimizes the interaction of nonpolar residues with water and is therefore highly favorable. Such clustering is entropically driven. The side chains of the amino acids in the interior or core of the protein structure are almost exclusively hydrophobic. Polar amino acids are almost never found in the interior of a protein, but the protein surface may consist of both polar and nonpolar residues.
Electrostatic Interactions
Ionic interactions arise
either as electrostatic attractions between opposite charges or repulsions between
like charges. Chapter 4 discusses
the ionization behavior of amino acids. Amino acid side chains can carry positive
charges, as in the case of lysine, arginine, and histidine, or negative charges,
as in aspartate and glutamate. In addition, the NH2-terminal and
COOH-terminal residues of a protein or peptide chain usually exist in ionized
states and carry positive or negative charges, respectively. All of these may
experience electrostatic interactions in a protein structure. Charged residues
are normally located on the protein surface, where they may interact optimally
with the water solvent. It is energetically unfavorable for an ionized residue
to be located in the hydrophobic core of the protein. Electrostatic interactions
between charged groups on a protein surface are often complicated by the presence
of salts in the solution. For example, the ability of a positively charged lysine
to attract a nearby negative glutamate may be weakened by dissolved NaCl (Figure
6.1). The Na+ and Cl- ions are highly mobile, compact
units of charge, compared to the amino acid side chains, and thus compete effectively
for charged sites on the protein. In this manner, electrostatic interactions
among amino acid residues on protein surfaces may be damped out by high concentrations
of salts. Nevertheless, these interactions are important for protein stability.
Figure 6.1 · An electrostatic interaction between the ε-amino group of a lysine and the γ-carboxyl group of a glutamate residue.
Van der Waals Interactions
Both attractive forces and repulsive forces are included in van der Waals interactions. The attractive forces are due primarily to instantaneous dipole–induced dipole interactions that arise because of fluctuations in the electron charge distributions of adjacent nonbonded atoms. Individual van der Waals interactions are weak (with stabilization energies of 1.2 to 4.0 kJ/mol), but many such interactions occur in a typical protein, and, by sheer force of numbers, they can represent a significant contribution to the stability of a protein. Peter Privalov and George Makhatadze have shown that, for pancreatic ribonuclease A, hen egg white lysozyme, horse heart cytochrome c, and sperm whale myoglobin, van der Waals interactions between tightly packed groups in the interior of the protein are a major contribution to protein stability.
6.2 Role of the Amino Acid Sequence in Protein Structure
It can be inferred from
the first section of this chapter that many different forces work together in
a delicate balance to determine the overall three-dimensional structure of a
protein. These forces operate both within the protein structure itself and between
the protein and the water solvent. How, then, does nature dictate the manner
of protein folding to generate the three-dimensional structure that optimizes
and balances these many forces? All of the information necessary for folding
the peptide chain into its “native” structure is contained in the amino acid
sequence of the peptide. This principle was first appreciated by C. B. Anfinsen
and F. White, whose work in the early 1960s dealt with the chemical denaturation
and subsequent renaturation of bovine pancreatic ribonuclease. Ribonuclease
was first denatured with urea and mercaptoethanol, a treatment that cleaved
the four covalent disulfide (S-S) cross-bridges in the protein. Subsequent air
oxidation permitted random formation of disulfide cross-bridges, most of which
were incorrect. Thus, the air-oxidized material showed little enzymatic activity.
However, treatment of these inactive preparations with small amounts of mercaptoethanol
allowed a reshuffling of the disulfide bonds and permitted formation of significant
amounts of active native enzyme. In such experiments, the only road map for
the protein, that is, the only “instructions” it has, are those directed by
its primary structure, the linear sequence of its amino acid residues.
Just how proteins recognize and interpret the information that is stored in the polypeptide
sequence is not yet well understood. It may be assumed that certain loci along
the peptide chain act as nucleation points, which initiate folding processes
that eventually lead to the correct structures. Regardless of how this process
operates, it must take the protein correctly to the final native structure,
without getting trapped in a local energy-minimum state which, although stable,
may be different from the native state itself. A long-range goal of many researchers
in the protein structure field is the prediction of three-dimensional conformation
from the amino acid sequence. As the details of secondary and tertiary structure
are described in this chapter, the complexity and immensity of such a prediction
will be more fully appreciated. This area is perhaps the greatest uncharted
frontier remaining in molecular biology.
6.3 Secondary Structure in Proteins
Any discussion of protein folding and structure must begin with the peptide bond, the fundamental structural unit in all proteins. As we saw in Chapter 5, the resonance structures experienced by a peptide bond constrain the oxygen, carbon, nitrogen, and hydrogen atoms of the peptide group, as well as the adjacent α-carbons, to all lie in a plane. The resonance stabilization energy of this planar structure is approximately 88 kJ/mol, and substantial energy is required to twist the structure about the C-N bond. A twist of θ degrees involves a twist energy of 88 sin²θ kJ/mol.
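To get a feel for how steeply the twist energy rises, the relation 88 sin²θ kJ/mol can be evaluated for a few angles. The following is a minimal sketch; the function and constant names are illustrative and not part of the text.

```python
import math

# Resonance stabilization energy of the planar peptide group, as quoted in the text.
RESONANCE_ENERGY_KJ_PER_MOL = 88.0

def peptide_twist_energy(theta_deg: float) -> float:
    """Energy (kJ/mol) needed to twist the peptide group by theta_deg about the C-N bond."""
    return RESONANCE_ENERGY_KJ_PER_MOL * math.sin(math.radians(theta_deg)) ** 2

for theta in (0, 10, 30, 90):
    print(f"{theta:>3} deg -> {peptide_twist_energy(theta):5.1f} kJ/mol")
```

On this relation a 30° twist already costs 22 kJ/mol, a quarter of the full resonance energy, which is why the peptide group can be treated as essentially planar.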
Consequences of the Amide Plane
The planarity of the peptide bond means that there are only two degrees of freedom per residue for the peptide chain. Rotation is allowed about the bond linking the α-carbon and the carbon of the peptide bond and also about the bond linking the nitrogen of the peptide bond and the adjacent α-carbon. As shown in Figure 6.2, each α-carbon is the joining point for two planes defined by peptide bonds. The angle about the Cα–N bond is denoted by the Greek letter φ (phi) and that about the bond between Cα and the carbonyl carbon is denoted by ψ (psi). For either of these bond angles, a value of 0° corresponds to an orientation with the amide plane bisecting the H–Cα–R (side chain) plane and a cis configuration of the main chain around the rotating bond in question (Figure 6.3).
Figure 6.2 · The amide or peptide bond planes are joined by the tetrahedral bonds of the α-carbon. The rotation parameters are φ and ψ. The conformation shown corresponds to φ = 180° and ψ = 180°. Note that positive values of φ and ψ correspond to clockwise rotation as viewed from Cα. Starting from 0°, a rotation of 180° in the clockwise direction (+180°) is equivalent to a rotation of 180° in the counterclockwise direction (-180°). (Irving Geis)
Figure 6.3 · Many of the possible conformations about an α-carbon between two peptide planes are forbidden because of steric crowding. Several noteworthy examples are shown here. Note: The formal IUPAC-IUB Commission on Biochemical Nomenclature convention for the definition of the torsion angles φ and ψ in a polypeptide chain (Biochemistry 9:3471–3479, 1970) is different from that used here, where the Cα atom serves as the point of reference for both rotations, but the result is the same. (Irving Geis)
In any case, the entire path of the peptide backbone in a protein is known if the φ and ψ rotation angles are all specified. Some values of φ and ψ are not allowed due to steric interference between nonbonded atoms. As shown in Figure 6.4, values of φ = 180° and ψ = 0° are not allowed because of the forbidden overlap of the N-H hydrogens. Similarly, φ = 0° and ψ = 180° are forbidden because of unfavorable overlap between the carbonyl oxygens.
Figure 6.4 · A Ramachandran diagram showing the sterically reasonable values of the angles φ and ψ. The shaded regions indicate particularly favorable values of these angles. Dots in purple indicate actual angles measured for 1000 residues (excluding glycine, for which a wider range of angles is permitted) in eight proteins. The lines running across the diagram (numbered +5 through 2 and -5 through -3) signify the number of amino acid residues per turn of the helix; “+” means right-handed helices; “-” means left-handed helices. (After Richardson, J. S., 1981, Advances in Protein Chemistry 34:167–339.)
G. N. Ramachandran and his coworkers in Madras, India, first showed that it was convenient to plot φ values against ψ values to show the distribution of allowed values in a protein or in a family of proteins. A typical Ramachandran plot is shown in Figure 6.4. Note the clustering of φ and ψ values in a few regions of the plot. Most combinations of φ and ψ are sterically forbidden, and the corresponding regions of the Ramachandran plot are sparsely populated. The combinations that are sterically allowed represent the subclasses of structure described in the remainder of this section.
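The points on a Ramachandran plot come from atomic coordinates: each φ or ψ value is the torsion angle defined by four consecutive backbone atoms. Under the IUPAC-IUB convention, φ for residue i uses C(i-1), N(i), Cα(i), and C(i), and ψ uses N(i), Cα(i), C(i), and N(i+1). The sketch below shows one common way to compute such a signed torsion angle. The function names and the test coordinates are hypothetical and were chosen only to exercise the calculation; they are not taken from any protein.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_deg(p1, p2, p3, p4):
    """Signed torsion angle (degrees) about the p2-p3 bond.

    0 deg is the cis (eclipsed) arrangement and +/-180 deg is trans.
    A positive value means p4 is rotated clockwise from p1 when the
    bond is viewed from p2 toward p3.
    """
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal to the plane through p1, p2, p3
    n2 = _cross(b2, b3)  # normal to the plane through p2, p3, p4
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates, used only to check the sign convention.
p1, p2, p3 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
print(torsion_deg(p1, p2, p3, (1.0, 0.0, 1.0)))   # cis arrangement
print(torsion_deg(p1, p2, p3, (0.0, 1.0, 1.0)))   # quarter turn, clockwise
print(torsion_deg(p1, p2, p3, (-1.0, 0.0, 1.0)))  # trans arrangement
```

Applying such a function to every residue of a structure would give the list of (φ, ψ) pairs that a Ramachandran plot displays.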
The Alpha-Helix
The discussion of hydrogen
bonding in Section 6.1 pointed out that the carbonyl oxygen
and amide hydrogen of the peptide bond could participate in H bonds either with
water molecules in the solvent or with other H-bonding groups in the peptide
chain. In nearly all proteins, the carbonyl oxygens and the amide protons of
many peptide bonds participate in H bonds that link one peptide group to another,
as shown in Figure 6.5.
These structures tend to form in cooperative fashion and involve substantial portions of the peptide chain. Structures resulting from these interactions constitute secondary structure for proteins (see Chapter 5). When a number of hydrogen bonds form between portions of the peptide chain in this manner, two basic types of structures can result: α-helices and β-pleated sheets.
A Deeper Look
Knowing What the Right Hand and Left Hand Are Doing
Certain conventions related to peptide bond angles and the “handedness” of biological structures are useful in any discussion of protein structure. To determine the φ and ψ angles between peptide planes, viewers should imagine themselves at the Cα carbon looking outward and should imagine starting from the φ = 0°, ψ = 0° conformation. From this perspective, positive values of φ correspond to clockwise rotations about the Cα–N bond of the plane that includes the adjacent N–H group. Similarly, positive values of ψ correspond to clockwise rotations about the Cα–C bond of the plane that includes the adjacent C=O group. Biological structures are often said to exhibit “right-hand” or “left-hand” twists. For all such structures, the sense of the twist can be ascertained by holding the structure in front of you and looking along the polymer backbone. If the twist is clockwise as one proceeds outward and through the structure, it is said to be right-handed. If the twist is counterclockwise, it is said to be left-handed.
Evidence for helical structures in proteins was first obtained in the 1930s in studies of fibrous proteins. However, there was little agreement at that time about the exact structure of these helices, primarily because there was also lack of agreement about interatomic distances and bond angles in peptides. In 1951, Linus Pauling, Robert Corey, and their colleagues at the California Institute of Technology summarized a large volume of crystallographic data in a set of dimensions for polypeptide chains. (A summary of data similar to what they reported is shown in Figure 5.2.) With these data in hand, Pauling, Corey, and their colleagues proposed a new model for a helical structure in proteins, which they called the α-helix. The report from Caltech was of particular interest to Max Perutz in Cambridge, England, a crystallographer who was also interested in protein structure. By taking into account a critical but previously ignored feature of the X-ray data, Perutz realized that the α-helix existed in keratin, a protein from hair, and also in several other proteins. Since then, the α-helix has proved to be a fundamentally important peptide structure. Several representations of the α-helix are shown in Figure 6.6. One turn of the helix represents 3.6 amino acid residues. (A single turn of the α-helix involves 13 atoms from the O to the H of the H bond. For this reason, the α-helix is sometimes referred to as the 3.6₁₃ helix.) This is in fact the feature that most confused crystallographers before the Pauling and Corey α-helix model. Crystallographers were so accustomed to finding twofold, threefold, sixfold, and similar integral axes in simpler molecules that the notion of a nonintegral number of units per turn was never taken seriously before Pauling and Corey’s work.
Each amino acid residue extends 1.5 Å (0.15 nm) along the helix axis. With 3.6 residues per turn, this amounts to 3.6 × 1.5 Å or 5.4 Å (0.54 nm) of travel along the helix axis per turn. This is referred to as the translation distance or the pitch of the helix. If one ignores side chains, the helix is about 6 Å in diameter. The side chains, extending outward from the core structure of the helix, are removed from steric interference with the polypeptide backbone. As can be seen in Figure 6.6, each peptide carbonyl is hydrogen bonded to the peptide N-H group four residues farther up the chain. Note that all of the H bonds lie parallel to the helix axis and that all of the carbonyl groups are pointing in one direction along the helix axis while the N-H groups are pointing in the opposite direction. Recall that the entire path of the peptide backbone can be known if the φ and ψ twist angles are specified for each residue. The α-helix is formed if the values of φ are approximately -60° and the values of ψ are in the range of -45 to -50°. Figure 6.7 shows the structures of two proteins that contain α-helical segments. The number of residues involved in a given α-helix varies from helix to helix and from protein to protein. On average, there are about 10 residues per helix. Myoglobin, one of the first proteins in which α-helices were observed, has eight stretches of α-helix that form a box to contain the heme prosthetic group. The structures of the α and β subunits of hemoglobin are strikingly similar, with only a few differences at the C- and N-termini and on the surfaces of the structure that contact or interact with the other subunits of this multisubunit protein.
Figure 6.6 · Four different graphic representations of the α-helix. (a) As it originally appeared in Pauling’s 1960 The Nature of the Chemical Bond. (b) Showing the arrangement of peptide planes in the helix. (c) A space-filling computer graphic presentation. (d) A “ribbon structure” with an inlaid stick figure, showing how the ribbon indicates the path of the polypeptide backbone. (Irving Geis)
Figure 6.7 · The three-dimensional structures of two proteins that contain substantial amounts of α-helix in their structures. The helices are represented by the regularly coiled sections of the ribbon drawings. Myohemerythrin is the oxygen-carrying protein in certain invertebrates, including Sipunculids, a phylum of marine worm. (Jane Richardson)
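The helix dimensions quoted above (3.6 residues per turn and a rise of 1.5 Å per residue) are enough to estimate the pitch of the α-helix and the length of any helical segment. The sketch below is a rough illustration for an ideal, undistorted α-helix; the function names are illustrative.

```python
# Ideal alpha-helix parameters as quoted in the text.
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE_ANGSTROM = 1.5

def helix_pitch_angstrom() -> float:
    """Travel along the helix axis per complete turn."""
    return RESIDUES_PER_TURN * RISE_PER_RESIDUE_ANGSTROM

def helix_length_angstrom(n_residues: int) -> float:
    """Approximate length along the axis of a helix with n_residues residues."""
    return n_residues * RISE_PER_RESIDUE_ANGSTROM

def helix_turns(n_residues: int) -> float:
    """Number of turns made by a helix with n_residues residues."""
    return n_residues / RESIDUES_PER_TURN

n = 10  # the average helix length mentioned in the text
print(f"pitch: {helix_pitch_angstrom():.1f} A")
print(f"{n}-residue helix: {helix_length_angstrom(n):.1f} A, {helix_turns(n):.2f} turns")
```

For the average helix of about 10 residues this gives a segment roughly 15 Å long that makes a little under three turns.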
As shown in Figure 6.6, all of the hydrogen bonds point in the
same direction along the α-helix axis. Each peptide
bond possesses a dipole moment that arises from the polarities of the N-H and
C=O groups, and, because these groups are all aligned along the helix axis,
the helix itself has a substantial dipole moment, with a partial positive charge
at the N-terminus and a partial negative charge at the C-terminus (Figure 6.8).
Negatively charged ligands (e.g., phosphates) frequently bind to proteins near
the N-terminus of an α-helix. By contrast, positively
charged ligands are only rarely found to bind near the C-terminus of an α-helix.
Figure 6.8 · The arrangement of N–H and C=O groups (each with an individual dipole moment) along the helix axis creates a large net dipole for the helix. Numbers indicate fractional charges on respective atoms.
In a typical α-helix of 12 (or n) residues, there are 8
(or n - 4) hydrogen bonds. As shown in Figure 6.9, the first 4 amide
hydrogens and the last 4 carbonyl oxygens cannot participate in helix H-bonds.
Also, nonpolar residues situated near the helix termini can be exposed to solvent.
Proteins frequently compensate for these problems by helix capping—providing
H-bond partners for the otherwise bare N-H and C=O groups and folding other
parts of the protein to foster hydrophobic contacts with exposed nonpolar residues
at the helix termini.
Figure 6.9 · Four N–H groups at the N-terminal end of an α-helix and four C=O groups at the C-terminal end cannot participate in hydrogen bonding. The formation of H-bonds with other nearby donor and acceptor groups is referred to as helix capping. Capping may also involve appropriate hydrophobic interactions that accommodate nonpolar side chains at the ends of helical segments.
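The count of n - 4 hydrogen bonds follows directly from the pairing rule in which the C=O of residue i bonds to the N-H of residue i + 4. A short sketch that enumerates the pairs makes the unpaired groups at each end explicit. Residues are numbered from 1 at the N-terminus, and the function names are illustrative.

```python
def helix_hbonds(n_residues: int, span: int = 4):
    """List (acceptor, donor) residue pairs: C=O of residue i to N-H of residue i + span."""
    return [(i, i + span) for i in range(1, n_residues - span + 1)]

def unpaired_groups(n_residues: int, span: int = 4):
    """Residues whose N-H (first list) or C=O (second list) has no partner within the helix."""
    bonds = helix_hbonds(n_residues, span)
    acceptors = {acceptor for acceptor, _ in bonds}
    donors = {donor for _, donor in bonds}
    residues = range(1, n_residues + 1)
    free_nh = [r for r in residues if r not in donors]
    free_co = [r for r in residues if r not in acceptors]
    return free_nh, free_co

bonds = helix_hbonds(12)
free_nh, free_co = unpaired_groups(12)
print(len(bonds), "H bonds:", bonds)
print("N-H groups without partners:", free_nh)
print("C=O groups without partners:", free_co)
```

For the 12-residue helix of the text, the unpaired N-H groups belong to the first four residues and the unpaired C=O groups to the last four, which are the groups that helix capping must satisfy.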
Careful studies of the polyamino acids, polymers in which all the amino acids are identical, have shown that certain amino acids tend to occur in α-helices, whereas others are less likely to be found in them. Polyleucine and polyalanine, for example, readily form α-helical structures. In contrast, polyaspartic acid and polyglutamic acid, which are highly negatively charged at pH 7.0, form only random structures because of strong charge repulsion between the R groups along the peptide chain. At pH 1.5 to 2.5, however, where the side chains are protonated and thus uncharged, these latter species spontaneously form α-helical structures. In similar fashion, polylysine is a random coil at pH values below about 11, where repulsion of positive charges prevents helix formation. At pH 12, where polylysine is a neutral peptide chain, it readily forms an α-helix.
The tendencies of the amino acids to stabilize or destabilize α-helices
are different in typical proteins than in polyamino acids. The occurrence of
the common amino acids in helices is summarized in Table 6.1. Notably, proline
(and hydroxyproline) act as helix breakers due to their cyclic structure, which fixes the value of the φ angle.
Other Helical Structures
There are several other far less common types of helices found in proteins. The most common of these is the 3₁₀ helix, which contains 3.0 residues per turn (with 10 atoms in the ring formed by making the hydrogen bond three residues up the chain). It normally extends over shorter stretches of sequence than the α-helix. Other helical structures include the 2₇ ribbon and the π-helix, which has 4.4 residues and 16 atoms per turn and is thus called the 4.4₁₆ helix.
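The helix names encode two numbers: residues per turn and the number of atoms in the ring closed by each hydrogen bond. Counting atoms around that ring (O, C, then N, Cα, C for each intervening residue, then N and H) gives 3k + 1 atoms when the bond spans k residues. The sketch below tabulates this for three helices. The class and field names are illustrative, and the span of five residues for the π-helix is an assumption not stated in the text, although it reproduces the 16-atom ring quoted above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HelixType:
    name: str
    residues_per_turn: float
    hbond_span: int  # C=O of residue i bonds to N-H of residue i + hbond_span

    @property
    def ring_atoms(self) -> int:
        """Atoms in the H-bonded ring, counted from the O to the H."""
        return 3 * self.hbond_span + 1

    @property
    def label(self) -> str:
        """Conventional label: residues per turn, then ring size."""
        return f"{self.residues_per_turn:g}_{self.ring_atoms}"

HELICES = [
    HelixType("3-10 helix", 3.0, 3),
    HelixType("alpha-helix", 3.6, 4),
    HelixType("pi-helix", 4.4, 5),  # span of 5 is an assumption, see lead-in
]

for helix in HELICES:
    print(f"{helix.name:12s} {helix.label:8s} ring of {helix.ring_atoms} atoms")
```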
The Beta-Pleated Sheet
Another type of structure commonly observed in proteins also forms because of local, cooperative formation of hydrogen bonds. That is the pleated sheet, or β-structure, often called the β-pleated sheet. This structure was also first postulated by Pauling and Corey in 1951 and has now been observed in many natural proteins. A β-pleated sheet can be visualized by laying thin, pleated strips of paper side by side to make a “pleated sheet” of paper (Figure 6.10). Each strip of paper can then be pictured as a single peptide strand in which the peptide backbone makes a zigzag pattern along the strip, with the α-carbons lying at the folds of the pleats. The pleated sheet can exist in both parallel and antiparallel forms. In the parallel β-pleated sheet, adjacent chains run in the same direction; in the antiparallel β-pleated sheet, adjacent chains run in opposite directions (Figure 6.11).
Figure 6.10 · A “pleated sheet” of paper with an antiparallel β-sheet drawn on it. (Irving Geis)
Figure 6.11 · The arrangement of hydrogen bonds in (a) parallel and (b) antiparallel β-pleated sheets.
Parallel β-sheets tend to be more regular than antiparallel
β-sheets. The range of φ
and ψ angles for the peptide bonds in parallel
sheets is much smaller than that for antiparallel sheets. Parallel sheets are
typically large structures; those composed of fewer than five strands are rare.
Antiparallel sheets, however, may consist of as few as two strands. Parallel
sheets characteristically distribute hydrophobic side chains on both sides of
the sheet, while antiparallel sheets are usually arranged with all their hydrophobic
residues on one side of the sheet. This requires an alternation of hydrophilic
and hydrophobic residues in the primary structure of peptides involved in antiparallel
β-sheets because alternate side chains project to
the same side of the sheet (see Figure 6.10).
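Because alternate side chains project to the same side of a sheet, the two faces of a strand can be read off by taking every other residue of its sequence. The sketch below checks whether a strand places all of its hydrophobic residues on one face. The hydrophobic residue set and both example sequences are assumptions made for illustration; they are not taken from the text or from any particular protein.

```python
# One-letter codes treated as hydrophobic here: an illustrative assumption,
# since classification schemes differ.
HYDROPHOBIC = frozenset("AVLIMFW")

def strand_faces(strand: str):
    """Split a strand into the residues pointing to each side of the sheet."""
    return strand[0::2], strand[1::2]

def has_one_hydrophobic_face(strand: str) -> bool:
    """True if one face is entirely hydrophobic and the other entirely hydrophilic."""
    face_a, face_b = strand_faces(strand)
    a = [residue in HYDROPHOBIC for residue in face_a]
    b = [residue in HYDROPHOBIC for residue in face_b]
    return (all(a) and not any(b)) or (all(b) and not any(a))

# Hypothetical strands for demonstration only.
for strand in ("VTVSIKLE", "VVLLTTSS"):
    print(strand, strand_faces(strand), has_one_hydrophobic_face(strand))
```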
Antiparallel
pleated sheets are the fundamental structure found in silk, with the polypeptide
chains forming the sheets running parallel to the silk fibers. The silk fibers
thus formed have properties consistent with those of the β-sheets
that form them. They are quite flexible but cannot be stretched or extended
to any appreciable degree. Antiparallel structures are also observed in many
other proteins, including immunoglobulin G, superoxide dismutase from bovine
erythrocytes, and concanavalin A. Many proteins, including carbonic
anhydrase, egg lysozyme, and glyceraldehyde phosphate dehydrogenase, possess
both α-helices and β-pleated
sheet structures within a single polypeptide chain.
The Beta-Turn
Most proteins are globular structures. The polypeptide chain must therefore possess the capacity to bend, turn, and reorient itself to produce the required compact, globular structures. A simple structure observed in many proteins is the β-turn (also known as the tight turn or β-bend), in which the peptide chain forms a tight loop with the carbonyl oxygen of one residue hydrogen-bonded with the amide proton of the residue three positions down the chain. This H bond makes the β-turn a relatively stable structure. As shown in Figure 6.12, the β-turn allows the protein to reverse the direction of its peptide chain. This figure shows the two major types of β-turns, but a number of less common types are also found in protein structures. Certain amino acids, such as proline and glycine, occur frequently in β-turn sequences, and the particular conformation of the β-turn sequence depends to some extent on the amino acids composing it. Due to the absence of a side chain, glycine is sterically the most adaptable of the amino acids, and it accommodates conveniently to other steric constraints in the β-turn. Proline, however, has a cyclic structure and a fixed φ angle, so, to some extent, it forces the formation of a β-turn, and in many cases this facilitates the turning of a polypeptide chain upon itself. Such bends promote formation of antiparallel β-pleated sheets.
Figure 6.12 · The structures of two kinds of β-turns (also called tight turns or β-bends). (Irving Geis)
The Beta-Bulge
One final secondary structure, the β-bulge, is a small piece of nonrepetitive structure that can occur by itself, but most often occurs as an irregularity in antiparallel β-structures. A β-bulge occurs between two normal β-structure hydrogen bonds and comprises two residues on one strand and one residue on the opposite strand. Figure 6.13 illustrates typical β-bulges. The extra residue on the longer side, which causes additional backbone length, is accommodated partially by creating a bulge in the longer strand and partially by forcing a slight bend in the β-sheet. Bulges thus cause changes in the direction of the polypeptide chain, but to a lesser degree than tight turns do. Over 100 examples of β-bulges are known in protein structures.
Figure 6.13 · Three different kinds of β-bulge structures involving a pair of adjacent polypeptide chains. (Adapted from Richardson, J. S., 1981. Advances in Protein Chemistry 34:167–339.)
The
secondary structures we have described here are all found commonly in proteins
in nature. In fact, it is hard to find proteins that do not contain one or more
of these structures. The energetic (mostly H-bond) stabilization afforded by
α-helices, β-pleated sheets,
and β-turns is important to proteins, and they seize
the opportunity to form such structures wherever possible.
6.4 Protein Folding and Tertiary Structure
The folding of a single
polypeptide chain in three-dimensional space is referred to as its tertiary
structure. As discussed in Section 6.2, all of the information
needed to fold the protein into its native tertiary structure is contained within
the primary structure of the peptide chain itself. With this in mind, it was
disappointing to the biochemists of the 1950s when the early protein structures
did not reveal the governing principles in any particular detail. It soon became
apparent that the proteins knew how they were supposed to fold into tertiary
shapes, even if the biochemists did not. Vigorous work in many laboratories
has slowly brought important principles to light.
First,
secondary structures—helices and sheets—form whenever possible as a consequence
of the formation of large numbers of hydrogen bonds. Second, α-helices
and β-sheets often associate and pack close together
in the protein. No protein is stable as a single-layer structure, for reasons
that become apparent later. There are a few common methods for such packing
to occur. Third, because the peptide segments between secondary structures in
the protein tend to be short and direct, the peptide does not execute complicated
twists and knots as it moves from one region of a secondary structure to another.
A consequence of these three principles is that protein chains are usually folded
so that the secondary structures are arranged in one of a few common patterns.
For this reason, there are families of proteins that have similar tertiary structure,
with little apparent evolutionary or functional relationship among them. Finally,
proteins generally fold so as to form the most stable structures possible. The
stability of most proteins arises from (1) the formation of large numbers of
intramolecular hydrogen bonds and (2) the reduction in the surface area accessible
to solvent that occurs upon folding.
Fibrous Proteins
In Chapter 5, we saw that proteins can be grouped into three large classes based on their structure and solubility: fibrous proteins, globular proteins, and membrane proteins. Fibrous proteins contain polypeptide chains organized approximately parallel along a single axis, producing long fibers or large sheets. Such proteins tend to be mechanically strong and resistant to solubilization in water and dilute salt solutions. Fibrous proteins often play a structural role in nature (see Chapter 5).
α-Keratin
As their name suggests, the structure of the α-keratins is dominated by α-helical segments of polypeptide. The amino acid sequence of α-keratin subunits is composed of central α-helix-rich rod domains about 311 to 314 residues in length, flanked by nonhelical N- and C-terminal domains of varying size and composition (Figure 6.14a). The structure of the central rod domain of a typical α-keratin is shown in Figure 6.14b. It consists of four helical strands arranged as twisted pairs of two-stranded coiled coils. X-ray diffraction patterns show that these structures resemble α-helices, but with a pitch of 0.51 nm rather than the expected 0.54 nm. This is consistent with a tilt of the helix relative to the long axis of the fiber, as in the two-stranded “rope” in Figure 6.14.
Figure 6.14 · (a) Both type I and type II α-keratin molecules have sequences consisting of long, central rod domains with terminal cap domains. The numbers of amino acid residues in each domain are indicated. Asterisks denote domains of variable length. (b) The rod domains form coiled coils consisting of intertwined right-handed α-helices. These coiled coils then wind around each other in a left-handed twist. Keratin filaments consist of twisted protofibrils (each a bundle of four coiled coils). (Adapted from Steinert, P., and Parry, D., 1985. Annual Review of Cell Biology 1:41–65; and Cohlberg, J., 1993. Trends in Biochemical Sciences 18:360–362.)
The primary structure of the central rod segments of α-keratin consists of quasi-repeating seven-residue segments of the form (a-b-c-d-e-f-g)ₙ. These units are not true repeats, but residues a and d are usually nonpolar amino acids. In α-helices, with 3.6 residues per turn, these nonpolar residues are arranged in an inclined row or stripe that twists around the helix axis. These nonpolar residues would make the helix highly unstable if they were exposed to solvent, but the association of hydrophobic strips on two coiled coils to form the two-stranded rope effectively buries the hydrophobic residues and forms a highly stable structure (Figure 6.14). The helices clearly sacrifice some stability in assuming this twisted conformation, but they gain stabilization energy from the packing of side chains between the helices. In other forms of keratin, covalent disulfide bonds form between cysteine residues of adjacent molecules, making the overall structure rigid, inextensible, and insoluble, which are important properties for structures such as claws, fingernails, hair, and horns in animals. How and where these disulfides form determines the amount of curling in hair and wool fibers. When a hairstylist creates a permanent wave (simply called a “permanent”) in a hair salon, disulfides in the hair are first reduced and cleaved, then reorganized and reoxidized to change the degree of curl or wave. In contrast, a “set” that is created by wetting the hair, setting it with curlers, and then drying it represents merely a rearrangement of the hydrogen bonds between helices and between fibers. (On humid or rainy days, the hydrogen bonds in curled hair may rearrange, and the hair becomes “frizzy.”)
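The inclined stripe of nonpolar residues can be seen by placing each residue of the heptad repeat on a helical wheel. With 3.6 residues per turn, each residue is rotated 100° from the one before it. The sketch below lists the angular positions of the a and d residues over three heptads. It assumes an ideal, straight α-helix and ignores the supercoiling of the real coiled coil; the function names are illustrative.

```python
DEGREES_PER_RESIDUE = 100.0  # 360 degrees / 3.6 residues per turn
HEPTAD = "abcdefg"

def wheel_angle(index: int) -> float:
    """Angular position (0-360 degrees) of residue `index` around the helix axis."""
    return (index * DEGREES_PER_RESIDUE) % 360.0

def nonpolar_stripe(n_residues: int):
    """Return (index, heptad_letter, angle) for every a and d position."""
    return [(i, HEPTAD[i % 7], wheel_angle(i))
            for i in range(n_residues)
            if HEPTAD[i % 7] in "ad"]

for index, letter, angle in nonpolar_stripe(21):
    print(f"residue {index:2d}  position {letter}  {angle:5.1f} deg")
```

Seven residues turn through 700°, which is 20° short of two full turns, so successive a residues (and successive d residues) stay on the same face of the helix but drift by 20° per heptad. That slow drift is the inclined stripe described above.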
Fibroin and β-Keratin: β-Sheet Proteins
The fibroin proteins found in silk fibers represent another type of fibrous protein. These are composed of stacked antiparallel β-sheets, as shown in Figure 6.15. In the polypeptide sequence of silk proteins, there are large stretches in which every other residue is a glycine. As previously mentioned, the residues of a β-sheet extend alternately above and below the plane of the sheet. As a result, the glycines all end up on one side of the sheet and the other residues (mainly alanines and serines) compose the opposite surface of the sheet. Pairs of β-sheets can then pack snugly together (glycine surface to glycine surface or alanine-serine surface to alanine-serine surface). The β-keratins found in bird feathers are also made up of stacked β-sheets.
Figure 6.15 · Silk fibroin consists of a unique stacked array of β-sheets. The primary structure of fibroin molecules consists of long stretches of alternating glycine and alanine or serine residues. When the sheets stack, the more bulky alanine and serine residues on one side of a sheet interdigitate with similar residues on an adjoining sheet. Glycine hydrogens on the alternating faces interdigitate in a similar manner, but with a smaller intersheet spacing. (Irving Geis)
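The way an alternating sequence sorts its residues onto the two faces of a β-sheet can be shown with a short sketch. The strand below is an illustrative sequence chosen here (glycine at every other position, alanine or serine in between), not one quoted from the text, and the function name is hypothetical.

```python
# Residues of a beta-strand point alternately above and below the sheet,
# so even-numbered and odd-numbered positions end up on opposite faces.
def sheet_faces(strand):
    """Split a strand (one-letter codes) into the residues on each face."""
    return strand[0::2], strand[1::2]

# Illustrative strand: glycine at every other position (an assumption
# for this sketch, not a sequence taken from the chapter).
strand = "GAGAGSGAGAGS"
face_one, face_two = sheet_faces(strand)
print("face 1:", face_one)
print("face 2:", face_two)
```

With glycine at every other position, one face is entirely glycine and the other carries the alanine and serine side chains, which is what lets like surfaces pack against each other.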
Collagen: A Triple Helix
Collagen is a
rigid, inextensible fibrous protein that is a principal constituent of connective
tissue in animals, including tendons, cartilage, bones, teeth, skin, and blood
vessels. The high tensile strength of collagen fibers in these structures makes
possible the various animal activities such as running and jumping that put
severe stresses on joints and skeleton. Broken bones and tendon and cartilage
injuries to knees, elbows, and other joints involve tears or hyperextensions
of the collagen matrix in these tissues.
The
basic structural unit of collagen is tropocollagen, which has a molecular
weight of 285,000 and consists of three intertwined polypeptide chains, each
about 1000 amino acids in length. Tropocollagen molecules are about 300 nm long
and only about 1.4 nm in diameter. Several kinds of collagen have been identified.
Type I collagen, which is the most common, consists of two identical
peptide chains designated α1(I) and one different
chain designated α2(I). Type I collagen predominates
in bones, tendons, and skin. Type II collagen, found in cartilage, and
type III collagen, found in blood vessels, consist of three identical polypeptide
chains.
Collagen
has an amino acid composition that is unique and is crucial to its three-dimensional
structure and its characteristic physical properties. Nearly one residue out
of three is a glycine, and the proline content is also unusually high. Three
unusual modified amino acids are also found in collagen: 4-hydroxyproline (Hyp),
3-hydroxyproline, and 5-hydroxylysine (Hyl) (Figure 6.16).
Proline and Hyp together compose up to 30% of the residues of collagen.
Interestingly, these three amino acids are formed from normal proline and lysine
after the collagen polypeptides are synthesized. The modifications are
effected by two enzymes: prolyl hydroxylase and lysyl hydroxylase.
The prolyl hydroxylase reaction (Figure 6.17) requires molecular oxygen,
α-ketoglutarate, and ascorbic acid (vitamin C) and
is activated by Fe²⁺. The hydroxylation of lysine is similar. These
processes are referred to as posttranslational modifications because
they occur after genetic information from DNA has been translated into newly
formed protein.
Figure 6.16 · The hydroxylated residues typically found in collagen.
Figure 6.17 · Hydroxylation of proline residues is catalyzed by prolyl hydroxylase. The reaction requires α-ketoglutarate and ascorbic acid (vitamin C).
Because of their high content of glycine, proline, and hydroxyproline, collagen fibers are incapable of forming traditional structures such as α-helices and β-sheets. Instead, collagen polypeptides intertwine to form a unique triple helix, with each of the three strands arranged in a helical fashion (Figure 6.18). Compared to the α-helix, the collagen helix is much more extended, with a rise per residue along the triple helix axis of 2.9 Å, compared to 1.5 Å for the α-helix. There are about 3.3 residues per turn of each of these helices. The triple helix is a structure that forms to accommodate the unique composition and sequence of collagen. Long stretches of the polypeptide sequence are repeats of a Gly-x-y motif, where x is frequently Pro and y is frequently Pro or Hyp. In the triple helix, every third residue faces or contacts the crowded center of the structure. This area is so crowded that only Gly can fit, and thus every third residue must be a Gly (as observed). Moreover, the triple helix is a staggered structure, such that Gly residues from the three strands stack along the center of the triple helix and the Gly from one strand lies adjacent to an x residue from the second strand and to a y from the third. This allows the N-H of each Gly residue to hydrogen bond with the C=O of the adjacent x residue. The triple helix structure is further stabilized and strengthened by the formation of interchain H bonds involving hydroxyproline.
Figure 6.18 · Poly(Gly-Pro-Pro), a collagen-like right-handed triple helix composed of three left-handed helical chains. (Adapted from Miller, M. H., and Scheraga, H. A., 1976, Calculation of the structures of collagen models. Role of interchain interactions in determining the triple-helical coiled-coil conformation. I. Poly(glycyl-prolyl-prolyl). Journal of Polymer Science Symposium 54:171–200.)
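The figures quoted above can be checked with a little arithmetic. The sketch below uses only the values given in the text (2.9 Å and 1.5 Å rise per residue, about 3.3 and 3.6 residues per turn, chains of about 1000 residues); the function names and the one-letter test sequences are illustrative assumptions.

```python
# Helix geometry from the values quoted in the text.
def helix_length_nm(n_residues, rise_angstrom):
    """Axial length of a helical chain in nm (10 angstroms per nm)."""
    return n_residues * rise_angstrom / 10.0

def pitch_angstrom(residues_per_turn, rise_angstrom):
    """Axial distance covered by one full turn of the helix."""
    return residues_per_turn * rise_angstrom

def is_gly_x_y(sequence):
    """True if every third residue, starting with the first, is glycine
    (one-letter code 'G'), as required for packing at the crowded center."""
    return len(sequence) > 0 and all(
        sequence[i] == "G" for i in range(0, len(sequence), 3)
    )

# About 1000 residues at 2.9 angstroms each gives roughly 290 nm, close to
# the quoted tropocollagen length of about 300 nm.
print(helix_length_nm(1000, 2.9))
# The same chain as an alpha-helix (1.5 angstroms per residue) would be
# only about 150 nm long.
print(helix_length_nm(1000, 1.5))
print(pitch_angstrom(3.3, 2.9), pitch_angstrom(3.6, 1.5))
print(is_gly_x_y("GPPGPPGPP"), is_gly_x_y("GPPAPPGPP"))
```

The comparison makes the "much more extended" description concrete: per residue, the collagen chain covers nearly twice the axial distance of an α-helix.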
Collagen types I, II, and III form strong, organized fibrils, consisting of staggered arrays of tropocollagen molecules (Figure 6.19). The periodic arrangement of triple helices in a head-to-tail fashion results in banded patterns in electron micrographs. The banding pattern typically has a periodicity (repeat distance) of 68 nm. Because collagen triple helices are 300 nm long, 40-nm gaps occur between adjacent collagen molecules in a row along the long axis of the fibrils and the pattern repeats every five rows (5 × 68 nm = 340 nm). The 40-nm gaps are referred to as hole regions, and they are important in at least two ways. First, sugars are found covalently attached to 5-hydroxylysine residues in the hole regions of collagen (Figure 6.20). The occurrence of carbohydrate in the hole region has led to the proposal that it plays a role in organizing fibril assembly. Second, the hole regions may play a role in bone formation. Bone consists of microcrystals of hydroxyapatite, Ca₅(PO₄)₃OH, embedded in a matrix of collagen fibrils. When new bone tissue forms, the formation of new hydroxyapatite crystals occurs at intervals of 68 nm. The hole regions of collagen fibrils may be the sites of nucleation for the mineralization of bone.
Figure 6.19 · In the electron microscope, collagen fibers exhibit alternating light and dark bands. The dark bands correspond to the 40-nm gaps or “holes” between pairs of aligned collagen triple helices. The repeat distance, d, for the light- and dark-banded pattern is 68 nm. The collagen molecule is 300 nm long, which corresponds to 4.41d. The molecular repeat pattern of five staggered collagen molecules corresponds to 5d. (J. Gross, Biozentrum/Science Photo Library)
Figure 6.20 · A disaccharide of galactose and glucose is covalently linked to the 5-hydroxyl group of hydroxylysines in collagen by the combined action of the enzymes galactosyl transferase and glucosyl transferase.
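The banding arithmetic can be reproduced directly from the numbers in the text (a 68-nm repeat distance, 300-nm molecules, and a pattern that repeats every five rows). This is a minimal sketch; the function and constant names are chosen here for illustration.

```python
# Banding arithmetic for a staggered collagen fibril, using the values
# quoted in the text.
D_NM = 68.0          # repeat distance d of the light/dark banding
MOLECULE_NM = 300.0  # length of one collagen triple helix
ROWS_PER_REPEAT = 5  # the stagger pattern repeats every five rows

def axial_repeat_nm(d=D_NM, rows=ROWS_PER_REPEAT):
    """Distance along a row from the start of one molecule to the next."""
    return rows * d

def gap_nm(d=D_NM, molecule=MOLECULE_NM, rows=ROWS_PER_REPEAT):
    """Length of the hole region between successive molecules in a row."""
    return axial_repeat_nm(d, rows) - molecule

def molecule_length_in_d(d=D_NM, molecule=MOLECULE_NM):
    """Molecule length expressed in units of the repeat distance d."""
    return molecule / d

def row_offsets_nm(d=D_NM, rows=ROWS_PER_REPEAT):
    """Axial offset of each row relative to the first row."""
    return [i * d for i in range(rows)]

print(axial_repeat_nm())                 # 5 x 68 nm
print(gap_nm())                          # hole region length
print(round(molecule_length_in_d(), 2))  # molecule length in units of d
print(row_offsets_nm())
```

The 40-nm hole is simply what is left when a 300-nm molecule is placed in a 340-nm repeat, and 300/68 gives the 4.41d quoted in the caption of Figure 6.19.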
The collagen fibrils are further strengthened and stabilized by the formation of both intramolecular (within a tropocollagen molecule) and intermolecular (between tropocollagen molecules in the fibril) cross-links. Intramolecular cross-links are formed between lysine residues in the (nonhelical) N-terminal region of tropocollagen in a unique pair of reactions shown in Figure 6.21. The enzyme lysyl oxidase catalyzes the formation of aldehyde groups at the lysine side chains in a copper-dependent reaction. The aldehyde groups of two such side chains then link covalently in a spontaneous nonenzymatic aldol condensation. The intermolecular cross-linking of tropocollagens involves the formation of a unique hydroxypyridinium structure from one lysine and two hydroxylysine residues (Figure 6.22). These cross-links form between the N-terminal region of one tropocollagen and the C-terminal region of an adjacent tropocollagen in the fibril.
Figure 6.21 · Collagen fibers are stabilized and strengthened by
Figure 6.22 · The hydroxypyridinium structure formed by the cross-linking of a
Globular Proteins
Fibrous proteins, although interesting for their structural properties, represent only a small percentage of the proteins found in nature. Globular proteins, so named for their approximately spherical shape, are far more numerous.
Helices and Sheets in Globular Proteins
Globular proteins exist in an enormous variety of three-dimensional structures, but nearly all contain substantial amounts of the α-helices and β-sheets that form the basic structures of the simple fibrous proteins. For example, myoglobin, a small, globular, oxygen-carrying protein of muscle (17 kD, 153 amino acid residues), contains eight α-helical segments, each containing 7 to 26 amino acid residues. These are arranged in an apparently irregular (but invariant) fashion (see Figure 5.7). The space between the helices is filled efficiently and tightly with (mostly hydrophobic) amino acid side chains. Most of the polar side chains in myoglobin (and in most other globular proteins) face the outside of the protein structure and interact with solvent water. Myoglobin’s structure is unusual because most globular proteins contain a relatively small amount of α-helix. A more typical globular protein (Figure 6.23) is bovine ribonuclease A, a small protein (13.7 kD, 124 residues) that contains a few short helices, a broad section of antiparallel β-sheet, a few β-turns, and several peptide segments without defined secondary structure.
Figure 6.23 · The three-dimensional structure of bovine ribonuclease A, showing the α-helices as ribbons. (Jane Richardson)
Why should the cores of most globular and membrane proteins consist almost entirely of α-helices and β-sheets? The reason is that the highly polar N-H and C=O moieties of the peptide backbone must be neutralized in the hydrophobic core of the protein. The extensively H-bonded nature of α-helices and β-sheets is ideal for this purpose, and these structures effectively stabilize the polar groups of the peptide backbone in the protein core.
In globular protein structures, it is common for one face of an α-helix to be exposed to the water solvent, with the other face toward the hydrophobic interior of the protein. The outward face of such an amphiphilic helix consists mainly of polar and charged residues, whereas the inward face contains mostly nonpolar, hydrophobic residues. A good example of such a surface helix is that of residues 153 to 166 of flavodoxin from Anabaena (Figure 6.24). Note that the helical wheel presentation of this helix readily shows that one face contains four hydrophobic residues and that the other is almost entirely polar and charged.
Figure 6.24 · (a) The α-helix consisting of residues 153–166 (red) in flavodoxin from Anabaena is a surface helix and is amphipathic. (b) The two helices (yellow and blue) in the interior of the citrate synthase dimer (residues 260–270 in each monomer) are mostly hydrophobic. (c) The exposed helix (residues 74–87, red) of calmodulin is entirely accessible to solvent and consists mainly of polar and charged residues.
Less commonly, an α-helix can be completely buried in the protein interior or completely exposed to solvent. Citrate synthase is a dimeric protein in which α-helical segments form part of the subunit-subunit interface. As shown in Figure 6.24, one of these helices (residues 260 to 270) is highly hydrophobic and contains only two polar residues, as would befit a helix in the protein core. On the other hand, Figure 6.24 also shows the solvent-exposed helix (residues 74 to 87) of calmodulin, which consists of 10 charged residues, 2 polar residues, and only 2 nonpolar residues.
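A helical wheel can be reduced to a single number that indicates how strongly the nonpolar residues cluster on one face. The sketch below is a simplified, binary variant of a hydrophobic-moment calculation, assuming an ideal α-helix (100° per residue). The two 14-residue sequences are hypothetical examples invented for illustration, not the flavodoxin, citrate synthase, or calmodulin helices, and the set of residues treated as hydrophobic is an arbitrary simple choice.

```python
import math

# Assumption: ideal alpha-helix, 3.6 residues per turn = 100 degrees each.
DEGREES_PER_RESIDUE = 100.0
# Illustrative choice of residues counted as hydrophobic (an assumption).
HYDROPHOBIC = set("AVLIMFW")

def wheel_angles(sequence):
    """Angle of each residue on the helical wheel, in degrees."""
    return [(i * DEGREES_PER_RESIDUE) % 360.0 for i in range(len(sequence))]

def hydrophobic_moment(sequence):
    """Length of the vector sum of unit vectors pointing at each
    hydrophobic residue on the wheel, divided by the number of residues.
    Large values mean the hydrophobic residues share one face."""
    x = y = 0.0
    for residue, angle in zip(sequence, wheel_angles(sequence)):
        if residue in HYDROPHOBIC:
            x += math.cos(math.radians(angle))
            y += math.sin(math.radians(angle))
    return math.hypot(x, y) / len(sequence)

# Hypothetical sequences with the same composition (6 Leu, 8 Lys):
amphipathic = "LKKLLKKLLKKLKK"  # Leu spaced 3 or 4 residues apart
blocked = "LLLLLLKKKKKKKK"      # Leu in one contiguous run

print(round(hydrophobic_moment(amphipathic), 3))
print(round(hydrophobic_moment(blocked), 3))
```

Spacing the nonpolar residues three or four positions apart keeps them on one side of the wheel and gives a much larger moment than the same residues placed consecutively, which wrap around the helix.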
Packing Considerations
The secondary and tertiary structures of myoglobin and ribonuclease A illustrate the importance of packing in tertiary structures. Secondary structures pack closely to one another and also intercalate with (insert between) extended polypeptide chains. If the sum of the van der Waals volumes of a protein’s constituent amino acids is divided by the volume occupied by the protein, packing densities of 0.72 to 0.77 are typically obtained. This means that, even with close packing, approximately 25% of the total volume of a protein is not occupied by protein atoms. Nearly all of this space is in the form of very small cavities. Cavities the size of water molecules or larger do occasionally occur, but they make up only a small fraction of the total protein volume. It is likely that such cavities provide flexibility for proteins and facilitate conformation changes and a wide range of protein dynamics (discussed later).
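The relationship between packing density and unoccupied volume is a one-line calculation. The sketch below uses the densities quoted in the text; the function names and the example volumes are hypothetical.

```python
# Packing density = (sum of van der Waals volumes of the constituent
# amino acids) / (volume occupied by the protein). The remainder is the
# fraction of the protein volume not occupied by protein atoms.
def packing_density(vdw_volume_sum, protein_volume):
    return vdw_volume_sum / protein_volume

def unoccupied_fraction(density):
    return 1.0 - density

# Hypothetical volumes (arbitrary units) chosen to give a density of 0.75.
density = packing_density(75.0, 100.0)
print(density, unoccupied_fraction(density))

# Range quoted in the text: densities of 0.72 to 0.77.
for d in (0.72, 0.77):
    print(d, round(unoccupied_fraction(d), 2))
```

Densities of 0.72 to 0.77 correspond to 28% to 23% unoccupied volume, which is the basis for the "approximately 25%" figure.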
Ordered, Nonrepetitive Structures
In any protein structure,
the segments of the polypeptide chain that cannot be classified as defined secondary
structures, such as helices or sheets, have been traditionally referred to as
coil or random coil. Both these terms are misleading. Most of
these segments are neither coiled nor random, in any sense of the words. These
structures are every bit as highly organized and stable as the defined secondary
structures. They are just more variable and difficult to describe. These so-called
coil structures are strongly influenced by side-chain interactions. Few of these
interactions are well understood, but a number of interesting cases have been
described. In his early studies of myoglobin structure, John Kendrew found that
the -OH group of threonine or serine often forms a hydrogen bond with a backbone
NH at the beginning of an α-helix. The same stabilization
of an α-helix by a serine is observed in the three-dimensional
structure of pancreatic trypsin inhibitor (Figure 6.25). Also in this same structure,
an asparagine residue adjacent to a β-strand is found
to form H bonds that stabilize the β-structure.
Nonrepetitive
but well-defined structures of this type form many important features of enzyme
active sites. In some cases, a particular arrangement of “coil” structure providing
a specific type of functional site recurs in several functionally related proteins.
The peptide loop that binds iron-sulfur clusters in both ferredoxin and high
potential iron protein is one example. Another is the central loop portion of
the E-F hand structure that binds a calcium ion in several calcium-binding
proteins, including calmodulin, carp parvalbumin, troponin C, and the intestinal
calcium-binding protein. This loop, shown in Figure 6.26, connects two short
α-helices. The calcium ion nestles into the pocket
formed by this structure.
Figure 6.26 · A representation of the so-called E–F hand structure, which forms calcium-binding sites in a variety of proteins. The stick drawing shows the peptide backbone of the E–F hand motif. The “E” helix extends along the index finger, a loop traces the approximate arrangement of the curled middle finger, and the “F” helix extends outward along the thumb. A calcium ion (Ca²⁺) snuggles into the pocket created by the two helices and the loop. Kretsinger and coworkers originally assigned letters alphabetically to the helices in parvalbumin, a protein from carp. The E–F hand derives its name from the letters assigned to the helices at one of the Ca²⁺-binding sites.
Flexible, Disordered Segments
In addition to nonrepetitive but well-defined structures, which exist in all proteins, genuinely disordered segments of polypeptide sequence also occur. These sequences either do not show up in electron density maps from X-ray crystallographic studies or give diffuse or ill-defined electron densities. These segments either undergo actual motion in the protein crystals themselves or take on many alternate conformations in different molecules within the protein crystal. Such behavior is quite common for long, charged side chains on the surface of many proteins. For example, 16 of the 19 lysine side chains in myoglobin have uncertain orientations beyond the δ-carbon, and five of these are disordered beyond the β-carbon. Similarly, a majority of the lysine residues are disordered in trypsin, rubredoxin, ribonuclease, and several other proteins. Arginine residues, however, are usually well ordered in protein structures. For the four proteins just mentioned, 70% of the arginine residues are highly ordered, compared to only 26% of the lysines.
Motion in Globular Proteins
Although we have distinguished
between well-ordered and disordered segments of the polypeptide chain, it is
important to realize that even well-ordered side chains in a protein undergo
motion, sometimes quite rapid. These motions should be viewed as momentary oscillations
about a single, highly stable conformation. Proteins are thus best viewed
as dynamic structures. The allowed motions may be motions of individual
atoms, groups of atoms, or even whole sections of the protein. Furthermore,
they may arise from either thermal energy or specific, triggered conformational
changes in the protein. Atomic fluctuations such as vibrations typically
are random, very fast, and usually occur over small distances (less than 0.5
Å), as shown in Table 6.2. These motions arise from the kinetic energy within
the protein and are a function of temperature. These very fast motions can be
modeled by molecular dynamics calculations and studied by X-ray diffraction.
A class of
slower motions, which may extend over larger distances, is collective motions.
These are movements of groups of atoms covalently linked in such a way that
the group moves as a unit. Such groups range in size from a few atoms to hundreds
of atoms. Whole structural domains within a protein may be involved, as in the
case of the flexible antigen-binding domains of immunoglobulins, which move
as relatively rigid units to selectively bind separate antigen molecules. Such
motions are of two types: (1) those that occur quickly but infrequently, such
as tyrosine ring flips, and (2) those that occur slowly, such as cis-trans
isomerizations of prolines. These collective motions also arise from thermal
energies in the protein and operate on a time scale of 10⁻¹² to 10⁻³
sec. These motions can be studied by nuclear magnetic resonance (NMR) and fluorescence
spectroscopy.
Conformational
changes involve motions of groups of atoms (individual side chains, for
example) or even whole sections of proteins. These motions occur on a time scale
of 10⁻⁹ to 10³ sec, and the distances covered can be as
large as 1 nm. These motions may occur in response to specific stimuli or arise
from specific interactions within the protein, such as hydrogen bonding, electrostatic
interactions, and ligand binding. More will be said about conformational changes
when enzyme catalysis and regulation are discussed (see Chapters
14 and 15).
Forces Driving the Folding of Globular Proteins
As already pointed out,
the driving force for protein folding and the resulting formation of a tertiary
structure is the formation of the most stable structure possible. Two forces
are at work here. The peptide chain must both (1) satisfy the constraints inherent
in its own structure and (2) fold so as to “bury” the hydrophobic side chains,
minimizing their contact with solvent. The polypeptide itself does not usually
form simple straight chains. Even in chain segments where helices and sheets
are not formed, an extended peptide chain, being composed of L-amino
acids, has a tendency to twist slightly in a right-handed direction. As shown
in Figure 6.27, this tendency is apparently the basis
for the formation of a variety of tertiary structures having a right-handed
sense. Principal among these are the right-handed twists in arrays of β-sheets
and right-handed cross-overs in parallel β-sheet
arrays. Right-handed twisted β-sheets are found at
the center of a number of proteins and provide an extended, highly stable structural
core. Phosphoglycerate mutase, adenylate kinase,
and carbonic anhydrase, among others, exist
as smoothly twisted planes or saddle-shaped structures. Triose phosphate isomerase,
soybean trypsin inhibitor, and domain 1 of pyruvate kinase contain right-handed
twisted cylinders or barrel structures at their cores.
Figure 6.27 · The natural right-handed twist exhibited by polypeptide chains, and the variety of structures that arise from this twist.
Connections
between β-strands are of two types: hairpins and cross-overs.
Hairpins, as shown in Figure 6.27, connect adjacent antiparallel β-strands.
Cross-overs are necessary to connect adjacent (or nearly adjacent) parallel
β-strands. Nearly all cross-over structures are right-handed.
Only in subtilisin and phosphoglucoisomerase have isolated left-handed cross-overs
been identified. In many cross-over structures, the cross-over connection itself
contains an α-helical segment. This is referred to
as a βαβ-loop. As shown in Figure 6.27, the strong tendency in nature
to form right-handed cross-overs, the wide occurrence of α-helices
in the cross-over connection, and the right-handed twists of β-sheets
can all be understood as arising from the tendency of an extended polypeptide
chain of L-amino acids to adopt a right-handed twist structure. This
is a chiral effect. Proteins composed of D-amino acids would tend
to adopt left-handed twist structures.
The
second driving force that affects the folding of polypeptide chains is the need
to bury the hydrophobic residues of the chain, protecting them from solvent
water. From a topological viewpoint, then, all globular proteins must have an
“inside” where the hydrophobic core can be arranged and an “outside” toward
which the hydrophilic groups must be directed. The sequestration of hydrophobic
residues away from water is the dominant force in the arrangement of secondary
structures and nonrepetitive peptide segments to form a given tertiary structure.
Globular proteins can be classified mainly on the basis of the particular kind
of core or backbone structure they use to accomplish this goal. The term hydrophobic
core, as used here, refers to a region in which hydrophobic side chains
cluster together, away from the solvent. Backbone refers to the polypeptide
backbone itself, excluding the particular side chains. Globular proteins can
be pictured as consisting of “layers” of backbone, with hydrophobic core regions
between them. Over half the known globular protein structures have two layers
of backbone (separated by one hydrophobic core). Roughly one-third of the known
structures are composed of three backbone layers and two hydrophobic cores.
There are also a few known four-layer structures and one known five-layer structure.
A few structures are not easily classified in this way, but it is remarkable
that most proteins fit into one of these classes. Examples of each are presented
in Figure 6.28.
Figure 6.28 · Examples of protein domains with different numbers of layers of backbone structure. (a) Cytochrome c' with two layers of α-helix. (b) Domain 2 of phosphoglycerate kinase, composed of a β-sheet layer between two layers of helix, three layers overall. (c) An unusual five-layer structure, domain 2 of glycogen phosphorylase, a β-sheet layer sandwiched between four layers of α-helix. (d) The concentric “layers” of β-sheet (inside) and α-helix (outside) in triose phosphate isomerase. Hydrophobic residues are buried between these concentric layers in the same manner as in the planar layers of the other proteins. The hydrophobic layers are shaded yellow. (Jane Richardson)
Classification of Globular Proteins
In addition to classification based on layer structure, proteins can be grouped according to the type and arrangement of secondary structure. There are four such broad groups: antiparallel α-helix, parallel or mixed β-sheet, antiparallel β-sheet, and the small metal- and disulfide-rich proteins.
It is important to note that the similarities of tertiary structure within these groups do not necessarily reflect similar or even related functions. Instead, functional homology usually depends on structural similarities on a smaller and more intimate scale.
Antiparallel
α-helix proteins are structures heavily
dominated by α-helices. The simplest way to pack
helices is in an antiparallel manner, and most of the proteins in this class
consist of bundles of antiparallel helices. Many of these exhibit a slight (15°)
left-handed twist of the helix bundle. Figure 6.29 shows a representative sample
of antiparallel α-helix
proteins. Many of these are regular, uniform structures, but in a few cases
(uteroglobin, for example) one of the helices is tilted away from the bundle.
Tobacco mosaic virus protein has small, highly twisted antiparallel β-sheets
on one end of the helix bundle with two additional helices on the other side
of the sheet. Notice in Figure 6.29 that most of the antiparallel α-helix
proteins are made up of four-helix bundles.
Figure 6.29 · Several examples of antiparallel α-proteins. (Jane Richardson)
The so-called globin proteins are an important group of α-helical proteins. These include hemoglobins and myoglobins from many species. The globin structure can be viewed as two layers of helices, with one of these layers perpendicular to the other and the polypeptide chain moving back and forth between the layers.
Parallel or Mixed β-Sheet Proteins
The second major class of protein structures contains structures based around parallel or mixed β-sheets. Parallel β-sheet arrays, as previously discussed, distribute hydrophobic side chains on both sides of the sheet. This means that neither side of parallel β-sheets can be exposed to solvent. Parallel β-sheets are thus typically found as core structures in proteins, with little access to solvent.
Another important parallel β-array is the eight-stranded parallel β-barrel, exemplified in the structures of triose phosphate isomerase and pyruvate kinase (Figure 6.30). Each β-strand in the barrel is flanked by an antiparallel α-helix. The α-helices thus form a larger cylinder of parallel helices concentric with the β-barrel. Both cylinders thus formed have a right-handed twist. Another parallel β-structure consists of an internal twisted wall of parallel or mixed β-sheet protected on both sides by helices or other substructures. This structure is called the doubly wound parallel β-sheet because the structure can be imagined to have been wound by strands beginning in the middle and going outward in opposite directions. The essence of this structure is shown in Figure 6.31. Whereas the barrel structures have four layers of backbone structure, the doubly wound sheet proteins have three major layers and thus two hydrophobic core regions.
Figure 6.30 · Parallel β-array proteins: the eight-stranded β-barrels of triose phosphate isomerase (a, side view, and b, top view) and (c) pyruvate kinase. (Jane Richardson)
Figure 6.31 · Several typical doubly wound parallel β-sheet proteins. (Jane Richardson)
Antiparallel β-Sheet Proteins
Another important class of tertiary protein conformations is the antiparallel β-sheet structures. Antiparallel β-sheets, which usually arrange hydrophobic residues on just one side of the sheet, can exist with one side exposed to solvent. The minimal structure for an antiparallel β-sheet protein is thus a two-layered structure, with hydrophobic faces of the two sheets juxtaposed and the opposite faces exposed to solvent. Such domains consist of β-sheets arranged in a cylinder or barrel shape. These structures are usually less symmetric than the singly wound parallel barrels and are not as efficiently hydrogen bonded, but they occur much more frequently in nature. Barrel structures tend to be either all parallel or all antiparallel and usually consist of even numbers of β-strands. Good examples of antiparallel structures include soybean trypsin inhibitor, rubredoxin, and domain 2 of papain (Figure 6.32).
Figure 6.32 · Examples of antiparallel β-sheet structures in proteins. (Jane Richardson)
Figure 6.33 · Examples of the so-called Greek key antiparallel β-barrel structure in proteins.
Antiparallel arrangements of β-strands can also form sheets as well as barrels. Glyceraldehyde-3-phosphate dehydrogenase, Streptomyces subtilisin inhibitor, and glutathione reductase are examples of single-sheet, double-layered topology (Figure 6.34).