
The Maya encoded their history in hieroglyphs carved on stelae and temples like these ruins in Tikal, Guatemala. (© George Holton/Photo Researchers Inc.)
Chapter 32
The Genetic Code
We turn now to the problem of how the sequence of nucleotides in an mRNA molecule is translated into the specific amino acid sequence of a protein. The problem raises both informational and mechanical questions. First, what is the genetic code that allows the information specified in a sequence of bases to be translated into the amino acid sequence of a polypeptide? That is, how is the 4-letter language of nucleic acids translated into the 20-letter language of proteins? Implicit in this question is a mechanistic dilemma for structural biologists: It is easy to see how base pairing establishes a one-to-one correspondence that allows the template-directed synthesis of polynucleotide chains in the processes of replication and transcription. However, there is no obvious chemical affinity between the purine and pyrimidine bases and the 20 different amino acids. Nor is there any structural complementarity or stereochemical connection between polynucleotides and amino acids that might guide the translation of information.
Figure 32.1 · The general structure of tRNA molecules. Circles represent nucleotides in the tRNA sequence. The numbers given indicate the standardized numbering system for tRNAs (which differ in total number of nucleotides). Dots indicate places where the number of nucleotides may vary in different tRNA species. Recall from Chapter 11 that tRNA molecules often have modified or unusual bases.
Francis Crick reasoned that adapter molecules must bridge this informational gap. These adapter molecules must interact specifically with both nucleic acids (mRNAs) and amino acids. At least 20 different adapter molecules would be needed, at least one for each amino acid. The various adapter molecules would be able to read the genetic code in an mRNA template and align the amino acids according to the template’s directions so that they could be polymerized into a unique polypeptide. Transfer RNAs (tRNAs; Figure 32.1) are the adapter molecules (Chapters 11 and 12). Amino acids are attached to the 3'-OH at the 3'-CCA end of tRNAs as aminoacyl esters. The formation of these aminoacyl-tRNAs, so-called “charged tRNAs,” is catalyzed by specific aminoacyl-tRNA synthetases. There is one of these enzymes for each of the 20 amino acids, and each aminoacyl-tRNA synthetase loads its amino acid only onto tRNAs designed to carry it. In turn, these tRNAs specifically recognize unique sequences of bases in the mRNA through complementary base pairing.
Figure 32.2 · (a) An overlapping versus a nonoverlapping code. (b) A continuous versus a punctuated code.
32.1 · Elucidating the Genetic Code
Once it was realized that the sequence of bases in a gene specified the sequence of amino acids in a protein, various possibilities for the genetic code came under consideration. How many bases were necessary to specify each amino acid? Is the code overlapping or nonoverlapping (Figure 32.2)? Is the code punctuated or continuous? Mathematical considerations favored a triplet of bases as the minimal code word, or codon, for each amino acid: A doublet code based on pairs of the four possible bases, A, C, G, and U, has $4^2 = 16$ unique arrangements, an insufficient number to encode the 20 amino acids. A triplet code of four bases has $4^3 = 64$ possible code words, more than enough for the task. Genetic results gave early answers to several of the other questions. For example, point mutations in the gene encoding the coat protein of TMV (tobacco mosaic virus, Chapter 29) caused single amino acid substitutions, discounting the possibility that the code was overlapping. A single base change in an overlapping code should cause multiple amino acid changes in the protein. For example, three changes would occur if the code were an overlapping triplet one (Figure 32.2).
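The counting argument and the overlap argument can both be checked with a short sketch. The function names and the 12-base message length are illustrative assumptions, not part of the original experiments.

```python
from itertools import product

BASES = "ACGU"  # the four RNA bases


def n_codewords(word_length):
    """Count the distinct code words of a given length over four bases."""
    return sum(1 for _ in product(BASES, repeat=word_length))


def codons_containing(position, message_length, overlapping):
    """Return the start indices of every triplet that includes `position`.

    In an overlapping triplet code a new codon starts at every base;
    in a nonoverlapping code a new codon starts at every third base.
    """
    step = 1 if overlapping else 3
    starts = range(0, message_length - 2, step)
    return [s for s in starts if s <= position < s + 3]


# Doublets are too few for 20 amino acids; triplets are more than enough.
print(n_codewords(2), n_codewords(3))

# A point mutation at base 5 of a hypothetical 12-base message.
print(codons_containing(5, 12, overlapping=True))
print(codons_containing(5, 12, overlapping=False))
```

In the overlapping case the mutated base belongs to three codons, so three amino acids could change; in the nonoverlapping case it belongs to one, which is what the TMV coat protein mutants showed.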
The General Nature of the Genetic Code
The genetic code is a triplet code read continuously from a fixed starting point in each mRNA. Specifically, it is defined by the following:
1. A group of three bases codes for one amino acid.
2. The code is not overlapping.
3. The base sequence is read from a fixed starting point without punctuation. That is, the mRNA sequences contain no “commas” signifying appropriate groupings of triplets. If the reading frame is displaced by one base, it remains shifted throughout the subsequent message; no “commas” are present to restore the “correct” frame.
4. The code is degenerate, meaning that, in most cases, each amino acid can be coded by any of several triplets.
Regarding this latter point, recall that a triplet code yields 64 codons for 20 amino acids. If only 20 of these are used, then the majority of codons would be nonsense in that they would not code for any amino acid. A consequence of degeneracy is that most codons (61 of 64) code for some amino acid.
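The consequence of reading without punctuation (point 3) can be illustrated with a minimal sketch. The message below is a made-up 12-base sequence and the helper names are hypothetical.

```python
def read_codons(mrna, start=0):
    """Split an mRNA string into consecutive triplets from a fixed start.

    Trailing bases that do not fill a complete triplet are ignored.
    """
    return [mrna[i:i + 3] for i in range(start, len(mrna) - 2, 3)]


def delete_base(mrna, index):
    """Return the message with the base at `index` removed."""
    return mrna[:index] + mrna[index + 1:]


message = "AUGGCUUCUGGA"  # hypothetical message, chosen only for illustration

print(read_codons(message))                  # original reading frame
print(read_codons(delete_base(message, 3)))  # frame after a one-base deletion
```

Because nothing in the sequence marks the triplet boundaries, every codon downstream of the deletion is read differently, and the frame is never restored.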
Elucidating the Genetic Code Through Biochemistry
The actual assignment of codons to the respective amino acids came from in vitro studies using synthetic oligo- and polyribonucleotides as messenger RNAs. Marshall Nirenberg and Heinrich Matthaei discovered that a cell-free system from Escherichia coli catalyzed the synthesis of polyphenylalanine (poly[Phe]) in the presence of polyuridylic acid (poly[U]). This cell-free system contained, among other things, ribosomes, tRNAs, and the soluble enzymes necessary to activate amino acids for protein synthesis. Even though the other 19 amino acids were present in the reaction mixture, only phenylalanine was incorporated into protein when poly[U] served as mRNA. The first codon had been deciphered: UUU codes for Phe. Similar experiments with polyadenylic acid (poly[A]) and polycytidylic acid (poly[C]) yielded polylysine and polyproline, respectively, showing that AAA codes for Lys and CCC codes for Pro.1
Trinucleotides Bound to Ribosomes Promote the Binding of Specific Aminoacyl-tRNAs
Figure 32.3 · The filter-binding assay for elucidation of the genetic code. Reaction mixture includes washed ribosomes, Mg2+, a particular trinucleotide (pUpUpU in this example), and all 20 aminoacyl-tRNAs, one of which is radioactively (14C) labeled. (a) 14C-labeled prolyl-tRNA. (b) 14C-labeled phenylalanyl-tRNA. Only the aminoacyl-tRNA whose binding is directed by the trinucleotide codon will become bound to the ribosomes and retained on the nitrocellulose filter. The amount of radioactivity retained by the filter is a measure of trinucleotide-directed binding of a particular labeled aminoacyl-tRNA by ribosomes. Use of this binding assay to test the 64 possible codon trinucleotides against the 20 different amino acids quickly enabled researchers to assign triplet code words to the individual amino acids. The genetic code was broken. (Adapted from Nirenberg, M. W., and Leder, P., 1964. RNA codewords and protein synthesis. Science 145:1399-1407)
In 1964, Marshall Nirenberg
and Philip Leder reported that trinucleotides bound to ribosomes directed the
binding of specific aminoacyl-tRNAs. That is, ternary ribosome:trinucleotide:aminoacyl-tRNA
complexes could be formed, provided the right trinucleotide and aminoacyl-tRNA
combination was present. Aminoacyl-tRNAs were prepared by adding all 20 amino
acids to a purified tRNA mixture in the presence of a soluble E. coli
fraction containing the necessary aminoacyl-tRNA synthetases. Only one of the
amino acids was 14C-labeled in any one binding assay. Trinucleotides
are the equivalent of codons, so if a specific trinucleotide promoted the binding
of a particular 14C-labeled aminoacyl-tRNA, the base sequence of
the trinucleotide must be the code word for that amino acid. Binding was detected
because the ribosomes were retained on a nitrocellulose filter while free aminoacyl-tRNAs
passed through; only aminoacyl-tRNAs bound by ribosomes were retained (Figure
32.3).
This system was quickly exploited to elucidate the genetic code. Elucidation of the genetic code was probably the greatest scientific achievement of the 1960s. For their roles in it, Marshall Nirenberg and H. Gobind Khorana shared in the 1968 Nobel Prize in Physiology or Medicine.
1Because polyguanylic acid (poly[G]) has a very strong tendency to form multistranded helices, it was a poor template for protein synthesis. The fact that GGG codes for Gly was not learned until later.
32.2 · The Nature of the Genetic Code
The complete translation of the genetic code is presented in Table 32.1. Codons, like other nucleotide sequences, are read 5' → 3'. Codons represent triplets of bases in mRNA or, replacing U with T, triplets along the nontranscribed (nontemplate) strand of DNA. Several noteworthy features characterize the genetic code:
1. All the codons have meaning. Sixty-one of the 64 codons specify particular amino acids. The remaining 3 (UAA, UAG, and UGA) specify no amino acid and thus are nonsense codons. Nonsense codons serve as termination codons: they are “stop” signals indicating that the end of the protein has been reached.
2. The genetic code is unambiguous. Each of the 61 “sense” codons encodes only one amino acid.
3. The genetic code is degenerate. With the exception of Met and Trp, every amino acid is coded by more than one codon. Several—Arg, Leu, and Ser—are represented by 6 different codons. Codons coding for the same amino acid are called synonymous codons.
4. Codons representing the same amino acid or chemically similar amino acids tend to be similar in sequence. Often the third base in a codon is irrelevant, so that, for example, all 4 codons in the GGX family specify Gly, and the UCX family specifies Ser (Table 32.1). This feature is known as third-base degeneracy. Note also that codons with a pyrimidine as second base likely encode amino acids with hydrophobic side chains, and codons with a purine in the second-base position typically specify polar or charged amino acids. The two negatively charged amino acids, Asp and Glu, are encoded by GAX codons; GA-pyrimidine gives Asp and GA-purine specifies Glu. The consequence of these similarities is that mutations are less likely to be deleterious because single base changes in a codon will result either in no change or in a substitution with an amino acid similar to the original amino acid. The degeneracy of the code is evolution’s buffer against mutational disruption.
5. The genetic code is “universal.” Although certain minor exceptions in codon usage occur (see A Deeper Look: Natural Variations in the Standard Genetic Code), the more striking feature of the code is its universality: Codon assignments are virtually the same throughout all organisms—archaea, eubacteria, and eukaryotes. This conformity means that all extant organisms use the same genetic code, providing strong evidence that they all evolved from a common primordial ancestor.
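Several of the features listed above can be verified directly from the codon assignments. The sketch below writes the standard code in a compact one-letter layout (first, second, and third bases each cycling through U, C, A, G; `*` marks a stop); it is assumed here to match Table 32.1, and the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in UCAG order; '*' = stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base U
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Feature 1: three nonsense (termination) codons, 61 sense codons.
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# Feature 3: degeneracy. Which amino acids have 1 codon, and which have 6?
single_codon = sorted(aa for aa, n in codons_per_amino_acid.items() if n == 1)
six_codons = sorted(aa for aa, n in codons_per_amino_acid.items() if n == 6)

# Feature 4: third-base degeneracy in the GGX family.
ggx_family = {CODON_TABLE["GG" + x] for x in BASES}

print(stop_codons, sum(codons_per_amino_acid.values()))
print(single_codon, six_codons, ggx_family)
```

The dictionary form also makes feature 2 explicit: each codon is a key with exactly one value, so no sense codon can specify two amino acids.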
| A DEEPER LOOK | |
| Natural Variations in the Standard Genetic Code | |
| The genomes of mitochondria, some prokaryotes, and lower eukaryotes show some exceptions to the standard genetic code (Table 32.1) in codon assignments. The phenomenon is more common in mitochondria. For example, the termination codon UGA codes for tryptophan in mitochondria from various animals, protozoans, and fungi. AUA, normally an Ile codon, codes for methionine in some animal and fungal mitochondrial genomes, and AGA (an Arg codon) is a termination codon in vertebrate mitochondria, but a Ser codon in fruit fly mitochondria. Mitochondria in several species of yeast use the CUX codons to specify Thr instead of Leu. Higher plant mitochondria use CGG, normally an Arg codon, to specify Trp. Less common are genomic codon variations within the genomes of prokaryotic and eukaryotic cells. Among the lower eukaryotes, certain ciliated protozoans (Tetrahymena and Paramecium) use UAA and UAG as glutamine codons rather than stop codons. Instances in prokaryotes include use of the stop codon UGA to specify Trp by Mycoplasma. Perhaps most interesting is the use of some UGA codons by both prokaryotes and eukaryotes (including humans) to specify selenocysteine, an analog of cysteine in which the sulfur atom is replaced by a selenium atom. Indeed, the identification of selenocysteine residues in proteins from bacteria, archaea, and eukaryotes has led some people to nominate selenocysteine as the 21st amino acid! Selenocysteine formation requires a novel selenocysteine-specific tRNA known as tRNASec. This tRNASec is loaded with a Ser residue by seryl-tRNA synthetase, the aminoacyl-tRNA synthetase for serine. Then, in an ATP-dependent process, the Ser-O is replaced by Se. Translation of certain selected UGA codons by selenocysteinyl-tRNASec requires additional proteins and formation of a stable stem-loop secondary structure in the mRNA next to the particular UGA codon. | |
| Adapted from Fox, T. D., 1987. Natural variation in the genetic code. Annual Review of Genetics 21:67-91; and Low, S. C., and Berry, M. J., 1996. Knowing when not to stop. Trends in Biochemical Sciences 21:203-208. | |
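One convenient way to think about these natural variations is as a small set of overrides layered on the standard code. The sketch below encodes only reassignments named in the box above; the grouping under the label "vertebrate mitochondria" combines the UGA, AUA, and AGA statements as an illustrative assumption, and the 12-base message is made up.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reassignments taken from the box above ('*' = termination).
VARIANT_OVERRIDES = {
    "vertebrate mitochondria": {"UGA": "W", "AUA": "M", "AGA": "*"},
    "Mycoplasma": {"UGA": "W"},
}


def code_for(system):
    """Return the standard table with a system's overrides applied."""
    return {**STANDARD, **VARIANT_OVERRIDES.get(system, {})}


def translate(mrna, table):
    """Translate triplets from position 0 until a termination codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


message = "AUGUGAAUAAGA"  # hypothetical message: AUG UGA AUA AGA
for system in ("standard", "vertebrate mitochondria", "Mycoplasma"):
    print(system, translate(message, code_for(system)))
```

The same message terminates after one residue under the standard code but is read through UGA in the two variant systems, which is why a gene moved between such systems may not be translated faithfully.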
32.3 · The Second Genetic Code: Aminoacyl-tRNA Synthetase Recognition of the Proper Substrates
Figure 32.4 · Reduction of cysteinyl-tRNACys with Raney nickel converts the cysteine -CH2SH R group to -CH3. That is, Cys is transformed into Ala.
Codon recognition is achieved
by aminoacyl-tRNAs. In order for accurate translation to occur, the appropriate
aminoacyl-tRNA must “read” the codon through base pairing via its anticodon
loop (Chapter 12). Once an aminoacyl-tRNA
has been synthesized, the amino acid part makes no contribution to accurate
translation of the mRNA. Von Ehrenstein proved this point by loading 14C-cysteine
onto its particular tRNA, tRNACys. The product cysteinyl-tRNACys
was chemically reduced so that its -SH group was removed to yield Ala-tRNACys
(Figure 32.4). The Ala-tRNACys was then added to an in vitro
hemoglobin-synthesizing system and the product Hb was analyzed. Alanine was
found at positions in the Hb amino acid sequence normally occupied by Cys. The
protein-synthesizing machinery was unable to recognize Ala-tRNACys
as “foreign” or inappropriate. Thus, the amino acid attached to a tRNA vehicle
is passively chauffeured and becomes inserted into a growing peptide chain as
dictated through codon-anticodon recognition.
Thus,
a second genetic code exists, the code by which each aminoacyl-tRNA synthetase
discriminates between the 20 amino acids and the many tRNAs and uniquely picks
out its proper substrates—one specific amino acid and the tRNA(s) appropriate
to it—from among the more than 400 possible combinations. The appropriate tRNA(s)
are those having anticodons that can base-pair with the codon(s) specifying
the particular amino acid. Clearly, the proper amino acids must be loaded onto
the various tRNAs so that the mRNA is translated with fidelity. Although the
primary genetic code is key to understanding the central dogma of molecular
biology on how DNA encodes proteins, the second genetic code is just as crucial
to the fidelity of information transfer.
| cognate · kindred; in this sense, cognate refers to those tRNAs having anticodons that can read one or more of the codons that specify one particular amino acid. |
Cells have 20 different aminoacyl-tRNA synthetases, one for each amino acid. Each of these enzymes catalyzes the ATP-dependent esterification of its specific amino acid substrate to the 3'-end of its cognate tRNA molecules (Figure 32.5). The aminoacyl-tRNA synthetase reaction serves two purposes:
1. It activates the amino acid so that it will readily react to form a peptide bond.
2. It bridges the information gap between amino acids and codons.
The underlying mechanisms of molecular recognition used by each aminoacyl-tRNA synthetase to bring the proper amino acid to its cognate tRNA are the embodiment of the second genetic code.
Figure 32.5 · The aminoacyl-tRNA synthetase reaction. (a) The overall reaction. (b) The overall reaction commonly proceeds in two steps: (i) formation of an aminoacyl-adenylate, and (ii) transfer of the activated amino acid moiety of the mixed anhydride to either the 2'-OH (class I aminoacyl-tRNA synthetases) or 3'-OH (class II aminoacyl-tRNA synthetases) of the ribose on the terminal adenylic acid at the 3'-CCA terminus common to all tRNAs. Those aminoacyl-tRNAs formed as 2'-aminoacyl esters undergo a transesterification that moves the aminoacyl function to the 3'-O of tRNA. Only the 3'-esters are substrates for protein synthesis.
The Aminoacyl-tRNA Synthetase Reaction
Amino acid activation is a two-step process:
1. Activation of the amino acid through the ATP-dependent formation of an aminoacyl adenylate. Ever-present pyrophosphatases in cells quickly hydrolyze the pyrophosphate product of this reaction, rendering amino acid activation thermodynamically favorable and essentially irreversible.
2. Transfer of the aminoacyl group from the aminoacyl adenylate to a specific tRNA.
Aminoacyl-tRNA synthetases that must discriminate between similar amino acids (such as Ile and Val) show two levels of specificity in the two-step aminoacyl-tRNA synthetase reaction. The specificity at the first step is not absolute, as shown by the ability of isoleucyl-tRNA synthetase to catalyze an ATP ⇌ PPi exchange reaction with either isoleucine or valine, that is, reaction (i) of Figure 32.5. Although valyl adenylate is synthesized, no valyl-tRNAIle is released. That is, the overall specificity of the isoleucyl-tRNA synthetase reaction is virtually absolute. The enzyme has an editing function that establishes this specificity: Synthesis of misacylated valyl-tRNAIle triggers an editing deacylase site in the enzyme that hydrolyzes the misacylated aminoacyl-tRNA.
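The two levels of specificity can be summarized as a toy decision model. This is only a sketch of the logic described above for isoleucyl-tRNA synthetase; the function name, the returned fields, and the string labels are hypothetical, and no kinetics are implied.

```python
# Amino acids that the activation site accepts in step (i): per the text,
# isoleucyl-tRNA synthetase forms an adenylate with either Ile or Val.
ACTIVATED_BY_ILE_RS = {"Ile", "Val"}


def isoleucyl_trna_synthetase(amino_acid):
    """Toy model of activation, transfer, and editing by the Ile enzyme."""
    result = {"adenylate_formed": False, "edited": False, "released": None}

    # Step (i): activation (first, imperfect level of specificity).
    if amino_acid not in ACTIVATED_BY_ILE_RS:
        return result
    result["adenylate_formed"] = True

    # Step (ii): transfer of the aminoacyl group to tRNAIle.
    charged = f"{amino_acid}-tRNAIle"

    # Editing (second level): a misacylated product is hydrolyzed.
    if amino_acid != "Ile":
        result["edited"] = True
        return result

    result["released"] = charged
    return result


for candidate in ("Ile", "Val", "Leu"):
    print(candidate, isoleucyl_trna_synthetase(candidate))
```

Valine passes the first step but not the second, so the only product released is Ile-tRNAIle, which is the sense in which the overall reaction is "virtually absolute" in its specificity.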
The Two Classes of Aminoacyl-tRNA Synthetases
| Table 32.2 | |
| The Two Classes of Aminoacyl-tRNA Synthetases | |
| Class I | Class II |
| Arg | Ala |
| Cys | Asn |
| Gln | Asp |
| Glu | Gly |
| Ile | His |
| Leu | Lys |
| Met | Phe |
| Trp | Pro |
| Tyr | Ser |
| Val | Thr |
Despite their common enzymatic
function, aminoacyl-tRNA synthetases are a diverse group of proteins in terms
of size, amino acid sequence, and oligomeric structure. The subunits show a
broad range of sizes (for example, from 334 to 951 amino acid residues in the
E. coli enzymes), and four different patterns of subunit organization
are seen: α, α2, α4, and α2β2. In higher eukaryotes, at least
some aminoacyl-tRNA synthetases assemble into large multiprotein complexes.
The aminoacyl-tRNA synthetases fall into two fundamental classes on the basis
of similar amino acid sequence motifs, oligomeric state, and acylation function
(Table 32.2). Class I enzymes are chiefly monomeric (α1),
whereas class II aminoacyl-tRNA synthetases are always oligomeric (usually homodimers).
Furthermore, class I aminoacyl-tRNA synthetases first add the amino acid to
the 2'-OH of the terminal adenylate residue of tRNA before shifting it to the
3'-OH; class II enzymes add it directly to the 3'-OH (Figure 32.5). Only the
3'-aminoacyl-tRNA esters are substrates for protein synthesis.
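The class assignments of Table 32.2 and the acylation route each class follows can be captured in a small lookup. The names below are illustrative; the data are those of the table and of the paragraph above.

```python
# Amino acid specificities of the two synthetase classes (Table 32.2).
CLASS_I = {"Arg", "Cys", "Gln", "Glu", "Ile", "Leu", "Met", "Trp", "Tyr", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}


def synthetase_class(amino_acid):
    """Return 'I' or 'II' for the synthetase that charges this amino acid."""
    if amino_acid in CLASS_I:
        return "I"
    if amino_acid in CLASS_II:
        return "II"
    raise KeyError(f"unknown amino acid: {amino_acid}")


def acylation_route(amino_acid):
    """List the ribose hydroxyls the aminoacyl group occupies, in order.

    Class I enzymes acylate the 2'-OH first and the product is then
    transesterified to the 3'-OH; class II enzymes acylate the 3'-OH directly.
    """
    if synthetase_class(amino_acid) == "I":
        return ["2'-OH", "3'-OH"]
    return ["3'-OH"]


print(synthetase_class("Gln"), acylation_route("Gln"))
print(synthetase_class("Asp"), acylation_route("Asp"))
```

Whatever the route, the list always ends at the 3'-OH, reflecting the fact that only the 3'-esters serve as substrates for protein synthesis.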
Figure 32.6 · Mirror-symmetric interactions of class I versus class II aminoacyl-tRNA synthetases with their tRNA substrates. The two different classes of aminoacyl-tRNA synthetases bind to opposite faces of tRNA molecules. On the left is a space-filling model of the class I glutaminyl-tRNAGln synthetase. Class I synthetases bind to the side of their tRNA substrates shown as closest in this figure (the model tRNA structure is tRNAPhe for purposes of illustration). On the right is a space-filling model of the class II aspartyl-tRNAAsp synthetase; this class of synthetase binds to the side of tRNA closest to it here. (Adapted from Arnez, J. G., and Moras, D., 1997. Structural and functional considerations of the aminoacylation reaction. Trends in Biochemical Sciences 22:211-216, Figure 5)
Class I aminoacyl-tRNA
synthetases have active site structures based on a parallel β-sheet
nucleotide-binding fold (named the Rossmann fold, after its discoverer)
and two conserved sequence motifs (HIGH and KMSKS) that complete the ATP-binding
site. In contrast, class II aminoacyl-tRNA synthetases share a set of conserved
sequence motifs (motifs 1, 2, and 3). Motif 1 forms part of the dimerization
motif, and motifs 2 and 3 contribute essential residues to the active site.
These results suggest that the catalytic domains of these enzymes evolved from
two different ancestral predecessors. Apparently, aminoacyl-tRNA synthetases
are ranked among the oldest proteins because different forms of these enzymes
were present very early in evolution. X-ray crystallographic structures have
been solved for a majority of the 20 aminoacyl-tRNA synthetases. Class I and
class II aminoacyl-tRNA synthetases interact with the tRNA 3'-terminal CCA and
acceptor stem in a mirror-symmetric fashion with respect to each other (Figure
32.6). Class I enzymes bind to the tRNA acceptor stem helix from the minor-groove
side, whereas class II enzymes bind it from the major-groove side.
Figure 32.7 · (a) E. coli glutamyl-tRNA synthetase, a class I enzyme. (b) Thermus thermophilus glycyl-tRNA synthetase, a class II enzyme. (Adapted from Cusack, S., 1995. Eleven down and nine to go. Nature Structural Biology 2:824-831, Figures 2 and 5)
Both class I and class II aminoacyl-tRNA synthetases can be approximated as two-domain structures, as can their L-shaped tRNA substrates, which have the acceptor stem/CCA-3'-OH at one end and the anticodon stem-loop at the other (see Figures 12.38 and 32.8). This L-shaped tertiary structure of tRNAs separates the 3'-CCA acceptor end from the anticodon loop by a distance of 7.6 nm. The two domains of tRNAs have distinct functions: The 3'-CCA end is the site of aminoacylation, and the anticodon-containing domain interacts with the mRNA template. The two domains of tRNAs interact with the separate domains in the synthetases. One of the two major aminoacyl-tRNA synthetase domains is the catalytic domain (which defines the difference between class I and class II enzymes); this domain interacts with the tRNA 3'-CCA end. The other major domain in aminoacyl-tRNA synthetases is highly variable and interacts with parts of the tRNA beyond the acceptor-TψC stem-loop domain, including, in some cases, the anticodon. Ribbon structures of representative class I and class II aminoacyl-tRNA synthetases are shown in Figure 32.7.
Selective tRNA Recognition by Aminoacyl-tRNA Synthetases
Figure 32.8 · Ribbon diagram of tRNA tertiary structure. Numbers represent the consensus nucleotide sequence (see Figure 32.1). The locations of nucleotides recognized by the various aminoacyl-tRNA synthetases are indicated; shown within the boxes are one-letter designations of the amino acids whose respective aminoacyl-tRNA synthetases interact at the discriminator base (position 73), acceptor stem, variable pocket and/or loop, or anticodon. The inset shows additional recognition sites in those tRNAs having a variable loop that forms a stem-loop structure. (Adapted from Saks, M. E., Sampson, J. R., and Abelson, J. N., 1994. The transfer RNA problem: A search for rules. Science 263:191-197, Figure 2)
Aside from the need to uniquely recognize their cognate amino acids, aminoacyl-tRNA synthetases must
be able to discriminate between the various tRNAs. The structural features that
permit the synthetases to recognize and aminoacylate their cognate tRNA(s) are
not universal. That is, a common set of rules does
not govern tRNA recognition by these enzymes. Most surprising is the fact that
the recognition features are not limited to the anticodon and, in some instances,
do not even include the anticodon. For most tRNAs, a set of sequence elements
is recognized by its specific aminoacyl-tRNA synthetase, rather than a single
distinctive nucleotide or base pair. These elements include one or more of the
following: (a) at least one base in the anticodon; (b) one or more of the three
base pairs in the acceptor stem; and (c) the base at canonical position 73 (the
unpaired base preceding the
CCA
end), referred to as the discriminator base because this base is invariant
in the tRNAs for a particular amino acid. Figure 32.8 presents a ribbon diagram
of a tRNA molecule showing the common location of nucleotides that contribute
to specific recognition by the respective aminoacyl-tRNA synthetases for each
of the 20 amino acids. Interestingly, the same set of tRNA features that serves
as positive determinants for binding and aminoacylation of the tRNA by its cognate
aminoacyl-tRNA synthetase may act as negative determinants that prohibit binding
and aminoacylation by other (noncognate) aminoacyl-tRNA synthetases. Because
no common set of rules exists, the second genetic code is an operational
code based on aminoacyl-tRNA synthetase recognition of varying sequence
and structural features in the different tRNA molecules during the operation
of aminoacyl-tRNA synthesis. Some examples of this code are given in Figure
32.9 and the discussion below.
Figure 32.9 · Major identity elements in four tRNA species. Each base in the tRNA is represented by a circle. Numbered filled circles indicate positions of identity elements within the tRNA that are recognized by its specific aminoacyl-tRNA synthetase. (Adapted from Schulman, L. H., and Abelson, J., 1988. Recent excitement in understanding transfer RNA identity. Science 240:1591-1592)
tRNA Recognition Sites in E. coli Glutaminyl-tRNAGln Synthetase

Figure 32.10 · (a) A solvent-accessible representation of E. coli glutaminyl-tRNAGln synthetase complexed with tRNAGln and ATP, derived from analysis of the crystal structure of the complex. The protein is colored blue. The sugar-phosphate backbone of the tRNA is red; its bases are yellow. The protein:tRNA contact region extends along one side of the entire length of this extended protein. The acceptor stem of the tRNA and the ATP (green) fit into a cleft at the top of the protein in this view. The enzyme also interacts extensively with the anticodon (lower tip of tRNAGln). (b) Diagram showing the structure of tRNAGln, as represented by its phosphorus atoms (purple spheres), in complex with E. coli glutaminyl-tRNAGln synthetase, as represented in terms of its Cα atoms (blue). ((a) adapted from Rould, M. A., et al., 1989. Structure of E. coli glutaminyl-tRNA synthetase complexed with tRNAGln and ATP at 2.8 Å resolution. Science 246:1135; photo courtesy of Thomas A. Steitz of Yale University)
E. coli glutaminyl-tRNAGln synthetase is a 63.4-kD (553-residue) class I monomeric enzyme. The crystal structure of glutaminyl-tRNAGln synthetase complexed with tRNAGln reveals that the enzyme shares a continuous interaction with its cognate tRNA that extends from the anticodon to the acceptor stem along the entire inside of the L-shaped tRNA (Figure 32.10). Specific recognition elements include enzyme contacts with the discriminator base, acceptor stem, and anticodon, particularly the central U in the CUG anticodon. The carboxylate group of Asp235 makes sequence-specific H bonds in the tRNA minor groove with the 2-NH2 group of G3 in the base pair G3:C70 of the acceptor stem. A mutant glutaminyl-tRNAGln synthetase with Asn substituted for Asp at position 235 shows relaxed specificity; that is, it now incorrectly acylates a noncognate tRNA with Gln.
The Identity Elements Recognized by Some Aminoacyl-tRNA Synthetases Reside in the Anticodon
Alteration of the anticodons of either tRNATrp or tRNAVal to CAU, the anticodon for the methionine codon AUG, transforms each of these tRNAs into tRNAMet. That is, these tRNAs are now recognized and charged with methionine by methionyl-tRNA synthetase. Similarly, reversing the methionine CAU anticodon of tRNAMet to UAC transforms this tRNA into a tRNAVal. Clearly, methionyl-tRNA synthetase and valyl-tRNA synthetase rely on the anticodon for crucial identity elements.
Five Different Bases in Yeast tRNAPhe Serve as Its Identity Elements
The structure of yeast tRNAPhe is known in great detail (Figure 32.9; see also Figures 12.37 and 12.38). Five of its bases serve as identity elements for the yeast phenylalanyl-tRNAPhe synthetase: the three bases of the anticodon, residue G20 in the D loop, and A73 near the 3'-end. When yeast tRNAArg, tRNAMet, and tRNATyr were altered so that they each contained a complete set of the five identity elements (namely, G20, G34, A35, A36, and A73), they became excellent substrates of yeast phenylalanyl-tRNAPhe synthetase. The G20 of tRNAPhe may be an important discriminatory nucleotide because a G has not been found at this position in any other yeast tRNA.
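The identity-element idea can be expressed as a simple membership test: a tRNA is a candidate substrate when every identity position carries the required base. In the sketch below, the five positions and bases are those given above for yeast tRNAPhe; the "other tRNA" is a placeholder with arbitrary bases at those positions, not a real sequence, and the function names are hypothetical.

```python
# Identity elements of yeast tRNAPhe: position -> required base.
PHE_IDENTITY = {20: "G", 34: "G", 35: "A", 36: "A", 73: "A"}


def has_identity_set(trna_bases, identity=PHE_IDENTITY):
    """True if every identity position carries the required base."""
    return all(trna_bases.get(pos) == base for pos, base in identity.items())


def transplant_identity(trna_bases, identity=PHE_IDENTITY):
    """Return a copy of the tRNA with the identity elements installed."""
    return {**trna_bases, **identity}


# Placeholder bases at the five positions of some other tRNA (NOT real data).
other_trna = {20: "U", 34: "C", 35: "A", 36: "U", 73: "G"}

print(has_identity_set(other_trna))
print(has_identity_set(transplant_identity(other_trna)))
```

This mirrors the transplantation experiments: the rest of the molecule is left untouched, and only the five identity positions are changed.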
Twelve Nucleotides in Common Define the tRNASer Family
Six codons specify serine. Six distinct isoacceptor tRNAs can be aminoacylated by the E. coli seryl-tRNASer synthetase. Five of these tRNAs are the product of E. coli genes; the sixth is encoded by the phage-T4 genome. These 6 tRNAs have anticodons that include UGA, CGA, GGA, and GCU; thus, variations occur at all 3 anticodon base positions. The nucleotide sequences of these 6 tRNAs have been compared and only 12 positions are held in common. These nucleotides include G1, G2, A3 (or U3) and U70 (or A70), C71, C72, and G73 in the acceptor stem, and C11 and G24 in the dihydrouridine (D) stem. All of these nucleotides except G73 are involved in intrachain H bonds (Figure 32.9). When a leucine-specific tRNA was modified so that it shared all 12 tRNASer identities, it was transformed into a tRNASer.
A Single G:U Base Pair Defines tRNAAlas
Figure 32.11 · A microhelix analog of tRNAAla is aminoacylated by alanyl-tRNAAla synthetase, provided it retains the G3:U70 base pair. Substituting C for U at position 70 abolishes its ability to accept Ala. The sequences of tRNAAla/GGC and its microhelix analog are shown. MicrohelixAla consists only of nucleotides 1 through 13 directly connected to 66 through 76 to re-create the tRNAAla 7-bp acceptor stem. (Adapted from Schimmel, P., 1989. Parameters for the molecular recognition of transfer RNAs. Biochemistry 28:2747-2759)
In contrast, a single,
noncanonical base pair, G3:U70, is the principal element by which alanyl-tRNAAla
synthetase recognizes tRNAs as its substrates. All cytoplasmic tRNAAla
representatives that have been sequenced thus far, from archaebacteria to eukaryotes,
possess this G3:U70. Altering the G3:C70 base pairs found in Lys-specific, Cys-specific,
and Phe-specific tRNAs to G3:U70 confers alanine acceptability on these tRNAs.
Altering the unusual G3:U70 base pair of tRNAAla to G:C, A:U, or
even U:G abolishes its ability to be aminoacylated with Ala. On the other hand,
provided the G3:U70 base pair is present, alanyl-tRNAAla synthetase
aminoacylates a 24-nucleotide “microhelix” analog of tRNAAla (Figure
32.11).
Paul
Schimmel has deduced that the key structural feature of the G3:U70 determinant
is the 2-NH2 group of G3. A series of analogs was prepared in which
other base pairs replaced the “G3:U70” base pair
in an analog of tRNAAla (Figure 32.12). Only the original G3:U70
analog was aminoacylated with Ala by Ala-tRNA synthetase. Note that in RNA A-form
double-helical structures such as tRNAs, the G3 2-amino group is exposed in
the minor groove of the helix (Figure 32.12). If G3 base-pairs with U at position
70, this -NH2 is not H-bonded. In the G:C and 2-AP:U analogs, this
-NH2 is not free, but hydrogen-bonded with C or U; the I:U and A:U
analogs lack a 2-NH2. The inverse U3:G70 analog (not shown) places
the 2-NH2 in the major groove. Paul Schimmel and his colleagues thus
concluded that an unpaired guanine 2-amino group at the proper location within
the minor groove earmarks a tRNA for aminoacylation by Ala-tRNA synthetase.
Because this structural feature is common to all cytoplasmic tRNAAlas,
the various tRNA recognition elements must have been decided very early in evolution.
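Schimmel's conclusion amounts to a three-part rule: the 3:70 pair must carry a guanine-type 2-amino group, that group must lie in the minor groove, and it must be free of hydrogen bonding. The sketch below applies this rule to the analogs described above. The class and field names are hypothetical, and the entry for U3:G70 assumes its 2-amino group is unbonded, since the text states only that it lies in the major groove.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair3_70:
    name: str
    has_2_amino: bool      # is a 2-NH2 group present on the purine?
    amino_h_bonded: bool   # is that 2-NH2 tied up in an H bond?
    in_minor_groove: bool  # is the 2-NH2 exposed in the minor groove?


ANALOGS = [
    Pair3_70("G3:U70", True, False, True),      # natural tRNAAla pair
    Pair3_70("G3:C70", True, True, True),       # 2-NH2 H-bonded to C
    Pair3_70("2-AP3:U70", True, True, True),    # 2-NH2 H-bonded to U
    Pair3_70("I3:U70", False, False, False),    # inosine lacks the 2-NH2
    Pair3_70("A3:U70", False, False, False),    # adenine lacks the 2-NH2
    Pair3_70("U3:G70", True, False, False),     # 2-NH2 in the major groove
]


def is_alanine_acceptor(pair):
    """Apply the rule: a free 2-amino group exposed in the minor groove."""
    return pair.has_2_amino and pair.in_minor_groove and not pair.amino_h_bonded


accepted = [pair.name for pair in ANALOGS if is_alanine_acceptor(pair)]
print(accepted)
```

Each rejected analog fails for a different reason, which is what allowed the single determinant to be isolated.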
Figure 32.12 · Structures of various base pairs in relationship to the 3:70 position in tRNA molecules. The double-helical regions of transfer RNAs adopt the A-form double-helical conformation of nucleic acids (Chapter 12), which has a deep, narrow major groove on one side and a shallow, wide minor groove on the other.
32.4 · Codon-Anticodon Pairing, Third-Base Degeneracy, and the Wobble Hypothesis
Figure 32.13 · Codon-anticodon pairing. Complementary trinucleotide sequence elements align in antiparallel fashion.
Protein synthesis depends on the codon-directed binding of the proper aminoacyl-tRNAs so that the right amino acids are sequentially aligned according to the specifications of the mRNA undergoing translation. This alignment is achieved via codon-anticodon pairing in antiparallel orientation (Figure 32.13). However, considerable degeneracy exists in the genetic code at the third position. Conceivably, this degeneracy could be handled in either of two ways: (a) codon-anticodon recognition could be highly specific so that a complementary anticodon is required for each codon, or (b) fewer than 61 anticodons could be used for the “sense” codons if certain allowances were made in the base-pairing rules. In the second case, some anticodons could recognize more than one codon. Nirenberg demonstrated as early as 1965 that poly(U) bound all the Phe-tRNAPhe even though UUC is also a Phe codon. This result suggested that the phenylalanine-specific tRNAs could recognize both UUU and UUC. The yeast tRNAAla isolated by Robert Holley in 1965 (Chapter 12) bound to three codons: GCU, GCC, and GCA.
The Wobble Hypothesis
Figure 32.14 · Various base-pairing alternatives. (a) G:A is unlikely because the 2-NH2 of G cannot form one of its H bonds; even water is sterically excluded. U:C may be possible even though the two C=O are juxtaposed. Two U:U arrangements are feasible. G:U and I:U are both possible and somewhat similar. The purine pair I:A is also possible. (b) The relative positions of the glycosidic C1' atoms in various base-pairing alternatives. The positional variation seen for the codon C1' carbon atom is a measure of wobble. The U:C, C:U, and either of the two possible U:U base pairs bring the respective glycosidic C1' atoms closer than the standard position; C1' atoms in I:U, G:U, and U:G pairs are spaced similarly to the standard; the I:A pair moves them farther apart. (Adapted from Crick, F. H. C., 1966. Codon-anticodon pairing: The wobble hypothesis. Journal of Molecular Biology 19:548-555.)
Francis Crick considered these results and tested alternative base-pairing possibilities by model building. He hypothesized that the first two bases of the codon and the last two bases of the anticodon form canonical Watson-Crick A:U or G:C base pairs, but pairing between the third base of the codon and the first base of the anticodon follows less stringent rules. That is, a certain amount of play, or wobble, might occur in base pairing at this position. The first base of the anticodon is sometimes referred to as the wobble position. Crick examined the steric consequences of various noncanonical base pairs. The purine inosine was included because it was known to be a component of tRNAs. In some pairs, the bases were rather close together, as revealed by the relative positions of their respective C1' atoms (Figure 32.14). In Figure 32.14b, the C1' of the first nucleotide in the anticodon is taken as fixed and the relative position of the corresponding codon third-nucleotide C1' is shown. The genetic code must often distinguish between pyrimidines (U or C) and purines (A or G) in the third position (as in the codons for Phe versus Leu or His versus Gln). Therefore, pairing possibilities that bring the two C1' atoms close to one another (as in the two U···U possibilities and the U···C/C···U possibility in Figure 32.14b) must not be tolerated. Otherwise, anticodon U would not interact specifically with A or G but instead would indiscriminately read any base in the third position of the codon, contributing a U:U, U:C, U:A, or U:G pairing to the anticodon-codon interaction.
| Table 32.3 | |
| Base-Pairing Possibilities at the Third Position of the Codon | |
| Base on the Anticodon | Bases Recognized on the Codon |
| U | A, G |
| C | G |
| A | U |
| G | U, C |
| I | U, C, A |
| Source: Adapted from Crick, F. H. C., 1966. Codon-anticodon pairing: The wobble hypothesis. Journal of Molecular Biology 19:548-555. | |
This constraint leads
to a set of rules for pairing between the third base of the codon and the first
base of the anticodon (Table 32.3). The wobble rules indicate that a first-base
anticodon U could recognize either an A or G in the codon third-base position;
first-base anticodon G might recognize either U or C in the third-base position
of the codon; and first-base anticodon I might interact with U, C, or A in the
codon third position.2
Note
that inosine is a versatile base in establishing degeneracy. (Inosine arises
in tRNAs from specific A residues that undergo deamination.) Yeast tRNAAla
(Figure 12.36) has I in the wobble position. The wobble rules also predict that
four-codon families (like Pro or Thr), where any of the four bases may be in
the third position, require at least two different tRNAs. Such four-codon families
could be read by two tRNAs whose recognition patterns are either UC and AG or
are UCA and G.
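The wobble rules of Table 32.3 can be applied mechanically. The following is a minimal sketch, not a definitive implementation: the function name `codons_read` and the dictionaries are illustrative, anticodons and codons are both written 5'→3', and the anticodon IGC used for yeast tRNAAla is an assumption inferred from the text (I in the wobble position, reading GCU, GCC, and GCA).

```python
# Wobble rules from Table 32.3: first base of the anticodon -> third bases of
# the codon that it can read.
WOBBLE = {"U": "AG", "C": "G", "A": "U", "G": "UC", "I": "UCA"}

# Canonical Watson-Crick partners, used for the other two positions.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read(anticodon):
    """Return the codons (5'->3') that an anticodon (5'->3') can read.

    Pairing is antiparallel: the last anticodon base pairs with the first
    codon base, and the first anticodon base (the wobble position) pairs
    with the third codon base.
    """
    wobble_base, middle, last = anticodon
    first = WATSON_CRICK[last]
    second = WATSON_CRICK[middle]
    return [first + second + third for third in WOBBLE[wobble_base]]


# Assumed anticodon for yeast tRNAAla (I in the wobble position).
print(codons_read("IGC"))
```

Under these assumptions the sketch returns the three alanine codons that Holley's tRNAAla was observed to bind. Only the first anticodon base is allowed to wobble; the other two positions are restricted to canonical pairs, as Crick proposed.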
The only codons
for a given amino acid that differ in either of the first two bases are the
6-codon families for Leu, Ser, and Arg; these amino acids require at least three
different tRNAs. Altogether, a minimum of 32 tRNAs are necessary to interpret
the 61 sense codons. However, most cells have more than 32 different tRNA species.
We saw that E. coli has 5 distinct tRNAs for the 6 Ser codons. Some tRNAs
have the same anticodon but differ in their nucleotide sequences. For example,
there are two distinct tRNATyr species in E. coli, both having
a GUA anticodon capable of reading the UAU and UAC codons for Tyr. All members
of the set of tRNAs specific for a particular amino acid—termed isoacceptor
tRNAs—are served by one aminoacyl-tRNA synthetase.
2Thus, the first base of the anticodon indicates whether the tRNA can read one, two, or three different codons: anticodons beginning with A or C read only one codon, those beginning with G or U read two, while anticodons beginning with I can read three codons.
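The counting argument above can be checked by brute force. The sketch below is illustrative only: it assumes the standard genetic code (written as a 64-character string in UCAG order), applies the Table 32.3 rules, and forbids any tRNA from reading a codon that belongs to a different amino acid or to a stop signal. It counts only the tRNAs needed to read the 61 sense codons during elongation; a separate initiator tRNA, which the sketch does not model, would add one more. The names `min_trnas_for_group` and `minimum_trna_count` are hypothetical.

```python
from itertools import combinations

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Table 32.3: anticodon wobble base -> codon third bases it can read.
WOBBLE = {"U": {"A", "G"}, "C": {"G"}, "A": {"U"}, "G": {"U", "C"},
          "I": {"U", "C", "A"}}


def min_trnas_for_group(third_bases):
    """Fewest wobble bases whose reading sets exactly cover third_bases."""
    # A tRNA is usable only if it reads nothing outside this amino acid's codons.
    usable = [reads for reads in WOBBLE.values() if reads <= third_bases]
    for n in range(1, len(usable) + 1):
        for combo in combinations(usable, n):
            if set().union(*combo) == third_bases:
                return n
    raise ValueError("no combination of wobble bases covers these codons")


def minimum_trna_count():
    """Sum the minimum over every amino acid in every first-two-base box."""
    total = 0
    for i in range(4):
        for j in range(4):
            groups = {}
            for k, third in enumerate(BASES):
                amino_acid = AMINO[16 * i + 4 * j + k]
                if amino_acid != "*":          # stop codons need no tRNA
                    groups.setdefault(amino_acid, set()).add(third)
            total += sum(min_trnas_for_group(g) for g in groups.values())
    return total


print(minimum_trna_count())
```

Each four-codon family contributes two tRNAs, each two-codon set contributes one, and the six-codon families for Leu, Ser, and Arg end up with three apiece, in agreement with the reasoning in the text.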
The Purpose of Wobble
The first two bases of the codon confer most of the codon-anticodon specificity. The wobble position also contributes to codon recognition and specificity, but hydrogen bonds between noncanonical base pairs are weaker, and thus the pairing here is “looser.” Wobbling is possible because the 5'-side of the anticodon is situated in a conformationally flexible part of the tRNA anticodon loop. There is a kinetic advantage to wobble: If all three base pairs of the codon-anticodon complex were of the strong Watson-Crick type, codon-anticodon associations would be more stable and the tRNAs would dissociate less readily from the mRNA, slowing the rate of protein synthesis. Because the wobble position makes only a marginal contribution to the stability of the codon-anticodon interaction, wobble tends to accelerate the process of translation.
Because more than one
codon exists for most amino acids, the possibility for variation in codon usage
arises. Indeed, variation in codon usage accommodates the fact that the DNA
of different organisms varies in relative A:T/G:C content. However, even in
organisms of average base composition, codon usage may be biased. Table 32.4
gives some examples from E. coli and humans reflecting the nonrandom usage of
codons. Of over 109,000 Leu codons tabulated in human genes, CUG was used over
48,000 times, CUC over 23,000 times, but UUA just 6,000 times.
The
occurrence of codons in E. coli mRNAs correlates well with the relative
abundance of the tRNAs that read them. Preferred codons are represented by
the most abundant isoacceptor tRNAs. Further, mRNAs for proteins that are
synthesized in abundance tend to employ preferred codons. Rare tRNAs correspond
to rarely used codons, and messages containing such codons might experience
delays in translation.
Mutations that alter a
sense codon to one of the three nonsense codons—UAA, UAG, or UGA—result in premature
termination of protein synthesis and the release of truncated (incomplete) polypeptides.
Geneticists found that second mutations elsewhere in the genome were able to
suppress the effects of nonsense mutations so that the organism survived.
The molecular basis for such intergenic suppression was a mystery until it was
realized that suppressors were mutations in tRNA genes that altered the
anticodon so that the mutant tRNA could now read a particular “stop” codon and
insert an amino acid. For example, alteration of the anticodon of a tRNATyr
from GUA to CUA allows this tRNA to read the so-called amber stop codon,
UAG, and insert Tyr. (The nonsense codons are whimsically named amber
[UAG], ochre [UAA], and opal [UGA].) Suppressor tRNAs are
typically generated from minor tRNA species within a set of isoacceptor tRNAs,
so their recruitment to a new role via mutation does not involve loss of an
essential tRNA; that is, the mutation is not particularly deleterious to the
organism. Several different suppressor tRNAs for each of the stop codons have
been characterized in E. coli.
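The effect of a nonsense mutation, and of its suppression, can be illustrated with a toy translation routine. This is a minimal sketch under stated assumptions: the short mRNA sequences are invented for illustration, the `translate` function and its `suppressors` argument are hypothetical, and suppression is treated as all-or-none, which is a simplification.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(mrna, suppressors=None):
    """Translate an mRNA (5'->3', in frame) into one-letter amino acid codes.

    suppressors maps a stop codon to the amino acid inserted by a
    suppressor tRNA, e.g. {"UAG": "Y"} for an amber suppressor tRNATyr.
    """
    suppressors = suppressors or {}
    peptide = []
    for pos in range(0, len(mrna) - 2, 3):
        codon = mrna[pos:pos + 3]
        residue = suppressors.get(codon, CODON_TABLE[codon])
        if residue == "*":      # unsuppressed stop codon: release the chain
            break
        peptide.append(residue)
    return "".join(peptide)


wild_type = "AUGUACGGCUAA"      # invented example: Met-Tyr-Gly-stop
amber_mutant = "AUGUAGGGCUAA"   # UAC -> UAG creates a premature amber codon

print(translate(wild_type))
print(translate(amber_mutant))
print(translate(amber_mutant, suppressors={"UAG": "Y"}))
```

Without a suppressor the mutant message yields a truncated product; with the amber suppressor tRNATyr (anticodon CUA) the UAG codon is read as Tyr and translation continues to the natural stop codon.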
An interesting
amber suppressor mutation results when the anticodon of a tRNATrp
is altered from CCA to CUA. Surprisingly, this suppressor tRNA inserts glutamine,
not tryptophan. Thus, suppressor tRNAs don’t necessarily carry the same amino
acid as the wild-type tRNA. (Cells carrying this suppressor must also have a
wild-type copy of tRNATrp to survive.) This suppressor tRNA is no
longer a good substrate for tryptophanyl-tRNATrp synthetase, which
evidently selects its tRNAs via anticodon recognition. Instead, glutaminyl-tRNA
synthetase charges it with Gln. Apparently, glutaminyl-tRNA synthetase has a
relaxed specificity for this tRNA substrate. A single base change has influenced
both codon-anticodon recognition and the interaction of the tRNA with aminoacyl-tRNA
synthetases.
| Table 32.4 | |||
| Representative Examples of Codon Usage in E. coli and Human Genes | |||
| The results are expressed as frequency of occurrence of a codon per 1000 codons tabulated in 1562 E. coli genes and 2681 human genes, respectively. (Because E. coli and human proteins differ somewhat in amino acid composition, the frequencies for a particular amino acid do not correspond exactly between the two species.) | |||
| Amino Acid | Codon | E. coli Gene Frequency/1000 | Human Gene Frequency/1000 |
| Leu | CUA | 3.2 | 6.1 |
| Leu | CUC | 9.9 | 20.1 |
| Leu | CUG | 54.6 | 42.1 |
| Leu | CUU | 10.2 | 10.8 |
| Leu | UUA | 10.9 | 5.4 |
| Leu | UUG | 11.5 | 11.1 |
| Pro | CCA | 8.2 | 15.4 |
| Pro | CCC | 4.3 | 20.6 |
| Pro | CCG | 23.8 | 6.8 |
| Pro | CCU | 6.6 | 16.1 |
| Ala | GCA | 15.6 | 14.4 |
| Ala | GCC | 34.4 | 29.7 |
| Ala | GCG | 32.9 | 7.2 |
| Ala | GCU | 13.4 | 18.9 |
| Lys | AAA | 36.5 | 21.9 |
| Lys | AAG | 12.0 | 35.2 |
| Glu | GAA | 43.5 | 26.4 |
| Glu | GAG | 19.2 | 41.6 |
| Adapted from Wada, K., et al., 1992. Codon usage tabulated from GenBank genetic sequence data. Nucleic Acids Research 20:2111-2118. | |||
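Codon bias is easier to compare between organisms when the per-1000 frequencies are converted into the fraction of an amino acid's codons that each synonym accounts for. The sketch below does this for the six Leu codons, using the values from Table 32.4; the names `LEU_PER_1000` and `relative_usage` are illustrative.

```python
# Frequencies per 1000 codons for the six Leu codons, copied from Table 32.4.
LEU_PER_1000 = {
    "E. coli": {"CUA": 3.2, "CUC": 9.9, "CUG": 54.6,
                "CUU": 10.2, "UUA": 10.9, "UUG": 11.5},
    "human": {"CUA": 6.1, "CUC": 20.1, "CUG": 42.1,
              "CUU": 10.8, "UUA": 5.4, "UUG": 11.1},
}


def relative_usage(per_1000):
    """Convert per-1000 frequencies into fractions of the synonymous set."""
    total = sum(per_1000.values())
    return {codon: freq / total for codon, freq in per_1000.items()}


for organism, freqs in LEU_PER_1000.items():
    usage = relative_usage(freqs)
    ranked = sorted(usage, key=usage.get, reverse=True)
    print(organism, [(codon, round(usage[codon], 2)) for codon in ranked])
```

For human genes, CUG accounts for roughly 44% of Leu codons by this calculation, in line with the counts quoted in the text (over 48,000 of some 109,000 Leu codons).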