Chapter 29

DNA: Genetic Information, Recombination, and Mutation
Shakespeare's quote speaks to the ephemeral nature of life. Inheritance, throughout life's long history a matter of chance, now can be manipulated by scientists. Dolly, the first mammal to be cloned from an adult cell, is shown here with her naturally conceived first lamb, Bonnie. Hello, Dolly! (AP/Wide World Photos)
The fact that DNA is the material of heredity is common knowledge today, even though no one could have successfully defended such a proposition before the last half of the twentieth century. Heredity, which we can define generally as the tendency of an organism to possess the characteristics of its parent(s), was clearly evident throughout nature and since the dawn of history had served to justify the classification of organisms according to shared similarities. The molecular basis of heredity, however, was not obvious. Early geneticists demonstrated that genes, the elements or units carrying and transferring inherited characteristics from parent to offspring, are contained within the nuclei of cells in association with the chromosomes. Yet the chemical identity of genes remained unknown, and genetics was an abstract science. Even the realization that chromosomes are composed of proteins and nucleic acids did little to define the molecular nature of the gene because at the time no one understood either of these substances.
29.1 × The Discovery That DNA Carries Genetic Information
The material of heredity should have certain properties:
1. It must be very stable so that genetic information can be stored in it and transmitted countless times to subsequent generations.
2. It must be capable of precise copying or replication so that its information is not lost or altered.
3. Although stable, it must also be subject to change in order to account, in the short term, for the appearance of mutant forms and, in the long term, for evolution.
The first evidence
that deoxyribonucleic acid, or DNA, might be the material of heredity
came from investigations on Streptococcus pneumoniae, one of the types
of bacteria that cause pneumonia. In 1928, Frederick Griffith, an English microbiologist,
was comparing the properties of two strains of pneumococcus bacteria. One strain,
Type S (S for smooth colonial morphology), is virulent because it is
enclosed within a slippery polysaccharide coat, or capsule, that protects it
from the immune system of its host. The other strain, Type R (R for rough-looking
colonies), lacks an enzyme for the biosynthesis of the polysaccharide coat and
is not virulent because it cannot resist attack by the host's immune system.
When Griffith injected Type S bacteria into mice, the blood became filled with
S bacteria and the mice died. Heat-killed Type S bacteria had no effect on the
mice, but if mice were injected with nonvirulent Type R bacteria that had been
mixed with heat-killed Type S bacteria, the mice died and virulent Type S bacteria
could be recovered from their blood. Somehow, the heat-killed Type S bacteria
had transformed the nonvirulent R Type into the virulent S Type (Figure 29.1).
In 1931, M.H. Dawson and R. H. P. Sia showed that extracts of heat-killed Type
S cells could transform nonpathogenic R cells into genetically stable, pathogenic
S cells.
Figure 29.1 × Griffith's experiment on pneumococcal transformation: (a) Mice are resistant to Type R Streptococcus pneumoniae bacteria, but (b) are killed by injection with virulent Type S S. pneumoniae bacteria. (c) Injection with heat-killed virulent bacteria does not kill mice, but (d) if heat-killed Type S bacteria are mixed with nonvirulent Type R bacteria, they have the capacity to transform nonvirulent Type R bacteria into the virulent Type S form.
The "Transforming Principle" Is DNA
In 1944, Oswald T. Avery and his associates Colin M. MacLeod and Maclyn McCarty at the Rockefeller Institute made the discovery that the substance active in transforming Type R bacteria to virulence was, in fact, DNA. This finding was surprising and not immediately accepted because most scientists at the time thought that proteins, substances chemically more complex and diverse than nucleic acids, were the genetic material. Avery, MacLeod, and McCarty showed that highly purified preparations of "transforming principle" contained no detectable protein and were unaffected by trypsin or chymotrypsin (two proteolytic enzymes) or by pancreatic RNase (which hydrolyzes RNA). However, the transforming substance was readily inactivated by treatment with pancreatic DNase, an enzyme that specifically degrades DNA. Thus, DNA must have been the agent carrying the information that transforms R bacteria to virulence. Because transformation was stably inherited, DNA merited strong consideration as the actual material of heredity.
DNA Is the Hereditary Molecule of Bacteriophage
Further proof that DNA
is the material of heredity came from the study of bacteriophage. In 1952, Alfred
Hershey and Martha Chase devised an elegant experiment to trace the fates of
the two major components of bacteriophage (coat
protein and DNA) following infection. They
took advantage of the fact that nucleic acids lack sulfur and proteins lack
phosphorus to uniquely label bacteriophage DNA with 32P and bacteriophage
protein with 35S. Bacteriophage labeled with either isotope were
obtained from cultures of bacteriophage T2 grown on Escherichia coli
in medium containing radioactive 32P-labeled inorganic phosphate
or radioactive 35S-labeled methionine.
Figure 29.2 × Electron micrograph of a bacteriophage particle attached to a bacterial cell. A single T4 bacteriophage weighs 5 × 10^-13 g and consists of 60% DNA and 40% protein. Its volume is about 1/1000 the volume of an E. coli cell. T4 phage heads are 100 nm × 65 nm icosahedra attached to tails 100 nm long by 25 nm in diameter. (J. Broek/Biozentrum, University of Basel Science Photo Library)
Phage infection
of bacteria involves attachment of the bacteriophage to the bacterial cell at
specific attachment sites. The phage DNA enters the bacterial cell, leaving
its protein coat behind on the surface of the bacterium (Figure 29.2). Hershey
and Chase mixed labeled bacteriophage T2 with unlabeled E. coli cells,
permitting sufficient time for the phage to attach. Then they vigorously agitated
the culture in a blender to shear the phage coats from the bacterial surface.
Following centrifugation of the culture, infected bacteria could be recovered
in the pellet, whereas the phage coats containing most of the 35S
label remained suspended in the supernatant. In contrast, when E. coli
cells were infected with 32P-labeled T2 phage, the bacterial pellet
contained most of the 32P. Furthermore, upon lysis, 30% of the original
32P but only 1% of the 35S was recovered in the bacteriophage
progeny produced by the infection (Figure 29.3). Hershey and Chase surmised
that the bacteriophage DNA was sufficient for bacteriophage reproduction. That
is, DNA must be the material of heredity.
Figure 29.3 × The Hershey and Chase experiment demonstrated that the DNA component of bacteriophage T2 carries the requisite genetic information for bacteriophage reproduction.
29.2 × Genetic Information in Bacteria: Its Organization, Transfer, and Rearrangement
Bacteria are very useful organisms for genetic analysis. Under optimal conditions of growth and reproduction, some bacteria (such as E. coli) divide every 20 minutes, the progeny of each division being a new generation. A genetic experiment can be completed with bacteria in hours, whereas an analogous experiment with a multicellular organism would take months or years because the generation times of such organisms are months or even years in duration. Further, a single milliliter of bacterial culture can contain enormous numbers of bacteria (as many as 10^10), all derived from a single parental bacterium. Because of these vast numbers, very rare genetic events can be observed. That is, a one-in-a-million occurrence could be present in thousands of bacteria in a culture. In addition, because bacteria are haploid organisms (organisms with only one chromosome or one set of chromosomes), each cell contains but one set of genetic instructions. Consequently, any mutation in a gene is not masked or corrected by a second, normal copy of the gene, as it usually is in diploid organisms (organisms having two, essentially duplicate, sets of chromosomes). In haploid organisms like bacteria, the phenotype, or perceptible characteristics of the organism, reflects its genotype, or genetic composition. In contrast, diploid organisms may exhibit a wild-type, or normal, phenotype for any trait, even though their genotype might contain one mutant copy and one wild-type copy of the gene responsible for the trait.
A single bacterium growing with a generation time of 20 min can give rise to 10^10 progeny in about 11 hr. N, the number of cells after n generations, is given by N = 2^n. For N = 10^10 = 2^n, n = 33.22 (2^33.22 ≈ 10^10). At 0.33 hr per generation, 33.22 generations (the time to accumulate 10^10 cells from a single bacterium) take about 11 hours.
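The doubling arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes ideal, uninterrupted exponential growth from a single cell, and the function names are illustrative rather than taken from any source.

```python
import math

def generations_needed(target_cells, start_cells=1.0):
    """Number of doublings n such that start_cells * 2**n equals target_cells."""
    return math.log2(target_cells / start_cells)

def hours_needed(target_cells, generation_min=20.0):
    """Hours to reach target_cells from one cell at a fixed doubling time."""
    return generations_needed(target_cells) * generation_min / 60.0

n = generations_needed(1e10)   # doublings to reach 10^10 cells
t = hours_needed(1e10)         # elapsed time at 20 min per generation
rare = 1e10 * 1e-6             # cells carrying a one-in-a-million event

print(f"generations: {n:.2f}")
print(f"hours: {t:.2f}")
print(f"cells with a 1-in-10^6 event: {rare:.0f}")
```

With an exact 20-minute generation time the result is just over 11 hours; rounding the generation time to 0.33 hr gives just under 11.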

Mapping the Structure of Bacterial Chromosomes
In 1946, Joshua Lederberg and Edward Tatum discovered that genetic information could be transferred between bacteria. They used two strains of E. coli that differed in their growth requirements due to mutations each carried (Figure 29.4). One strain (thr-, leu-, thi-) required threonine, leucine, and thiamine to grow; the other (phe-, cys-, bio-) required phenylalanine, cystine, and biotin. These two strains were mixed together and spread on the surface of a petri plate of minimal medium lacking any of the required supplements. After a day, a very small number of bacterial colonies were observed to be growing. Somehow, these growing bacteria had acquired functional (wild-type) copies of each of the mutant genes. This remarkable result suggested strongly that the chromosomes of the two different cell types were brought together in a process akin to sexual exchange. In order for the progeny cells (which contain but one chromosome) to have acquired genetic information from the parental strains, genetic recombination must have occurred. This represents, in the words of Lederberg and Tatum, "the assortment of genes in new combinations." Apparently, at some point, parental DNA molecules must have aligned along regions of homology (sequence similarity), and segments from one of these molecules must have been interchanged with similar segments from the other parent so that some DNA molecules (chromosomes) now carried wild-type thr+ leu+ thi+ phe+ cys+ bio+ genes (Figure 29.4). Lederberg and Tatum speculated that, in order for the various genes to have had the opportunity to recombine, the cells of one strain must have interacted with the cells of the other.
Sexual Conjugation in Bacteria
Figure 29.5 × Electron micrograph of two E. coli cells, one F+, the other F-, joined in sexual conjugation. The pilus joining them is indicated by the arrow. (Fred Marsik/Visuals Unlimited)
The transfer of DNA between
bacteria takes place via a process known as sexual conjugation, a phenomenon
unsuspected prior to the Lederberg-Tatum experiment. Bacterial cells sometimes
contain, in addition to their chromosome, extrachromosomal DNA molecules called
plasmids (see Chapter
13). Plasmids represent "extra" or auxiliary genetic information.
Bacterial cells are capable of conjugation if they possess a particular plasmid
called the F factor (F for fertility). Such F+, or donor,
cells have thin, hollow tubes projecting from their surface known as sex
pili or F pili (singular = pilus). One or more pili can bind
to specific receptors on the surface of cells that lack an F factor (F-,
or recipient, cells; Figure 29.5). The pilus provides a connection between
the two cells. Upon conjugation, a single strand of the F factor is passed to
the F- cell, where its complementary strand is synthesized (Figure
29.6). The recipient F- cell thus becomes F+ by virtue
of now having a double-stranded F factor plasmid. The F factor plasmid consists
of about 94,000 base pairs; about one-third of this DNA is devoted to about
25 genes that function specifically in the transfer of genetic material from
F+ to F- cells. Among these genes are those necessary
for the formation of pili. In reality, the F factor is an infectious agent.
Figure 29.6 × Diagram showing the transfer of F factor from an F+ to an F- cell. A single strand of the F factor is nicked and transferred into the recipient F- cell. The complementary strand is then synthesized within the recipient F- cell to create a new double-stranded F factor, transforming the F- cell into an F+ one.
Figure 29.7 × Transfer of segments of the bacterial chromosome from the donor Hfr to the recipient F- cell. Because complete transfer of the Hfr chromosome rarely happens and because replication begins at a site within the F factor, transfer of the entire F factor is seldom achieved. Thus, the recipient cell usually remains F-. The E. coli chromosome can be mapped by interrupted mating of Hfr strains with F- strains. The genetic markers here are thr-, leu- (requirement for threonine, or leucine, in order to grow), gal-, lac- (inability to grow on galactose, or lactose, as sole carbon source), aziR (resistance to azide), tonR (resistance to bacteriophage T1), and strR (resistance to streptomycin). The superscripts +, R, and S denote wild-type, resistance, and sensitivity, respectively. (a) Ordered transfer of the chromosome during mating. Mating is interrupted by the shearing of the joined cells in a blender at chosen intervals; this separates cells at various stages in the transfer of the Hfr chromosome. The cells are then plated onto selective medium and scored for their sensitivity to bacteriophage T1 and azide and their ability to grow on galactose or lactose as sole carbon source. (b) The frequencies of genetic markers aziS, tonS, lac+, and gal+ among the recombinants as a function of mating time. Extrapolation to zero gives an indication of when the various markers enter the recipient cell. (Adapted from Jacob, F., and Wollman, E., 1961. Sexuality and the Genetics of Bacteria, New York: Academic Press, p. 135)
High Frequency of Recombination
In rare instances, the F factor will integrate into the bacterial host chromosome. (Plasmids capable of chromosomal integration are termed episomes.) Cells harboring F factor integrated into the chromosome show a much higher frequency of recombination of chromosomal genes upon conjugation, or "mating," with F- cells and so are referred to as Hfr cells, for "high frequency of recombination." In Hfr cells, the conjugal process determined by the F factor operates as it does when the F factor is acting autonomously (Figure 29.7). That is, a single strand is passed to the recipient F- cell, where its complementary strand is synthesized. However, because of its integrated position within the Hfr chromosome, the F factor carries along genes adjacent to it on the chromosome. If conjugation continues long enough, a single-stranded copy of the entire host chromosome is passed to the F- cell. However, conjugation rarely persists the 100 minutes or more required for complete transfer, so usually only part of the Hfr chromosome is transferred.
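The logic of mapping by interrupted mating can be sketched in code: markers enter the recipient in a fixed order, so a marker appears among the recombinants only if mating lasted at least until its entry time. The entry times below are illustrative assumptions chosen for the example, not values read from Figure 29.7, and the function names are hypothetical.

```python
# Assumed entry times (minutes after mating begins); illustration only.
ENTRY_TIME_MIN = {"azi": 9.0, "ton": 11.0, "lac": 18.0, "gal": 25.0}

def map_order(entry_times):
    """Markers ordered by the time at which they enter the recipient."""
    return sorted(entry_times, key=entry_times.get)

def markers_transferred(interrupt_min, entry_times):
    """Markers already in the recipient when mating is interrupted."""
    return [m for m in map_order(entry_times)
            if entry_times[m] <= interrupt_min]

# Interrupt mating at several times and list the markers transferred so far.
for stop in (5, 10, 20, 30):
    print(stop, markers_transferred(stop, ENTRY_TIME_MIN))
```

The order in which markers first appear as mating time increases gives their order along the chromosome, and the differences in entry time give the map distances in minutes.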
Figure 29.8 × The genetic map of the E. coli chromosome. This circular map is divided into 100 minutes. The 100 minutes arose historically as the time period necessary for complete gene transfer in interrupted mating experiments. The marker thrL is arbitrarily chosen as minute 0. The complete sequence of the E. coli genome (Science (1997) 277:1453-1474) encompasses 4405 open reading frames (ORFs) encoding some 4289 proteins.
29.3 × The Molecular Mechanism of Recombination
Genetic recombination is the natural process by which genetic information is rearranged to form new associations. Such recombination is a powerful genetic and evolutionary force that reshapes the genomes of all organisms. At the molecular level, genetic recombination is the exchange (or incorporation) of one DNA sequence with (or into) another. For example, homologous recombination involves an exchange of DNA sequences between homologous chromosomes, resulting in the arrangement of genes into new combinations. The process underlying homologous recombination is termed general recombination because the enzymatic machinery that mediates the exchange can use essentially any pair of homologous DNA sequences as substrates. Homologous recombination occurs during the production of gametes (meiosis) in diploid organisms. In higher animals, that is, those with immune systems, recombination also occurs in the DNA of somatic cells responsible for expressing proteins of the immune response, such as the immunoglobulins. This somatic recombination rearranges the immunoglobulin genes, dramatically increasing the potential diversity of immunoglobulins available from a fixed amount of genetic information (see Section 29.4). Homologous recombination can also occur in bacteria. Indeed, even viral chromosomes undergo recombination. For example, if two mutant viral particles simultaneously infect a host cell, a recombination event between the two viral genomes can lead to the formation of a recombinant viral chromosome that is wild-type.
Figure 29.9 × Meselson and Weigle's experiment demonstrated that a physical exchange of chromosome parts actually occurs during recombination. Density-labeled, "heavy" phage, symbolized as ABC phage in the diagram, was used to coinfect bacteria along with "light" phage, the abc phage. The progeny from the infection were collected and subjected to CsCl density gradient centrifugation. Parental-type ABC and abc phage were well separated in the gradient, but recombinant phage (ABc, Abc, aBc, aBC, and so on) were distributed diffusely between the two parental bands because they contained chromosomes constituted from fragments of both "heavy" and "light" DNA. These recombinant chromosomes formed by breakage and reunion of parental "heavy" and "light" chromosomes.
General Recombination
Recombination occurs by the breakage and reunion of DNA strands, so that a physical exchange of parts takes place. Matthew Meselson and J. J. Weigle demonstrated this in 1961 by coinfecting E. coli with two genetically distinct bacteriophage λ strains, one of which had been density-labeled by growth in 13C- and 15N-containing media (Figure 29.9). The phage progeny were recovered and separated by CsCl density gradient centrifugation. Phage particles that displayed recombinant genotypes were distributed throughout the gradient, while parental (nonrecombinant) genotypes were found within discrete "heavy" and "light" bands in the density gradient. The results showed that recombinant phage contained DNA derived in varying proportions from both parents. The obvious explanation is that these recombinant DNAs arose via the breakage and rejoining of DNA molecules.
Figure 29.10 × The generation of progeny bacteriophage of two different genotypes from a single phage particle carrying a heteroduplex DNA region within its chromosome. The heteroduplex DNA is composed of one strand that is genotypically XYZ (the + strand), and the other strand that is genotypically XyZ (the - strand). That is, the genotype of the two parental strands for gene Y is different (one is Y, the other y).
A second important observation made during this type of experiment was that some of the plaques formed by the phage progeny contained phage of two different genotypes, even though each plaque was caused by a single phage infecting one bacterium. Therefore, some infecting phage chromosomes must have contained a region of heteroduplex DNA, duplex DNA in which a part of each strand is contributed by a different parent (Figure 29.10).
The Holliday Model
In 1964, Robin Holliday proposed a model for homologous recombination that has proven influential (Figure 29.11). The two homologous DNA duplexes are first juxtaposed so that their sequences are aligned. This process of chromosome pairing is called synapsis (Figure 29.11a). Holliday suggested that recombination begins by introduction of single-stranded nicks at homologous sites on the two paired chromosomes (Figure 29.11b). The two duplexes partially unwind, and the free, single-stranded end of one duplex begins to base-pair with its nearly complementary, single-stranded
Figure 29.11 × The Holliday model for homologous recombination. The + signs and - signs label strands of like polarity. For example, assume that the two strands running 5' → 3' as read left to right are labeled +, and the two strands running 3' → 5' as read left to right are labeled -. Only strands of like polarity exchange DNA during recombination. (See text for detailed description.)
Figure 29.12 × Model of RecBCD-dependent initiation of recombination. (a) RecBCD binds to a duplex DNA end and its helicase activity begins to unwind the DNA double helix. "Rabbit ears" of ssDNA loop out from RecBCD because the rate of DNA unwinding exceeds the rate of ssDNA release by RecBCD. (b) As RecBCD unwinds the DNA, SSB (and some RecA) bind to the single-stranded regions; the RecBCD endonuclease activity randomly cleaves the ssDNA, showing a greater tendency to cut the 3'-terminal strand rather than the 5'-terminal strand. (c) When RecBCD encounters a properly oriented χ site, the 3'-terminal strand is cleaved just below the 3'-end of χ. (d) RecBCD now directs the binding of RecA to the 3'-terminal strand, as RecBCD endonuclease activity now acts more often on the 5'-terminal strand. (e) A nucleoprotein filament consisting of RecA-coated 3'-strand ssDNA is formed. This nucleoprotein filament is capable of homologous pairing with a dsDNA and strand invasion (see Figure 29.14). (Adapted from Figure 2 in Eggleston, A. K., and West, S. C., 1996. Exchanging partners: recombination in E. coli. Trends in Genetics 12:20-25; and Figure 3 in Eggleston, A. K., and West, S. C., 1997. Recombination initiation: Easy as A, B, C, D . . . χ? Current Biology 7:R745-R749)
The RecBCD Enzyme Complex
The proteins RecB
(140 kD; 1180 amino acids), RecC (130 kD; 1122 amino acids), and RecD
(67 kD; 608 amino acids) form a multifunctional enzyme complex having both helicase
and nuclease activity. The RecBCD complex initiates recombination by attaching
to the end of a DNA duplex (or at a double-stranded break in the DNA) and using
its ATP-dependent helicase function to unwind the dsDNA (Figure 29.12a). As
RecBCD progresses along unwinding the duplex, its nuclease activity cleaves
both of the newly formed single strands (although the strand that provided the
3'-end at the RecBCD entry site is cut more frequently than the 5'-terminal
strand [Figure 29.12b]).
Single-stranded DNA-binding protein (SSB) (and some RecA protein)
readily binds to the emerging single strands. Sooner or later, RecBCD encounters
a particular nucleotide sequence, a so-called Chi (or χ)
site, characterized by the sequence 5'-GCTGGTGG-3'. These χ
sites are recombinational "hot spots"; 1009 χ
sites have been identified in the E. coli genome (on average, about one
every 4.5 kb of DNA). When a χ sequence is encountered
by a RecBCD complex approaching its 3'-side (the ...G-3' side), RecBCD cleaves
the χ-bearing DNA strand four to six bases to the
3' side of χ (Figure 29.12c). Interaction of RecBCD
with the χ site causes the D subunit of RecBCD
to become irreversibly altered such that the RecBCD complex no longer expresses
nuclease activity against the 3'-terminal strand, but nuclease activity against
the 5'-terminal strand increases (Figure 29.12d).
Resuming its helicase function, RecBCD unwinds the dsDNA, and collectively these
processes generate a ssDNA tail bearing a χ site at its 3'-terminal end. This
ssDNA may reach several kilobases in length. At the same time, the χ-altered
RecBCD complex now promotes preferential binding of RecA protein, instead of
SSB, to the 3'-terminal strand to form a nucleoprotein filament (Figure
29.12e), active in pairing and strand invasion with a homologous region in another
dsDNA molecule.
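Because a Chi site is defined by a fixed octamer, locating candidate sites in a sequence is a simple string search. The sketch below scans both strands of a DNA string for 5'-GCTGGTGG-3'. The function names and the test sequence are made up for illustration, and the code makes no attempt to model RecBCD orientation or the cleavage position.

```python
CHI = "GCTGGTGG"  # 5'->3' Chi sequence as given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def find_chi_sites(seq):
    """Return sorted (position, strand) pairs for every Chi occurrence.

    Positions are 0-based indices on the strand supplied; '+' means Chi
    reads 5'->3' on that strand, '-' means it lies on the complement.
    """
    seq = seq.upper()
    hits = []
    for motif, strand in ((CHI, "+"), (reverse_complement(CHI), "-")):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)  # allow overlapping matches
    return sorted(hits)

# Hypothetical test sequence with one Chi site on each strand.
example = "AAGCTGGTGGTTTCCACCAGCAA"
print(find_chi_sites(example))
```

Dividing a genome's length by the number of hits gives the average spacing between sites, which is how a figure such as "one every 4.5 kb" is obtained.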
The RecA Protein

Figure 29.13 × The structure of RecA protein. (a) Ribbon diagram of the RecA monomer. Note the ADP bound at the site near helices C and D. (b) RecA filament. Four turns of a helical filament that has six RecA monomers per turn. A RecA monomer is highlighted in red. (Adapted from Figures 2 and 3 in Roca, A. I., and Cox, M. M., 1997. RecA protein: Structure, function, and role in recombinational DNA repair. Progress in Nucleic Acid Research and Molecular Biology 56:127-223. Photos courtesy of Michael M. Cox, University of Wisconsin)
The RecA protein, or recombinase, is a multifunctional 352-residue (38 kD) enzyme that acts in general recombination to catalyze the ATP-dependent DNA strand exchange reaction, leading to formation of a Holliday junction (Figure 29.11b-f). RecA protein (Figure 29.13a) crystallizes in the absence of DNA to form a helical filament having six monomers per turn (Figure 29.13b). This filament has a deep spiral groove large enough to accommodate three strands of DNA. RecA binds single-stranded DNA with a stoichiometry of one RecA per three nucleotides, and the resultant nucleoprotein filament has a helical pitch of 8.5 to 10 nm and about six RecA monomers per turn. The DNA in both RecA:ssDNA and RecA:dsDNA filaments is stretched to about 150% of its B-form length. The nucleoprotein filament formed by binding of RecA protein to the 3'-terminal ssDNA has affinity for other DNA molecules. In fact, binding of multiple DNA strands is the hallmark of RecA function.
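The filament numbers quoted above are mutually consistent, as a short calculation shows. The sketch uses the stoichiometry and pitch from the text and assumes the standard B-DNA axial rise of 0.34 nm per base pair, which is not stated in the text. The function names are illustrative.

```python
NT_PER_RECA = 3        # from the text: one RecA per three nucleotides
RECA_PER_TURN = 6      # from the text: about six monomers per turn
B_FORM_RISE_NM = 0.34  # assumption: standard axial rise per bp in B-DNA

def monomers_for(ssdna_nt):
    """RecA monomers needed to coat ssdna_nt nucleotides (rounded up)."""
    return -(-ssdna_nt // NT_PER_RECA)  # ceiling division

def rise_per_nt_nm(pitch_nm):
    """Axial rise per nucleotide for a filament of the given pitch."""
    return pitch_nm / (NT_PER_RECA * RECA_PER_TURN)

def extension_vs_b_form(pitch_nm):
    """Filament DNA length as a multiple of the B-form length."""
    return rise_per_nt_nm(pitch_nm) / B_FORM_RISE_NM

print(monomers_for(3000))                 # RecA needed for a 3-kb ssDNA tail
for pitch in (8.5, 10.0):
    print(pitch, round(extension_vs_b_form(pitch), 2))
```

Eighteen nucleotides per turn at a pitch of 8.5 to 10 nm corresponds to roughly 1.4 to 1.6 times the B-form length, bracketing the 150% figure quoted above.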
Figure 29.14 × Model for the strand exchange function of RecA, as based on the relative DNA affinities of the primary and secondary DNA-binding sites on RecA. (a) ssDNA is bound in the primary DNA-binding site of RecA. (b) dsDNA is bound weakly in the RecA secondary DNA-binding site, and RecA scans this dsDNA for homology. (c) Homology recognition leads to DNA strand exchange as the RecA-bound ssDNA forms a heteroduplex with a newly found complementary strand; this heteroduplex fills RecA's primary DNA-binding site. The strand displaced from the dsDNA now occupies the secondary DNA-binding site on RecA with higher affinity than dsDNA did. Subsequent base pairing between this strand and the 3' → 5' strand of the incoming DNA (greenish yellow) creates a Holliday junction. (Adapted from Figure 5 in Mazin, A. V., and Kowalczykowski, S. C., 1996. The specificity of the secondary DNA binding site of RecA protein defines its role in DNA strand exchange. Proceedings of the National Academy of Sciences, USA 93:10673-10678)
In recombination,
RecA uses its so-called high-affinity primary DNA-binding site to bind ssDNA
(Figure 29.14). This complex then interacts with other DNA molecules through a
secondary DNA-binding site within RecA. This secondary site has higher affinity
for ssDNA than dsDNA. The relative affinity of this secondary site suggests
a mechanism for RecA in DNA strand exchange during recombination: a RecA:ssDNA
nucleoprotein complex transiently binds dsDNA in its secondary site and scans
along the minor groove of the dsDNA, searching it for sequence homology with
its bound ssDNA. When homology is found, a hybrid DNA duplex is formed between
the ssDNA in the primary site and the complementary strand found in the scanned
dsDNA. Formation of this hybrid duplex displaces the strand of the scanned dsDNA
that is most like the ssDNA brought in by RecA. This process is the essence
of DNA strand exchange. The displaced DNA strand is bound with higher affinity
in the RecA secondary site than the dsDNA that was occupying this site. High-affinity
binding of ssDNA in the secondary site stabilizes
the strand-exchange complex and ensures proper heteroduplex formation by RecA
in homologously aligned DNA strands (Figure 29.14).
Figure 29.15 × Model for homologous recombination as promoted by RecA enzyme. (a) RecA protein (and SSB) aid strand invasion of the 3'-ssDNA into a homologous DNA duplex, (b) forming a D-loop. (c) The D-loop strand that has been displaced by strand invasion pairs with its complementary strand in the original duplex to form a Holliday junction as strand invasion continues.
Procession of base unpairing of dsDNA and re-pairing into hybrid strands
along the DNA duplex initiates branch migration (Figure 29.15b). Branch
migration drives the displacement of the homologous DNA strand from the DNA
duplex and its replacement with the ssDNA strand, a process known as single-strand
assimilation (or single-strand uptake). Strand assimilation does
not occur if there is no homology between the ssDNA and the invaded DNA duplex.
The DNA strand displaced by the invading 3'-terminal ssDNA is free to anneal
with the 5'-terminal strand in the original DNA, a step that is also mediated
by RecA protein and SSB (Figure 29.15c). The result is a Holliday junction,
the classic intermediate in genetic recombination. Proteins that assist RecA
in the formation of a Holliday junction include RecF, RecO, and RecR.
Figure 29.16 × Model for the resolution of a Holliday junction in E. coli by the RuvA, RuvB, and RuvC proteins. (a) Ribbon diagram of the RuvA tetramer. RuvA monomers have an overall L shape (one of them is outlined by the dashed white line); four of them form a tetramer with fourfold rotational symmetry, a structure reminiscent of a four-petaled flower. (b) Model for RuvA/RuvB action (first suggested by Parsons, C. A., et al., 1995. Structure of a multisubunit complex that promotes DNA branch migration. Nature 374:375-378.) (left): The RuvA tetramer fits snugly within the Holliday junction point. (center): Oppositely facing RuvB hexameric rings assemble on the heteroduplexes, with the DNA passing through their centers. These RuvB hexamers act as motors to promote branch migration by driving the passage of the DNA duplexes through themselves. (right): Binding of RuvC at the Holliday junction and strand scission by its nuclease activity. The locations of the RuvC active sites are indicated by the scissors. (c) Charge distribution on the concave surface of an RuvA tetramer. Blue indicates positive charge and red, negative charge. Note the overall positive charge on this surface of (RuvA)4, with the exception of the four red (negatively charged) pins at its center. (d) Structural model for the interaction of (RuvA)4 with the hypothesized square-planar Holliday junction center. (Adapted from Figures 1, 2, and 3 in Rafferty, J. B., et al., 1996. Crystal structure of DNA recombination protein RuvA and a model for its binding to the Holliday junction. Science 274:415- 421)




The Holliday junction is then processed into recombination products by RuvA,
RuvB, and RuvC. Specifically, RuvA (203 amino acids) and RuvB (336 amino acids)
work together as a Holliday junction-specific helicase complex that dissociates
the RecA filament and catalyzes branch migration. An RuvA tetramer (Figure 29.16a)
fits precisely within the junction point (Figure 29.16b), which has a square-planar
geometry, and this RuvA tetramer targets the assembly of RuvB around opposite
arms of the DNA junction. The RuvB protein binds to form two oppositely oriented,
hexameric [(RuvB)6] ring structures encircling the dsDNAs, one on
each side of the Holliday junction. Rotation of the dsDNAs by the RuvB hexameric
rings pulls the dsDNAs through (RuvB)6 and unwinds the DNA strands
across the "spool" of RuvA, which threads the separated single strands
into newly forming hybrid (recombinant) duplexes (Figure 29.16b). The RuvA tetramer
is a disklike structure, one face of which has an overall positive charge (Figure
29.16c), with the exception of four negatively charged central pins, each contributed
by an RuvA monomer. These four pins fit neatly into the hole at the center of
the Holliday junction. The negatively charged sugar-phosphate backbones of the
four DNA duplexes of the Holliday junction are threaded along grooves in the
positively charged RuvA face, with the negatively charged central pins appropriately
situated to transiently separate the dsDNA molecules into their component single
strands through repulsive electrostatic interactions with the phosphate backbones
of the DNA. The separated strands of each parental duplex are then channeled
into grooves in the RuvA face, where they are led into hydrogen-bonding interactions
with bases contributed by strands of the other parental DNA to form the two
daughter hybrid duplexes flowing out from the RuvAB complex (Figure 29.16b).
Figure 29.16d illustrates a model for the RuvA tetramer with the square-planar
Holliday junction.
Depending on how the strands in the Holliday junction are cleaved and
resolved, patch or splice recombinant duplexes result (Figure 29.11g and h).
RuvC (173 amino acids) is an endonuclease that resolves Holliday junctions into
heteroduplex recombinant products (RuvC resolvase). An RuvC dimer binds
at the Holliday junction and cuts pairs of DNA strands of similar polarity (Figure
29.16b); whether a patch or a splice recombinant results depends on which DNA
pair is cleaved.
Recombination is a fundamental process involved not only in generating
genetic diversity but also in DNA repair and chromosome segregation
during cell division. Hexameric ring helicases such as RuvB are DNA-driving
molecular motors; similar motors act during DNA replication to propel strand
separation and initiate DNA synthesis. Thus, the RuvABC system for processing
Holliday junctions may represent a general paradigm for DNA manipulation in
all cells.

Figure 29.17 × The archetypal
transposon has inverted nucleotide-sequence repeats at its termini, represented
here as the 12-bp sequence ACGTACGTACGT (a). It acts at a target sequence (shown
here as the sequence CATGC) within host DNA by creating a staggered cut (b)
whose protruding single-stranded ends are then ligated to the transposon (c).
The gaps at the target site are then filled in, and the filled-in strands are
ligated (d). Transposon insertion thus generates direct repeats of the target
site in the host DNA, and these direct repeats flank the inserted transposon.
In 1950, Barbara McClintock
reported the results of her studies on an activator gene in maize (Zea
mays or, as it's usually called, corn) that was recognizable principally
by its ability to cause mutations in a second gene. Activator genes were thus
an internal source of mutation. A most puzzling property was their ability to
move relatively freely about the genome. As we have seen, scientists had labored
to establish that chromosomes consisted of genes arrayed in a fixed order, so
most geneticists viewed as incredible this idea of genes moving around. The
recognition that McClintock so richly deserved for her explanation of this novel
phenomenon had to await verification by molecular biologists. In 1983, Barbara
McClintock was finally awarded the Nobel Prize in physiology or medicine. By
this time, it was appreciated that many organisms, from bacteria to humans,
possessed similar "jumping genes" able to move from one site
to another in the genome. This mobility led to their designation as mobile
elements, transposable elements, or, simply, transposons.
Transposons are segments of DNA that are moved enzymatically from place
to place in the genome (Figure 29.17). That is, their location within the DNA
is unstable. Transposons range in size from several hundred bp to more than
8 kbp. Transposons contain a gene encoding an enzyme necessary for insertion
into a chromosome and for the remobilization of the transposon to different
locations. These movements are termed transposition events. The smallest
transposons are called insertion sequences, or ISs, signifying
their ability to insert apparently at random in the genome. Insertion into a
new site causes mutation because it disrupts the DNA sequence at that site.
Insertion occurs at sites that show little homology to the insertion sequence
or transposon. Although certain transposons (such as E. coli transposon
Tn7) may undergo transposition once per cell generation, most transposition
events are infrequent, taking place only once every 10^4 to 10^7
generations. Larger and more complex transposons also carry genes that are not
involved in the enzymology of insertion and excision of the transposon, such
as genes conferring resistance to antibiotics. Episomes, plasmids that can reversibly
integrate into bacterial genomes, contain transposons.
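The target-site duplication described in Figure 29.17 can be illustrated with a short sketch. This is a simplified, single-strand model, not a description of the enzymology: the host sequence and the 4-nt transposon interior (`TTTT`) are hypothetical, and only the terminal repeat `ACGTACGTACGT` and the target `CATGC` come from the figure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def insert_transposon(host: str, target: str, transposon: str) -> str:
    """Model insertion at the first occurrence of `target` (top strand only).

    The staggered cut leaves the target single-stranded on both sides of the
    transposon; gap filling then copies it, so the target ends up duplicated.
    """
    i = host.find(target)
    if i < 0:
        raise ValueError("target site not found in host DNA")
    left = host[: i + len(target)]   # left flank, ending with the target
    right = host[i:]                 # right flank, starting with the target
    return left + transposon + right

TERMINAL_REPEAT = "ACGTACGTACGT"             # from Figure 29.17
transposon = TERMINAL_REPEAT + "TTTT" + reverse_complement(TERMINAL_REPEAT)
host = "GGTTCATGCAAGG"                       # hypothetical host DNA

product = insert_transposon(host, "CATGC", transposon)
print(product)
print("copies of target site:", product.count("CATGC"))
```

Because the figure's terminal sequence is its own reverse complement, the inverted repeat reads identically at both ends of the top strand; the flanking `CATGC` copies are the direct repeats.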
29.4 × The Immunoglobulin Genes: Generating Protein Diversity Using Genetic Recombination
The immunoglobulin genes are a highly evolved system for maximizing protein diversity from a finite amount of genetic information. This diversity is essential for gaining immunity to the great variety of infectious organisms and foreign substances that cause disease.
The Immune Response
Only vertebrates show an immune response. If a foreign substance, called an antigen, gains entry to the bloodstream of a vertebrate, the animal responds via a protective system called the immune response. The immune response involves production of proteins capable of recognizing and destroying the antigen. This response is mounted by certain white blood cells: the B and T cell lymphocytes and the macrophages. B cells are so named because they mature in the bone marrow; T cells mature in the thymus gland. Each of these cell types is capable of gene rearrangement as a mechanism for producing proteins essential to the immune response. Antibodies, which can recognize and bind antigens, are immunoglobulin proteins secreted from B cells. Because antigens can be almost anything, the immune response must have an incredible repertoire of structural recognition. Thus, vertebrates must have the potential to produce immunoglobulins of great diversity in order to recognize virtually any antigen.
The Immunoglobulin G Molecule
Figure 29.18 × Diagram
of the organization of the IgG molecule. Two identical L chains are joined with
two identical H chains. Each L chain is held to an H chain via an interchain
disulfide bond. The variable regions of the four polypeptides lie at the ends
of the arms of the Y-shaped molecule. These regions are responsible for the
antigen recognition function of the antibody molecules. The actual antigen-binding
site is constituted from hypervariable residues within the VL and
VH regions. For purposes of illustration, some features are shown
on only one or the other L chain or H chain, but all features are common to
both chains.
Immunoglobulin G (IgG or γ-globulin) is the
major class of antibody molecules found circulating in the bloodstream. IgG
is a very abundant protein, amounting to 12 mg per mL of serum. It is a 150-kD
α2β2-type tetramer. The α or H (for heavy)
chain is 50 kD; the β or L (for light)
chain is 25 kD. A preparation of IgG from serum is heterogeneous in terms of
the amino acid sequences represented in its L and H chains. However, the IgG
L and H chains produced from any given B lymphocyte are homogeneous in amino
acid sequence. L chains consist of 214 amino acid residues and are organized
into two roughly equal segments, the VL and CL regions.
The VL designation reflects the fact that L chains isolated from
serum IgG show variations in amino acid sequence over the first 108 residues,
VL symbolizing this "variable" region of the L polypeptide.
The amino acid sequence for residues 109 to 214 of the L polypeptide is constant,
as represented by its designation as the "constant light," or CL,
region. The heavy, or H, chains consist of 446 amino acid residues. Like L chains,
the amino acid sequence for the first 108 residues of H polypeptides is variable,
ergo its designation as the VH region, while residues 109 to 446
are constant in amino acid sequence. This "constant heavy" region
consists of three quite equivalent domains of homology designated CH1,
CH2, and CH3. Each L chain has two intrachain disulfide
bonds, one in the VL region and the other in the CL region.
The C-terminal amino acid in L chains is cysteine, and it forms an interchain
disulfide bond to a neighboring H chain. Each H chain has four intrachain disulfide
bonds, one in each of the four regions. Figure 29.18 presents a diagram of IgG
organization. Within the variable regions of the L and H chains, certain positions
are hypervariable with regard to amino acid composition. These hypervariable
residues occur at positions 24 to 34, 50 to 55, and 89 to 96 in the L chains
and at positions 31 to 35, 50 to 65, 81 to 85, and 91 to 102 in the H chains.
The hypervariable regions are also called complementarity-determining regions,
or CDRs, because it is these regions that form the structural site that is complementary
to some part of an antigen's structure, providing the basis for antibody:antigen
recognition.
Figure 29.19 × The characteristic "collapsed β-barrel domain" known as the immunoglobulin fold. The β-barrel structures for both (a) variable and (b) constant regions are shown. (c) A schematic diagram of the 12 collapsed β-barrel domains that make up an IgG molecule. CHO indicates the carbohydrate addition site; Fab denotes one of the two antigen-binding fragments of IgG, and Fc, the proteolytic fragment consisting of the pairs of CH2 and CH3 domains.
In the immunoglobulin
genes, the arrangement of exons correlates with protein structure. In terms
of its tertiary structure, the immunoglobulin G molecule is composed of 12 discrete
collapsed β-barrel domains, each domain
having a Greek key motif (see Figure
6.32). The characteristic structure of this domain is referred to as the
immunoglobulin fold (Figure 29.19). Each of IgG's two heavy chains contributes
four of these domains and each of its light chains contributes two. The four
variable-region domains (one on each chain) are encoded by multiple exons,
but the eight constant-region domains are each the product of a single exon.
All of these constant-region exons are derived from a single ancestral
exon encoding an immunoglobulin fold. The major variable-region exon probably
derives from this ancestral exon also. Contemporary immunoglobulin genes are
a consequence of multiple duplications of the ancestral exon.
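As a quick consistency check on the figures quoted above, the chain masses and domain counts can be tallied for the two-heavy, two-light tetramer. All numbers are taken from the text; the variable names are illustrative only.

```python
# Chain masses (kD) and immunoglobulin-fold domains per chain, from the text
H_CHAIN_KD, L_CHAIN_KD = 50, 25
H_CHAIN_DOMAINS, L_CHAIN_DOMAINS = 4, 2
CHAINS_OF_EACH_KIND = 2          # IgG has two H chains and two L chains

igg_mass_kd = CHAINS_OF_EACH_KIND * (H_CHAIN_KD + L_CHAIN_KD)
total_domains = CHAINS_OF_EACH_KIND * (H_CHAIN_DOMAINS + L_CHAIN_DOMAINS)
variable_domains = 2 * CHAINS_OF_EACH_KIND      # one V domain on each chain
constant_domains = total_domains - variable_domains

print(f"IgG mass: {igg_mass_kd} kD")
print(f"domains: {total_domains} total, "
      f"{variable_domains} variable, {constant_domains} constant")
```

The totals agree with the 150-kD tetramer and the 12 domains (four variable, eight constant) described in the text.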
The discovery of variability in amino acid sequence in otherwise identical
polypeptide chains was surprising and almost heretical to protein chemists.
For geneticists, it presented a genuine enigma. They noted that mammals, which
can make millions of different antibodies, don't have millions of different
antibody genes. How can the mammalian genome encode the diversity seen in L
and H chains?
Figure 29.20 × The organization of mouse immunoglobulin gene segments. The organization in germline cells is shown on the left, and the rearranged organization characteristic of mature B lymphocytes is shown to the right of the arrows. The rearranged states shown are but single examples of the many possibilities for each gene family. (Adapted from Tonegawa, S., 1983. Somatic generation of antibody diversity. Nature 302:575.)
The organization of various immunoglobulin gene segments in the mouse genome is shown in Figure 29.20. L-chain variable-region genes are assembled from two kinds of germline genes, VL and JL (J stands for joining). In mammals, there are two different families of L-chain genes, the κ, or kappa, gene family and the λ, or lambda, gene family; each family has V and J members. These families are on different chromosomes. In mice, 90% of the L chains are κ chains; λ L chains are a minor component. Mice have four functional Jκ genes (and a fifth nonfunctional one); these J genes lie 2.5 to 4 kb upstream from the single Cκ gene that encodes the L-chain constant region. There are at least 200 Vκ genes, each with its own Lκ segment for encoding the L-chain leader peptide that targets the L chain to the endoplasmic reticulum for IgG assembly and secretion. (This leader peptide is cleaved once the L chain reaches the ER lumen.) The λ family of L-chain genes is organized a little differently, with only two Vλ genes, each of which is followed downstream by a pair of Jλ-Cλ units (Figure 29.20). In different mature B lymphocyte cells, Vκ and Jκ genes have joined in different combinations and, along with the Cκ gene, form complete Lκ chains with a variety of Vκ regions. However, any given B lymphocyte expresses only one Vκ-Jκ combination. Construction of the mature B lymphocyte L-chain gene has occurred by DNA rearrangements that combine three genes (L-Vκ,λ, Jκ,λ, Cκ,λ) to make one polypeptide!
DNA Rearrangements Assemble an H-Chain Gene by Combining Four Separate Genes
The first 98 amino acids of the 108-residue, H-chain variable region are encoded by a VH gene. Each VH gene has an accompanying LH gene that encodes its essential leader peptide. It is estimated that there are from 200 to 1000 VH genes and that they can be subdivided into eight distinct families based on nucleotide sequence homology. The members of a particular VH family are grouped together on the chromosome, separated from one another by 10 to 20 kbp. In assembling a mature H-chain gene, a VH gene is joined to a D gene (D for diversity), which encodes amino acids 99 to 113 of the H chain. These amino acids comprise the core of the third CDR in the variable region of H chains. The VH-D gene assemblage is linked in turn to a JH gene, which encodes the remaining part of the variable region of the H chain. The VH, D, and JH genes are grouped in three separate clusters on the same chromosome. The four JH genes lie 7 kb upstream of the eight C genes, the closest of which is Cμ. Any of four C genes may encode the constant region of IgG H chains: Cγ1, Cγ2a, Cγ2b, and Cγ3. Each C gene is composed of multiple exons (as shown in Figure 29.20 for Cμ, but not the other C genes). Ten to twenty D genes are found 1 to 80 kb farther upstream. The VH genes lie even farther upstream. In B lymphocytes, the variable region of a heavy-chain gene is composed of one each of the LH-VH genes, a D gene, and a JH gene joined head to tail. Because the H-chain variable region is encoded in three genes and the joinings can occur in various combinations, the heavy chains have a greater potential for diversity than the light-chain variable regions that are assembled from just two genes (for example, Lκ-Vκ and Jκ). In making heavy-chain genes, four genes have been brought together and reorganized by DNA rearrangement to produce a single polypeptide!
Figure 29.21 × Consensus
elements are located above and below germline variable-region genes that recombine
to form genes encoding immunoglobulin chains. These consensus elements are complementary
and are arranged in a heptamer-nonamer, 12-bp to 23-bp spacer pattern. (Adapted
from Tonegawa, S., 1983. Somatic generation of antibody diversity. Nature
302:575.)
The Mechanism of V-J and V-D-J Joining in Light- and Heavy-Chain Gene Assembly
Specific nucleotide sequences adjacent to the various variable-region genes suggest a mechanism in which these sequences act as joining signals. All germline V and D genes are followed by a consensus CACAGTG heptamer separated from a consensus ACAAAAACC nonamer by a short, nonconserved 23-bp spacer. Likewise, all germline D and J genes are immediately preceded by a consensus GGTTTTTGT nonamer separated from a consensus CACTGTG heptamer by a short, nonconserved 12-bp spacer (Figure 29.21). Note that the consensus elements downstream of a gene are complementary to those upstream from the gene with which it recombines. Indeed, it is these complementary consensus sequences that serve as recombination signal sequences (RSSs) and determine the site of recombination between variable-region genes. Functionally meaningful recombination happens only where one RSS has a 12-bp spacer and the other has a 23-bp spacer (Figure 29.21). Lymphoid cell-specific recombination-activating gene proteins 1 and 2 (RAG1 and RAG2) recognize and bind at these RSSs, presumably through looping out of the 12- and 23-bp spacers and alignment of the homologous heptamer and nonamer regions (Figure 29.22). RAG1 and RAG2 together function as the V(D)J recombinase. The similarity between the organization of flanking repeats in immunoglobulin genes and the reaction catalyzed by RAG1/RAG2 proteins suggests that these genes and the RAG recombinase may have evolved from an ancestral transposon.
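Two claims in the paragraph above lend themselves to a direct check: that the downstream and upstream consensus elements are complementary, and that joining requires one 12-bp and one 23-bp spacer. The following is a minimal sketch; the function names are illustrative, and "complementary" is taken to mean reverse-complementary when both elements are read 5' to 3'.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Consensus elements quoted in the text
DOWNSTREAM_RSS = {"heptamer": "CACAGTG", "nonamer": "ACAAAAACC"}
UPSTREAM_RSS = {"heptamer": "CACTGTG", "nonamer": "GGTTTTTGT"}

def elements_are_complementary() -> bool:
    """Check that each downstream element is the reverse complement of its
    upstream counterpart."""
    return all(
        reverse_complement(DOWNSTREAM_RSS[name]) == UPSTREAM_RSS[name]
        for name in ("heptamer", "nonamer")
    )

def spacers_permit_joining(spacer_a_bp: int, spacer_b_bp: int) -> bool:
    """Spacer rule from the text: one spacer of 12 bp and one of 23 bp."""
    return {spacer_a_bp, spacer_b_bp} == {12, 23}

print(elements_are_complementary())       # consensus elements pair up
print(spacers_permit_joining(23, 12))     # permitted pairing
print(spacers_permit_joining(12, 12))     # two 12-bp spacers: not permitted
```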
Figure 29.22 × Model for V(D)J recombination. A RAG1:RAG2 complex is assembled on DNA in the region of recombination signal sequences (a), and this complex introduces double-stranded breaks in the DNA at the borders of protein-coding sequences and the recombination signal sequences (b). The products of RAG1:RAG2 DNA cleavage are novel: the DNA bearing the recombination signal sequences has blunt ends, whereas the coding DNA has hairpin ends. That is, the two strands of the V and J coding DNA segments are covalently joined as a result of transesterification reactions catalyzed by RAG1:RAG2. To complete the recombination process, the two RSS ends are precisely joined to make a covalently closed circular dsDNA, but the V and J coding ends undergo further processing before they are joined (c). Coding-end processing involves opening of the V and J hairpins and the addition or removal of nucleotides from the strands. This processing means that joining of the V and J coding ends is imprecise, providing an additional means for introducing antibody diversity. Finally, the V and J coding segments are joined to create a recombinant immunoglobulin-encoding gene (d). The processing and joining reactions require RAG1:RAG2, DNA-dependent protein kinase (DNA-PK, which consists of three subunits: Ku70, Ku80, and DNA-PKcs), and DNA ligase. (Adapted from Figure 1 in Weaver, D. T., and Alt, F. W., 1997. From RAGs to stitches. Nature 388:428-429.)
Imprecise Joining
Figure 29.23 × Recombination between the Vκ and Jκ genes can vary by several nucleotides, giving rise to variations in amino acid sequence and hence diversity in immunoglobulin L chains.
Joining of the ends of the immunoglobulin-coding regions during gene reorganization is somewhat imprecise. This imprecision actually leads to even greater antibody diversity because new coding arrangements result. Position 96 in κ chains is typically encoded by the first triplet in the Jκ element. Most κ chains have one of four amino acids here, depending on which Jκ gene was recruited in gene assembly. However, occasionally only the second and third bases or just the third base of the codon for position 96 is contributed by the Jκ gene, with the other one or two nucleotides supplied by the Vκ segment (Figure 29.23). So, the precise point where recombination occurs during gene reorganization can vary over several nucleotides, creating even more diversity.
Antibody Diversity
Taking as an example the mouse with perhaps 300 Vκ genes, 4 Jκ genes, 200 VH genes, 12 D genes, and 4 JH genes, the number of possible combinations is given by 300 x 4 x 200 x 12 x 4. Thus, greater than 10^7 different antibody molecules can be created from roughly 500 or so different mouse variable-region genes. Including the possibility for Vκ-Jκ joinings occurring within codons adds to this diversity, as does the high rate of somatic mutation associated with the variable-region genes. (Somatic mutations are mutations that arise in diploid cells and are transmitted to the progeny of these cells within the organism, but not to the offspring of the organism.) Clearly, gene rearrangement is a powerful mechanism for dramatically enhancing the protein-coding potential of genetic information.
29.5 × The Molecular Nature of Mutation
Genes are normally transmitted unchanged from generation to generation, owing to the great precision and fidelity with which genes are copied during chromosome duplication. However, on rare occasions, genetically heritable changes (mutations) occur that result in altered forms.
Most mutated genes function less effectively than the unaltered, wild-type allele, but occasionally mutations arise that give the organism a selective advantage. When this occurs, they are propagated to many offspring. Together with recombination, mutation provides for genetic variability within species and, ultimately, the evolution of new species.
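Looking back at the combinatorial estimate given under Antibody Diversity, the arithmetic can be reproduced in a few lines. The gene-segment counts are the approximate mouse values quoted in the text; the dictionary keys are illustrative labels, and the calculation deliberately ignores junctional imprecision and somatic mutation.

```python
# Approximate mouse gene-segment counts quoted in the text
SEGMENTS = {"V_kappa": 300, "J_kappa": 4, "V_H": 200, "D": 12, "J_H": 4}

# Light chain: one V joined to one J; heavy chain: one V, one D, one J
light_chain_combinations = SEGMENTS["V_kappa"] * SEGMENTS["J_kappa"]
heavy_chain_combinations = SEGMENTS["V_H"] * SEGMENTS["D"] * SEGMENTS["J_H"]

# Any light chain can in principle pair with any heavy chain
antibody_combinations = light_chain_combinations * heavy_chain_combinations
germline_segments = sum(SEGMENTS.values())

print(f"light-chain combinations: {light_chain_combinations:,}")
print(f"heavy-chain combinations: {heavy_chain_combinations:,}")
print(f"antibody combinations:    {antibody_combinations:,}")
print(f"germline segments used:   {germline_segments}")
```

The product, 11,520,000, is just over 10^7, obtained from only 520 germline segments.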
Figure 29.24 × Point
mutations due to base mispairings. (a) An example based on tautomeric properties.
The rare imino tautomer of adenine base-pairs with cytosine rather than thymine.
(1) The normal A-T base pair. (2) The A*-C base pair is possible for the adenine
tautomer in which a proton has been transferred from the 6-NH2 of
adenine to N-1. (3) Pairing of C with the imino tautomer of A (A*) leads to
a transition mutation (A-T to G-C) appearing in the next generation. (b) A in
the syn conformation pairing with G (G is in the usual anti conformation). (c)
T and C form a base pair by H-bonding interactions mediated by a water molecule.
Point mutations are the
class of mutations in which one base pair is substituted for another. The two
possible kinds of point mutations are transitions, where one purine (or
pyrimidine) is replaced by another, as in A → G (or T → C), and transversions, where
a purine is substituted for a pyrimidine or vice versa.
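The distinction between the two kinds of point mutation reduces to asking whether the original and replacement bases belong to the same chemical class. A minimal sketch follows; the function name is illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_point_mutation(original: str, replacement: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    bases = {original, replacement}
    if not bases <= (PURINES | PYRIMIDINES):
        raise ValueError("bases must be one of A, C, G, T")
    if original == replacement:
        raise ValueError("no substitution: the bases are identical")
    # Same class (purine-purine or pyrimidine-pyrimidine) means transition
    if bases <= PURINES or bases <= PYRIMIDINES:
        return "transition"
    return "transversion"

for change in [("A", "G"), ("T", "C"), ("A", "T"), ("G", "C")]:
    print(f"{change[0]} -> {change[1]}: {classify_point_mutation(*change)}")
```

Of the twelve possible single-base substitutions, four are transitions and eight are transversions.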
Point mutations arise
by the pairing of bases with inappropriate partners, by the introduction of
base analogs into DNA, or by chemical mutagens. Bases may rarely mispair (Figure
29.24), either because of their tautomeric properties (see Chapter
11), or because of other influences (such as purines flipping from anti
to syn conformations, or H2O molecules serving as bridging H-bond
donor/acceptors between two mispaired pyrimidines). Even in mispairing, the
C1'-C1' distances between bases must still be close to
that of a Watson-Crick base pair (1.1 nm or so; see
Figure 11.20) to maintain the mismatched
base pair in the double helix. In tautomerization, for example, an amino group
(-NH2), usually an H-bond donor, can tautomerize to an imino form
(=NH) and become an H-bond acceptor. Or a keto group (C=O), normally an H-bond
acceptor, can tautomerize to an enol C-OH, an H-bond donor. Proofreading mechanisms
operating during DNA replication catch most mispairings. The frequency of spontaneous
mutation in both E. coli and fruit flies (Drosophila melanogaster)
is about 10^-10 per base pair per replication.
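To put the spontaneous mutation frequency in perspective, it can be scaled to a whole genome. The rate is the value quoted above; the genome size of about 4.6 x 10^6 bp for E. coli is an assumption brought in for illustration and does not appear in this section.

```python
import math

MUTATION_RATE = 1e-10        # per base pair per replication (from the text)
GENOME_SIZE_BP = 4.6e6       # assumed E. coli genome size; not from the text

def expected_mutations(rate_per_bp: float, genome_bp: float) -> float:
    """Expected number of new mutations in one round of genome replication."""
    return rate_per_bp * genome_bp

per_replication = expected_mutations(MUTATION_RATE, GENOME_SIZE_BP)
replications_per_mutation = 1 / per_replication

print(f"{per_replication:.1e} expected mutations per genome replication")
print(f"about one mutation per {replications_per_mutation:.0f} replications")
```

Under these assumptions, a new mutation arises only about once in every two thousand or so genome replications.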
Figure 29.25 × 5-Bromouracil usually favors the keto tautomer that mimics the base-pairing properties of thymine, but it frequently shifts to the enol form, whereupon it can base-pair with guanine, causing a T-A to C-G transition.
Mutations Induced by Base Analogs
Base analogs that become incorporated into DNA can induce mutations through changes in base-pairing possibilities. Two examples are 5-bromouracil (5-BU) and 2-aminopurine (2-AP). 5-Bromouracil is a thymine analog and becomes inserted into DNA at sites normally occupied by T; its 5-Br group sterically resembles thymine's 5-methyl group. However, because 5-BU frequently assumes the enol tautomeric form and pairs with G instead of A, a point mutation of the transition type may be induced (Figure 29.25). Less often, 5-BU is inserted into DNA at cytosine sites, not T sites. Then, if it base-pairs in its keto form, mimicking T, a C-G to T-A transition ensues. The adenine analog, 2-aminopurine (recall adenine is 6-aminopurine), normally behaves like A and base-pairs with T. However, 2-AP can form a single H bond of sufficient stability with cytosine (Figure 29.26) that occasionally C replaces T in DNA replicating in the presence of 2-AP. Hypoxanthine (Figure 29.27) is an adenine analog that arises in situ in DNA through oxidative deamination of A. Hypoxanthine base-pairs with cytosine, creating an A-T to G-C transition.
Figure 29.26 × (a) 2-Aminopurine normally base-pairs with T, but (b) may also pair with cytosine through a single hydrogen bond.
Figure 29.27 × Oxidative deamination of adenine in DNA yields hypoxanthine, which base-pairs with cytosine, resulting in an A-T to G-C transition.
Chemical Mutagens
Figure 29.28 × Chemical mutagens. (a) HNO2 (nitrous acid) converts cytosine to uracil and adenine to hypoxanthine. (b) Nitrosoamines, organic compounds that react to form nitrous acid, also lead to the oxidative deamination of A and C. (c) Hydroxylamine (NH2OH) reacts with cytosine, converting it to a derivative that base-pairs with adenine instead of guanine. The result is a C-G to T-A transition. (d) Alkylation of G residues to give O6-methylguanine, which base-pairs with T. (e) Alkylating agents include nitrosoamines, nitrosoguanidines, nitrosoureas, alkyl sulfates, and nitrogen mustards. Note that nitrosoamines are mutagenic in two ways: they can react to yield HNO2 or they can act as alkylating agents. The nitrosoguanidine, N-methyl-N'-nitro-N-nitrosoguanidine, is a very potent mutagen used in laboratories to induce mutations in experimental organisms such as Drosophila melanogaster. Ethyl methanesulfonate (EMS) and dimethyl sulfate are also favorite mutagens among geneticists.
Chemical mutagens are agents that chemically modify bases so that their base-pairing characteristics are altered. For instance, nitrous acid (HNO2) causes the oxidative deamination of primary amine groups, found in adenine and cytosine. Oxidative deamination of cytosine yields uracil, which base-pairs the way T does and gives a C-G to T-A transition (Figure 29.28a). Hydroxylamine specifically causes C-G to T-A transitions because it reacts specifically with cytosine, converting it to a derivative that base-pairs with adenine instead of guanine (Figure 29.28c). Alkylating agents are also chemical mutagens. Alkylation of reactive sites on the bases with methyl or ethyl groups alters their H-bonding and hence base pairing. For example, methylation of O6 on guanine (giving O6-methylguanine) causes this G to mispair with thymine, resulting in a G-C to A-T transition (Figure 29.28d). Alkylating agents can also induce point mutations of the transversion type. Alkylation of N7 of guanine labilizes its N-glycosidic bond, which leads to elimination of the purine ring, creating a gap in the base sequence. An enzyme, apurinic acid endonuclease, then cleaves the sugar-phosphate backbone of the DNA on the 5'-side, and the gap can be repaired by enzymatic removal of the 5'-sugar phosphate and insertion of a new nucleotide. A transversion results if a pyrimidine nucleotide is inserted in place of the purine during enzymatic repair of this gap. A number of alkylating agents are shown in Figure 29.28e.
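The transitions named above all follow the same logic: a modified base pairs with a non-standard partner in the first round of replication, and that partner then pairs normally in the second round, fixing the change. A minimal sketch of this bookkeeping is given below. It is a simplification that tracks one base pair only, and the labels `Hx` and `O6-meG` are shorthand introduced here for hypoxanthine and O6-methylguanine.

```python
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Modified base -> (base it was derived from, base it pairs with), per the text
MODIFIED_BASES = {
    "U": ("C", "A"),        # uracil from deamination of C pairs the way T does
    "Hx": ("A", "C"),       # hypoxanthine from deamination of A pairs with C
    "O6-meG": ("G", "T"),   # O6-methylguanine mispairs with T
}

def resulting_transition(modified_base: str) -> str:
    """Follow one modified base through two rounds of replication."""
    original, mispartner = MODIFIED_BASES[modified_base]
    # Round 1: the mispartner is inserted opposite the modified base.
    # Round 2: the mispartner templates its normal Watson-Crick partner,
    # which now occupies the position of the original base.
    new_base = WATSON_CRICK[mispartner]
    return f"{original}-{WATSON_CRICK[original]} to {new_base}-{mispartner}"

for base in MODIFIED_BASES:
    print(f"{base}: {resulting_transition(base)}")
```

Each result matches the transition stated in the text for the corresponding modification.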
Insertions and Deletions
The addition or removal of one or more base pairs leads to insertion or deletion mutations, respectively. Either shifts the triplet reading frame of codons, causing frameshift mutations (misincorporation of all subsequent amino acids) in the protein encoded by the gene. Such mutations can arise when flat aromatic molecules such as acridine orange (see Figure 12.16) insert themselves between successive bases in one or both strands of the double helix. This insertion or, more aptly, intercalation, doubles the distance between the bases as measured along the helix axis. This distortion of the DNA (see Figure 12.16) results in bases being inappropriately inserted or deleted when the DNA is replicated. Disruptions that arise from the insertion of a transposon within a gene also fall in this category of mutation.
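The effect of a single inserted base on the reading frame can be seen by splitting a sequence into triplets before and after the insertion. The 15-nt sequence below is hypothetical and chosen only for illustration.

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into complete triplets, reading from position 0."""
    usable = len(seq) - len(seq) % 3      # drop any incomplete final triplet
    return [seq[i:i + 3] for i in range(0, usable, 3)]

wild_type = "ATGGCTGAAACCTGA"                  # hypothetical coding sequence
mutant = wild_type[:4] + "T" + wild_type[4:]   # one base inserted after position 4

print("wild type:", " ".join(codons(wild_type)))
print("mutant:   ", " ".join(codons(mutant)))
```

Every triplet downstream of the insertion is altered, and in this particular example the shifted frame brings a TGA stop codon into the third position, so the mutant product would also be truncated.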
Prions: Proteins as Genetic Agents?
Prion is an acronym derived from the words "protein infectious particle." Prions are transmissible agents ("genetic material"?) that are apparently composed only of a protein that has adopted an abnormal conformation. The term prion was coined to distinguish such protein infectious particles capable of causing disease from nucleic acid-containing infectious agents. PrP, the prion protein, comes in various forms, such as PrPc, the normal cellular prion protein, and PrPsc, the scrapie form of PrP, a conformational variant of PrPc that is protease-resistant. These two forms are thought to differ only in terms of their secondary structure, with PrPc dominated by α-helical elements (figure, a), and PrPsc having both α-helices and β-strands (figure, b). It has been hypothesized that the presence of PrPsc can cause PrPc to adopt the PrPsc conformation. The various diseases are a consequence of the accumulation of the abnormal PrPsc form, which accumulates as amyloid plaques (amyloid = starch-like), causing vacuolarization of tissues in the central nervous system. The 1997 Nobel Prize in physiology or medicine was awarded to Stanley B. Prusiner for his discovery of prions.
(Adapted from Figure 1 in Prusiner, S. B., 1996. Molecular biology and the pathogenesis of prion diseases. Trends in Biochemical Sciences 21:482-487.)
29.6 × RNA as Genetic Material
Whereas the genetic material of cells is double-stranded DNA, virtually all plant viruses, several bacteriophages, and many animal viruses have genomes consisting of RNA. In most cases, this RNA is single-stranded. Viruses with single-stranded genomes use the single strand as a template for synthesis of a complementary strand, which can then serve as template in replicating the original strand. Retroviruses are an interesting group of eukaryotic viruses having single-stranded RNA genomes that replicate through a double-stranded DNA intermediate. Further, the life cycle of retroviruses includes an obligatory step in which the dsDNA is inserted into the host cell genome in a transposition event. Retroviruses are responsible for many diseases, including tumors and other disorders. HIV-1, the human immunodeficiency virus that causes AIDS, is a retrovirus.
Tobacco mosaic virus (TMV), an RNA virus infecting plants, was instrumental in establishing that nucleic acids are the substance of heredity. TMV has a molecular mass of 40 x 10^3 kD and consists of an RNA genome (3 x 10^3 kD) packaged in a protein coat made of 2130 identical protein chains of 18 kD each (see Figure 1.24). In 1956, Gierer and Schramm demonstrated that the RNA itself was able to produce viral lesions on the surfaces of tobacco leaves, if the leaf surface was lightly scratched so the RNA could gain access to the cells. In 1957, Fraenkel-Conrat and Singer used two different strains of the virus, HR and TMV, and reconstructed virus particles in vitro by mixing isolated proteins and RNAs in the four possible combinations:
TMV protein + HR RNA
TMV protein + TMV RNA
HR protein + HR RNA
HR protein + TMV RNA
These reconstituted virus particles were infective, and, when the virus progeny obtained after their infection of host plants were examined, it was found that the protein coat borne by the progeny virus particles was determined by the source of RNA in the virus infecting the plant: TMV RNA always yielded TMV protein coats in the progeny; HR RNA yielded HR protein coats. This experiment was early proof that nucleic acids, not proteins, are the repository of genetic information.
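The logic of the reconstitution experiment can be laid out as a small table of inputs and outcomes. This sketch simply encodes the reported result (the progeny coat follows the RNA source) to make the four combinations explicit; it is not a model of the underlying biology, and the function name is illustrative.

```python
from itertools import product

STRAINS = ["TMV", "HR"]

def progeny_coat(protein_source: str, rna_source: str) -> str:
    """Reported outcome: the progeny coat protein is that of the RNA's strain,
    regardless of which strain supplied the coat of the infecting particle."""
    return rna_source

outcomes = {
    (protein, rna): progeny_coat(protein, rna)
    for protein, rna in product(STRAINS, repeat=2)
}

for (protein, rna), coat in outcomes.items():
    print(f"{protein} protein + {rna} RNA -> progeny with {coat} coat")
```

The two mixed particles are the informative cases: their progeny never carry the coat protein that the infecting particle itself wore.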

Figure 29.29 × Transfection can
introduce new genes into animals. The rat growth hormone gene carried on a plasmid
is injected into a mouse oocyte or fertilized egg that is then implanted in
a receptive female mouse. Integration of the plasmid into the mouse genome can
be ascertained by Southern analysis of DNA from the newborn mouse. Expression
of the foreign gene can be determined by assaying for the gene product, in this
case, rat growth hormone.
An exciting new advance
in gene transfer techniques is the ability to introduce genes into animals by
transfection. Transfection is defined as the uptake or injection
of plasmid DNA into recipient cells. Animals that have acquired new genetic
information as a consequence of the introduction of foreign genes are termed
transgenic. The methodology involves the injection of plasmids carrying
the gene of interest into the nucleus of an oocyte or fertilized egg, followed
by implantation of the egg into a receptive female. The technique has been perfected
for mice (Figure 29.29). In a small number of cases, 10% or so, the mice that
develop from the injected eggs carry the transfected gene integrated into a
single chromosomal site. The gene is subsequently inherited by the progeny of
the transfected animal as if it were a normal gene. Expression of the donor
gene in the transgenic animals is variable because the gene is randomly integrated
into the host genome and gene expression is often influenced by chromosomal
location. Nevertheless, transfection of animals has produced some startling
results, as in the case of the transfection of mice with the gene encoding the
rat growth hormone (rGH). The transgenic mice grew to nearly twice the
normal size (Figure 29.30). Growth hormone levels in these animals were several
hundred times greater than normal. Similar results were obtained in transgenic
mice transfected with the human growth hormone (hGH) gene.
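The low integration frequency has a practical consequence for how many animals must be screened. The following Python sketch is a back-of-the-envelope estimate, not a description of any published protocol. It assumes that each mouse developing from an injected egg independently carries the transgene with probability 0.10 (the "10% or so" quoted above). The function names are hypothetical.

```python
def expected_founders(n_mice: int, p_integration: float = 0.10) -> float:
    """Expected number of transgenic mice among n_mice that developed
    from injected eggs, assuming independent integration events."""
    return n_mice * p_integration


def p_at_least_one_founder(n_mice: int, p_integration: float = 0.10) -> float:
    """Probability that at least one of n_mice carries the transgene.

    Uses the complement rule: 1 - P(no mouse carries the transgene).
    """
    return 1.0 - (1.0 - p_integration) ** n_mice


if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(
            f"{n:>3} mice: expected founders = {expected_founders(n):.1f}, "
            f"P(at least one) = {p_at_least_one_founder(n):.2f}"
        )
```

Under these assumptions, a few dozen newborn mice would typically need to be screened (for example, by Southern analysis) to be reasonably sure of recovering at least one transgenic founder.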
Figure 29.30 × Photograph showing a transgenic mouse with an active rat growth hormone gene (left). This transgenic mouse is twice the size of a normal mouse (right). (Photo courtesy of Ralph L. Brinster, School of Veterinary Medicine, University of Pennsylvania.)
The biotechnology of
transfection has been extended to farm animals, and transgenic chickens, cows,
pigs, rabbits, sheep, and even fish have been produced. The first mammal cloned
from an adult cell, a sheep named Dolly, represented a milestone in cloning
technology. Subsequently, a transgene construct has been used to incorporate
the human gene encoding blood coagulation Factor IX into fetal sheep fibroblast
cells, and the nuclei from these cells were successfully transferred into sheep
oocytes lacking nuclei. These transgenic oocytes were placed in the uterus of
receptive female sheep, which subsequently gave birth to transgenic lambs. The
Factor IX transgene construct was specifically designed so that Factor IX protein,
a medically useful product for the treatment of hemophiliacs, would be expressed
in the milk of the transgenic sheep. Similar successes in cows, which produce
much more milk, have brought the potential for commercial production of virtually
any protein into the realm of reality.
Such genetic engineering
is anticipated eventually to have a major impact on human health. The human
genes encoding the α- and β-globin
chains of hemoglobin have recently been microinjected into fertilized mouse
eggs, and the transgenic mice that developed contained authentic human hemoglobin.
Human Hb isolated from transgenic mouse erythrocytes had an oxygen-binding curve
identical to that of human HbA, demonstrating that functional human hemoglobin
can be synthesized in mice. Transgenic pigs producing human Hb are touted as
a source of "human blood substitute" potentially useful during surgical
procedures. Such transfection technology also holds promise as a mechanism for
"gene therapy" by replacing defective genes in animals with functional
genes (Chapter 13).
Problems concerning integration and regulation of the transfected gene, including
its appropriate expression in the right cells at the proper time during development
and growth of the organism, must be brought under control before such therapy
becomes commonplace.