Molecular Self-Organization and the Origin of Life [**]
By Hans Kuhn and Jürg Waser [*]
A sequence of many small physically and chemically plausible steps that lead to the self-organization of matter is considered. These are governed by periodic temperature changes
and a multi-faceted spatial environment, both of which occurred in suitable locations on
the primordial planet. The specific model described reveals the logical framework of this
process, the nature and locations of fundamental difficulties as well as the means by which
they might be overcome. The barriers that must be surmounted are mostly related to an accumulation of copying errors. An early stagnation barrier was hurdled by aggregate formation, by means of which erroneous copies are rejected; a further barrier was overcome by
the evolution of machinery capable of synthesizing “cellular” envelopes which confine the
building components. A system evolved which produced a primitive “replicase” that stabilized a rudimentary genetic code. A later stagnation phase ended when the functional system
was reorganized by the evolution of separate machinery for replication and for translation
of genetic information. - The purpose of this account is to stimulate experiments and theoretical efforts towards the improvement, refinement, and expansion of the model described.
It also demonstrates the fruitfulness of the present style of approach that leads to assertions
about the prerequisites, logical framework, and organizational structure of evolutionary
processes.
1. The Genetic Blueprint and its Translation
A feature of all living systems is their ability to produce
copies of themselves. They consist of macromolecules
functioning together as an entity in a similar way to the interacting parts of a machine. Living individuals contain
their own blueprint along a nucleic acid strand, in the form
of a specific sequence of four kinds of nucleotides. During
multiplication of an individual this information is copied
by replication of the nucleic acid strand. The blueprint can
be translated into proteins, that is, linear sequences of
twenty kinds of amino acids. This step is accomplished by
means of adapter or transfer ribonucleic acid molecules.
[*] Prof. Dr. H. Kuhn
Max-Planck-Institut für Biophysikalische Chemie
Am Fassberg, D-3400 Göttingen-Nikolausberg (Germany)
Prof. Dr. J. Waser
La Jolla, California (USA)
formerly California Institute of Technology, Pasadena, California (USA)
[**] Based on a lecture presented at the 111. Versammlung der Gesellschaft
Deutscher Naturforscher und Ärzte in Hamburg, September 23, 1980
© Verlag Chemie GmbH, 6940 Weinheim, 1981
For every amino acid a1, a2, a3, ... there exists at least one specific molecule of this type to which it can become attached and
which also carries a specific anticodon nucleotide triplet.
Adapter molecules can in turn couple by complementary
base pairing to a nucleic acid strand that contains the blueprint for the protein in question (messenger ribonucleic
acid). Attachment of the base triplet of an adapter molecule to a base triplet on a messenger nucleic acid strand
can only occur if corresponding base pairs are complementary. The four bases G (guanine), C (cytosine), A (adenine), and U (uracil) are pairwise complementary; bases G
and C can readily form three hydrogen bonds with each
other, while bases A and U can readily form two hydrogen
bonds with each other. Put more simply, if there is a C on
the messenger strand there must be a G in the corresponding place on the adapter molecule, etc.; for example, the
first codon triplet ACU on the messenger strand (read in
the 5'→3' direction) must correspond to an adapter molecule
with the anticodon triplet UGA (read in the 3'→5' direction),
which carries, e.g., the amino acid a1 (Fig. 1). Amino acids
are therefore linked into protein chains with an amino acid
sequence that is determined by the blueprint. These proteins then form the functional structure of the organism.
Angew. Chem. Int. Ed. Engl. 20, 500-520 (1981)
Fig. 1. Simplified version of present-day protein synthesis, guided by a blueprint nucleic acid strand (top), to which adapter molecules carrying the amino acids a1, a2, a3 are attached. Once in favorable juxtaposition the amino
acids can form polypeptide bonds (bottom).
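The pairing rule described above (G with C, A with U, codon read 5'→3' against anticodon read 3'→5') can be illustrated by a minimal sketch. The function names are chosen here for illustration only and do not come from the article.

```python
# Watson-Crick pairing as stated in the text: G pairs with C, A pairs with U.
PAIR = {"G": "C", "C": "G", "A": "U", "U": "A"}


def anticodon_3_to_5(codon_5_to_3: str) -> str:
    """Anticodon written 3'->5' for a codon written 5'->3'.

    Written in opposite directions, position i of the codon faces
    position i of the anticodon, so only the complement is taken.
    """
    return "".join(PAIR[base] for base in codon_5_to_3)


def anticodon_5_to_3(codon_5_to_3: str) -> str:
    """The same anticodon rewritten in the 5'->3' direction."""
    return anticodon_3_to_5(codon_5_to_3)[::-1]


print(anticodon_3_to_5("ACU"))  # UGA, the example given in the text
print(anticodon_5_to_3("ACU"))  # AGU, the same triplet read the other way
```

The first call reproduces the ACU/UGA example of the text; the second only shows why the reading direction has to be stated together with the triplet.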
Errors during the copying of the blueprint may cause
changes in the proteins. Such errors are usually disadvantageous, but in rare cases lead to improved survival
chances of the changed individuals. Those individuals survive
that are best adapted to the environment. Such progressive
adaptation to the environment represents a process of
“learning” for the system that requires many generations.
2. The Method of Model Paths
How could simple systems of this kind, which learn by
evolution, arise? How could machinery for the translation
of genetic information that contains proteins as essential
translation products arise? How can this chicken-and-egg
problem be solved? Can the laws of physics be applied towards its understanding?
The appearance of the first system capable of learning
represented a jump in quality, in which a fundamental
property of matter suddenly manifested itself. Systems began to be carriers of information, of a meaningful message,
with a content capable of growing as the learning process
advanced. Prior to this not even the faintest trace of this
property existed, but once the breakthrough had occurred,
the process of learning went on inexorably via the continued confrontation of evolving systems with their surroundings and their adjustment to environmental changes by
multiplication, mutation and selection.
Experimental data capable of directly showing how this
manifestation (the sudden appearance of learning machines and the slow evolutionary process leading to the genetic apparatus of present-day biological systems) could
have come about are not available. The search for such
model paths is important not only as an aid in recognizing
these astonishing phenomena as a consequence of plausible physical processes, but also to indicate particularly important steps and in this way to stimulate experiments that
might be fruitful. The task of indicating promising experiments
is an important aspect of the ideas discussed here,
and it is therefore of value to describe the models in a concrete and specific manner.
In more general considerations of self-organization it is
easy to overlook the crucial difficulties, and the detailed
consideration of an imaginable path constitutes a method
of avoiding this problem. However, it cannot be expected
that the model steps considered furnish an accurate description of the events that actually occurred.
It is also important not to lose sight of the logical relationship between the different steps, a relationship that is
easily buried in the necessarily cumbersome details required for an adequate description of the steps. Each step
only leads to the next, and so the overall logical structure
of the model is not apparent until the very end (see Section
14, Fig. 24). It is this overall structure that represents the
essence of the model, a structure that is not affected, even
if some of the details in the steps may have to be
changed.
The methodological program for recognizing and understanding the grand connections in the process of self-organization, which we describe here, therefore involves consideration of specific paths consisting of many simple
steps [1, 2]. A good illustration is the description of how a
translation apparatus might have arisen. The presentation
is kept simple in order to point out decisive relationships
in the clearest possible way.
A machine is constructed by fitting its parts together via
external directed action. In a similar way, molecular functional cooperatives can be produced artificially, and this is
the aim of the Abteilung für Molekularen Systemaufbau in
the Max-Planck-Institut für Biophysikalische Chemie. Molecules are forced together in a planned way by directed
external action. For example, if suitable molecules on the
surface of liquids are pushed together, the monomolecular
layers thus formed aggregate, to produce the functional cooperative [3]. On the primordial planet the role of the experimentalist is replaced, in a way, by the enormous variety of environmental influences.
3. Some Results of Prebiotic Chemistry
The most important components of our model for the
origin and the earliest steps of life are amino acids, ribose,
and the nucleotide bases G, C, A, and U. These substances
were presumably present on the primordial planet and
might have accumulated in particular regions by natural
concentration processes, such as evaporation of an aqueous solution and redissolution of the residue, or by adsorption and desorption. Using simulations of conditions
thought to have existed on prebiotic earth, many researchers
have been able to obtain nucleotide bases, sugars,
and amino acids from the gases CH4, CO2, H2O, N2, and
NH3, which presumably formed the reducing atmosphere
of the planet [*]. It has also been possible to demonstrate
that these compounds can be made to yield nucleotides
and oligonucleotides, on the one hand, and activated
forms of amino acids on the other, under conditions believed to be realistic. Moreover, Orgel [21] has recently succeeded in the enzyme-free polymerization of nucleotides
on nucleic acid templates, more than 90% of the nucleotides of the replicate strands being complementary to those
on the template strands. Some important results of this
work are summarized in Scheme 1, in steps a-m.
[*] It is also quite possible that meteorites, known to contain nucleobases
and amino acids, were the initial sources of these building blocks [26-31].
They could have accumulated in particular locations by adsorption and desorption [32].
[Scheme 1, flow of the diagram: reducing atmosphere (CH4, CO, CO2, NH3, N2, H2O) → electrical discharge → HCN, HC≡C-CN, CH2O → purines, pyrimidines, amino acids, ribose → nucleosides → nucleotides → nucleoside 5'-phosphoimidazolide → oligonucleotides → oligonucleotides of template-induced sequence; amino acids → activated amino acids → oligopeptides.]
Scheme 1. Possible processes on the prebiotic earth. a) to m) see text.
a) Electrical discharges in mixtures of these gases have led
to HCN, H2, CH2O, propynenitrile, and hydrocarbons.
b) Adenine and guanine have been obtained by cyclic oligomerization of HCN and hydrolysis.
4 HCN → (intermediate) + HCN → Adenine
c) Cytosine has been prepared by the reaction of propynenitrile and urea.
HC≡C-CN + (H2N)2CO → Cytosine
d) Ribose has been obtained from formaldehyde in the
presence of alumina and kaolinite [11].
e) 14 of the 20 amino acids that are protein components
could be obtained by electrical discharge and Fischer-Tropsch synthesis in the presence of solid catalysts
(nickel-iron, magnetite, clays), and by Strecker synthesis [12, 13].
RCHO + HCN/NH3 → RCH(NH2)-CN; + H2O → RCH(NH2)-COO(-)
f) Nucleosides have been prepared by the evaporation of
aqueous solutions of purines and ribose (or 2-deoxyribose) containing magnesium chloride [14].
g) Heating nucleosides with inorganic phosphates and
urea, in the presence of magnesium salts, has yielded
mononucleotides (5'-triphosphates). In the absence of
Mg2+ a mixture of 5'-, 3'-, and 2'-phosphates was produced [15].
h) Adenosine-(Ado-)oligophosphates such as ATP could
be converted into nucleoside 5'-phosphoimidazolides
by the evaporation of aqueous solutions containing
MgCl2 [16].
i) Adenosine and uridine 5'-phosphoimidazolides have
been shown to form oligonucleotides with five and
more chain members in a reaction catalyzed by Pb2+
ions.
k) Several cases of template-induced polymerizations of
nucleotide derivatives have been observed. Of particular interest are those of nucleoside 5'-phosphoimidazolides, presumed to be available under prebiotic conditions (see h). Guanosine 5'-phosphoimidazolide
(ImpG) has been polymerized on a polycytidylic acid
template in the presence of Zn2+, forming chains of
30-40 members that are predominantly connected in
the 3',5'-positions, in an analogous way to the bonding
in biological nucleic acids.
When a mixture of ImpC and ImpG and a polycytidylic acid template are used, ImpG (that is, the compound containing the base complementary to the base
in the template) is incorporated with high selectivity
into the growing chain.
l) Adenosine (Ado-) 5'-phosphoimidazolide has been demonstrated to form aminoacyl adenylates [16].
Im-P(O)(O(-))-O-Ado + H3N(+)-CHR-COO(-) → H3N(+)-CHR-CO-O-P(O)(O(-))-O-Ado + imidazole
m) Aminoacyl adenylates have been polymerized in aqueous solution in the presence of specific clays [23-25].
The results in Scheme 1 support the assumption that
short strands are able to replicate under special conditions,
and it therefore seems promising to search for such conditions. Attempts to achieve template-induced polymerization using deoxyribose instead of ribose were unsuccessful, a result that supports the assumed model, i.e., that
the first carrier of genetic information was RNA and that
DNA came into the picture at a later stage, when a genetic
apparatus for producing enzymes became available.
The results demonstrate that prebiotic synthesis of energy-rich nucleotide derivatives, oligonucleotides, and activated amino acid derivatives required solid-state reactions
as well as reactions in aqueous solution and in the gas
phase. It is plausible that such substances could have accumulated on primeval earth only in particular locations
where a multitude of special conditions were fulfilled that
allowed a succession of very different reactions to occur;
this requires highly diversified and structured regions.
4. First Steps in the Origin of Life
Chain molecules consisting of two kinds of complementary monomers in a random sequence are considered here
as examples of very simple systems with the property of
producing copies of themselves. Suppose that such strands
may have arisen occasionally by the accidental condensation of monomers. Such strands can then replicate by serving as templates (Fig. 2).
Fig. 2. Replication along a template strand. The template strand and its replicate form a double helix.
Monomers in the solution can attach themselves to complementary monomers along the
given strand. The monomers become interlinked and form
a second strand, the replicate or (-)-strand. The two
strands separate; on the new strand a new replicate forms
that is identical with the original or (+)-strand. Repeated
strand replication can, of course, take place only with suitable kinds of monomers and requires appropriate conditions, such as highly specific periodic temperature
changes. Since the primordial planet exhibited an immense
variety of environmental conditions, it is almost certain
that the required temperature variations existed in small
localized regions.
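The replication cycle just described, a (+)-strand giving a (-)-strand that in turn gives back the (+)-strand, can be sketched in a few lines. The sketch assumes two complementary monomers labeled G and C and an antiparallel replicate (the 5'-end of the copy lies at the 3'-end of the template); the example sequence is arbitrary and not taken from the article.

```python
# Two complementary monomers; G and C are used as labels (assumption).
PAIR = {"G": "C", "C": "G"}


def replicate(template_5_to_3: str) -> str:
    """Return the replicate strand, written 5'->3'.

    The copy runs antiparallel to the template, so the template is read
    backwards while each monomer is replaced by its complement.
    """
    return "".join(PAIR[m] for m in reversed(template_5_to_3))


plus = "GGCGCCG"                 # arbitrary illustrative (+)-strand
minus = replicate(plus)          # first generation: (-)-strand
plus_again = replicate(minus)    # second generation: identical to (+)

print(minus)
print(plus_again == plus)
```

Two successive replications restore the original sequence, which is the sense in which the new replicate is "identical with the original or (+)-strand".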
We may thus imagine a short nucleic acid strand on primordial earth containing the two complementary bases
guanine and cytosine that are anchored together in the
strand by ribose and phosphate groups and are capable of
interlinking by three hydrogen bonds. It is further supposed that this short strand has diffused into one of the
special regions mentioned, after having been formed elsewhere by accidental condensation during the drying of a
solution of monomers. In this strand, all of the monomers
are presumed to be linked in such a way that the strand
can serve as a precise template for replication, the chain
members being associated by 3',5'-linkages. During replication, a double helix is formed, as in all nucleic acids present in biosystems. The special aspects of such a template
strand are that its monomers are spatially situated in a way
that is appropriate for the attachment of replicate monomers, which are then in a favorable position to interlink into
the daughter strand. A helical arrangement favors rapidity
and precision of chain replication, because the environment of each new building block that is to be attached is
the same as that of the preceding block, resembling the situation of a sequence of steps in a spiral stairway. The
bases of neighboring nucleotides along the template and
growing daughter strands are stacked, so that each added
nucleotide increases the energetic stability of the growing
double helix, a stability that involves interactions with solvent molecules as well as between the stacked bases. The template-directed
polymerization runs along the 3'→5' direction of the template strand, while the direction of the daughter strand is
in the opposite sense, its 5'-end being at the 3'-end of the
template strand.
All of this requires monomers of the same chirality. It is
accidental which chirality the original strand has, since the
process would work just as well if the mirror images of all
monomers were used. However, the appearance of an appropriate template strand is decisive for the chirality of all
subsequent daughter strands. Hence, it is not surprising
that the chirality of the building blocks in all living systems
of a given kind is the same.
By considering a solution of the different monomers that
may have existed on earth at the location considered (see [2],
Section 18.1.4.1), it is possible to estimate the probability
with which the very special original template strand can be
formed through accidentally correct condensation. In this
way, the plausibility of such a step can be justified for
strands containing, for example, ten monomeric
units. There should be one correct strand in approx. 0.1
mmol of strands of randomly condensed monomers, i.e., in
one of $10^{20}$ strands (cf. also Section 17). For longer strands
the probability that all monomers are linked in the correct
way is much smaller, so that model considerations must
begin with short strands.
The probability of the spontaneous appearance of a correct strand of 10 monomers may be compared with that of
obtaining an unbroken sequence of 26 sixes when throwing dice. The latter probability is $(1/6)^{26}$, or also about
$10^{-20}$. The example may clarify the proposition that by
trying one's luck with a sufficient number of dice even
such an exceedingly improbable event can be made to occur with near certainty. Thus, when $10 \times 6^{26} \approx 10^{21}$ dice are
used simultaneously, the event is expected to occur for
about $10^{-20} \times 10^{21} = 10$ dice. It can therefore be assumed
with a probability close to unity that at least one of the
dice used shows a six in each of 26 successive throws
(more exactly, with a probability of $1 - [1 - (1/6)^{26}]^{10 \cdot 6^{26}} = 1 - (1/e)^{10} = 0.99995$).
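The dice estimate can be checked numerically. The following minimal sketch uses only the numbers given in the text (26 throws, ten times $6^{26}$ dice); the variable names are illustrative. Because the single-die probability is far below floating-point resolution relative to 1, the expression $1 - (1 - p)^N$ is evaluated through `log1p` and `expm1` rather than directly.

```python
import math

THROWS = 26                       # unbroken run of sixes required
p_single = (1 / 6) ** THROWS      # probability for one die, of order 10^-20
n_dice = 10 * 6 ** THROWS         # number of dice used, of order 10^21

# Expected number of dice that show 26 sixes in a row.
expected_successes = n_dice * p_single

# 1 - (1 - p)^N, computed stably as -expm1(N * log1p(-p)).
p_at_least_one = -math.expm1(n_dice * math.log1p(-p_single))

print(f"single-die probability: {p_single:.2e}")
print(f"number of dice:         {n_dice:.2e}")
print(f"expected successes:     {expected_successes:.3f}")
print(f"P(at least one):        {p_at_least_one:.5f}")
```

The expected number of successes comes out as 10 and the probability of at least one success agrees with $1 - e^{-10}$, the value quoted in the text.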
By continued replication many such strands are formed,
once an appropriate template strand has arisen and has accidentally diffused into a favorable location in which the
conditions required for replication prevail. Disregarding
processes of loss there should be 2 strands after one generation, $4 = 2^2$ strands after 2, and $2^n$ strands after n generations. Eventually, a stationary state is reached in which as
many strands are created by replication as disappear by
losses, e.g., by strand diffusion from the favorable region.
5. Barrier Caused by Too Many Errors
Every so often, strands become lengthened by the accidental condensation of two short strands. This can readily
occur, in contrast to the spontaneous formation of a longer
strand by the direct linkage of monomers. Longer strands
diffuse more slowly and thus have better chances to stay in
the favorable region than short ones. As time passes, ever
longer strands are made and the shorter ones disappear.
However, with growing strand length the probability increases that, during replication, a monomer is built in that makes
the daughter strand “lethal”, preventing renewed replication
of the new strand, e.g., because the monomer contains an
incorrect sugar. Hence there is an upper limit to feasible
chain length extension and, as shown by quantitative considerations, this is reached with about 50 monomers (see [2],
Section 18.1.4.1 therein).
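Why the chance of a lethal defect grows with strand length can be made concrete with a simple sketch. It assumes that lethal errors occur independently and with the same probability at every position; the rate of 1% per monomer is purely hypothetical and is not the value used in the quantitative treatment cited above.

```python
def viable_fraction(length: int, lethal_error_rate: float) -> float:
    """Fraction of replicates of `length` monomers without any lethal monomer.

    Assumes independent errors with the same rate at every position.
    """
    return (1.0 - lethal_error_rate) ** length


ASSUMED_RATE = 0.01  # hypothetical per-monomer rate, for illustration only

for length in (10, 50, 100, 200):
    fraction = viable_fraction(length, ASSUMED_RATE)
    print(f"{length:4d} monomers: viable fraction {fraction:.3f}")
```

Under these assumptions the viable fraction falls geometrically with length, which is the qualitative reason for an upper limit on chain length; the actual limit of about 50 monomers follows from the considerations in the cited reference, not from this sketch.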
It can also happen that a non-complementary nucleotide
(e.g., G instead of C) is incorporated as monomer during
replication. The new strand can still be used as a template
for replication, but its nucleotide sequence is different
from that of the parent strand. In time, strands with all
possible sequences are therefore formed.
Strands may fold into specific conformations by internal
pairing of complementary bases, provided the base sequence is appropriate [*]. Such conformations can confer
selective advantages upon existing forms, as the common
aspects of similar strands will be called. Some conformations make strands resistant against chemical attack, for
example, by the mutual protection of adjoining regions.
However, any given conformation, such as that of a partial
hairpin (Fig. 3), is tied to an exact sequence of nucleotides
and is therefore lost by almost any new error. If a favorable folding conformation has arisen by accident, it is soon
lost because of the high level of replication errors with 50
monomers, the approximate number for any reasonably
sophisticated folding conformation.
The model thus leads to difficulties at this point. An insurmountable limit of the amount of information that can
be transferred by replication appears to have been
reached. It is hard to see how an accumulation of errors
can be avoided in such first self-reproducing systems, how
a form does not “forget” what it has “learned”.
[*] For transfer RNA, such conformations have been established by X-ray
structure determinations [33]. The melting of different regions of single
strands, upon temperature increase, has been investigated by examining
high-resolution 1H-NMR spectra that reveal gradual transitions from folded
to unfolded conformations [34]. Upon cooling, the original folding conformations establish themselves error-free. The change between the unfolded
single-stranded and the double-stranded forms of nucleic acids occurs between about 30 and 100 °C, and one may count on the existence of such temperatures on prebiotic earth.
Fig. 3. Nucleotide sequence that permits a partial hairpin conformation.
6. Faulty Replicates Rejected During Aggregate Formation
This barrier can, however, be surmounted by a very simple mechanism that nonetheless has far-reaching consequences within the model. Consider a strand that is capable of assuming the conformation of a hairpin along its entire length, with the hairpin bend at the middle of the
strand because of an appropriate sequence of nucleotides
(Fig. 4). A strand that arises by the template replication of
a hairpin is automatically capable of again assuming a
hairpin conformation (cf. Fig. 13). From detailed examinations of space-filling models it turns out that in a suitable
medium, such hairpin strands can form aggregates with
amazing precision (Fig. 5). The precision of aggregation is
such that faulty hairpins are rejected during aggregate formation, so that in this way an all-important error-filtering
mechanism is brought into being. The aggregates may be
stabilized by suitable bivalent cations, such as Ca2+ ions,
that hold together negatively charged phosphatidyl groups
on the outside of neighboring hairpin strands. The number
of monomers considered to comprise such hairpin strands
roughly corresponds to that of today's transfer ribonucleic
acids (70-80 monomers), a point that is significant and to
which we will return.
Fig. 4. Left: Possible arrangement of bases G and C that permits a hairpin
conformation along the entire strand. The supposition that early strands contained, mainly or exclusively, just these two nucleotides is made plausible by
comparative studies of nucleotide sequences of different transfer ribonucleic
acids by Eigen and Winkler [41]. Middle: Schematic drawing of hairpin
strand conformation in which the “legs” of the hairpin form a double helix.
The outline is that expected for the van der Waals radii. Right: Molecular
model of hairpin strand. For clarity the upper portion is shown as a ball-and-stick
model, the lower portion as a space-filling model. Double helix is left-handed as recently found by Rich et al. [35] and Dickerson et al. [36] in an X-ray crystal analysis of guanine-cytosine oligonucleotides, and by Arnott et al.
[37] in GC DNA fibers. A left-handed double helix had been discussed by
Pohl and Jovin [38] as the result of a phase transition at high salt concentration.
Fig. 5. Aggregation of two convoluted hairpin strands.
The survival chances of strands in an aggregate are increased by their mutual protection against chemical attack,
or because it may be less easy for aggregated strands to
leave the favorable region. Aggregates may undergo self-reproduction. Suitable temperature and other environmental changes may cause their disassembly into component
strands. The individual strands may replicate and again assume folded conformations. They are thus able to diffuse
and again aggregate through favorable accidental collisions, increasing the number of aggregates. All of
this requires a very specific and highly detailed program of
temperature changes that must be periodically repeated for
repetition of the process described. Convolution of strands
and aggregation are favored by cooling, disassembly of
aggregates and deconvolution of strands by warming.
However, a myriad of cyclic temperature programs existed
on the primordial planet (Fig. 6), and it is almost certain
that an appropriate program was realized at least once, in
a region that need not have been larger than about 1 mm
across (see [2], Section 18.1.4.2).
Fig. 6. a) Multiplication of aggregates by disassembly into strands, replication of the strands, and reaggregation. b) Schematic temperature variation required for strand replication. At the highest temperatures the strands are uncoiled. As the temperature decreases, internal base-pairing occurs and the
strands convolute, taking up, e.g., the conformation of a hairpin. In this conformation, replication can start only at an end of the hairpin and progresses from there as the temperature rises and internal base pairing begins to
weaken. The aggregation of convoluted strands and the later disassembly of
aggregates requires an additional superimposed cyclic temperature variation.
c) Realization of periodic changes in temperature by shadow-casting rocks in
a region of small linear dimensions (e.g., 1 mm). The rocks are immersed in a
broth of energy-rich monomers.
It is of importance that the component strands do not
diffuse too far from each other during the multiplication
phase, because they would be lost and reaggregation could
no longer occur. The phenomena discussed must therefore
have taken place in a confined space, such as in small
pores of a rock formation. The pore walls kept the strands
together during their diffusion, so they were able to locate
each other again and could form new aggregates. The rock
formation is presumed to be inundated by a solution of
suitable energy-rich monomers that could easily diffuse
through the pore channels, while the strands formed were
largely retained by them (Fig. 7). Neighboring pores are invaded
by strands, and the descendants of the aggregates
formed in the original pore slowly spread through the porous material. Detailed considerations lead to a pore diameter of about 500 nm or 5000 Å (see [2], Section 18.1.4.2d).
For purposes of comparison, this is of the order of magnitude of a bacterium.
Fig. 7. Pores with channels, into which monomers can readily diffuse while
strands capable of forming aggregates and serving as templates for replication are mostly held back.
Aggregates such as those described offer crucial selective advantages because they are capable of invading
larger pores than single strands. Additionally, such advantages would result from machinery that facilitated the assembly of convoluted strands into aggregates. Both the disassembly of aggregates and their later reassembly must
have come about easily and quickly. Linear aggregates
consisting of essentially identical component parts are especially favored by these requirements, because they can
fall apart suddenly (as would not be the case for three-dimensional aggregates), and any one of the component
parts can reaggregate with any other one. The requirements would be met by the aggregates of hairpin strands
just discussed.
7. The Assembler
A large selective advantage would be inherent in machinery that speeds the assembly of convoluted hairpin
strands. A model of a simple mechanism that would serve
this purpose consists of an extra unfolded strand that has
become attached to one of the hairpin strands (Fig. 8, left).
The extra strand is assumed to be capable of acting as a
collector strand that guides other hairpin strands to the location of growth of the aggregate; in essence, the collector
strand converts the spatial diffusion of the folded strands that are to be
aggregated into a one-dimensional diffusion along the
open strand (Fig. 8, middle). They may, for example, migrate from one place of attachment on the collector strand
to the next. The carrier of the collector strand acts as an initial nucleation site in the formation of the picket fence-like
aggregate, serving also as its endpost (Fig. 8, right). The
orientation of the hairpin strand forming this endpost is
opposite to that of the other hairpin components of the aggregate.
Fig. 8. Mechanism facilitating the aggregation of hairpin strands. Left: The
hairpin strand on the far left is turned upside down and has an unfolded
strand attached to it that can function as collector strand. Right: Picket
fence-like aggregate formed in this way.
Fig. 9. Mechanism facilitating aggregation acts as an error filter. Top: Strand
with error that convolutes to faulty hairpin which diffuses along the collector
strand. Middle: Faulty hairpin does not fit into the aggregate and is rejected,
a correct copy taking its place (bottom).
Figure 9 shows, in detail, how the aggregate formation
by the assembler just described causes the all-important rejection of erroneous copies and in this way prevents the accumulation of replication errors. The consequence of this
error-filtering action is that the hairpin components of an
aggregate are essentially error-free. Without it, a hairpin
formed by accident would lose the information “hairpin”
within a few generations. The components of an aggregate
“co-exist” in this sense; they form a functional cooperative
that survives or dies as a whole and that evolves as an entity.
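The logic of the error filter can be sketched as follows. In the model, rejection results from the steric misfit of a faulty hairpin in the aggregate; the sketch replaces this by a much cruder criterion, assumed here only for illustration: a copy is accepted if every base in one leg is complementary to its partner in the other leg, with an unpaired bend of three nucleotides in the middle. The sequences and function names are hypothetical.

```python
PAIR = {"G": "C", "C": "G"}


def is_full_hairpin(strand: str, bend: int = 3) -> bool:
    """True if the strand can pair with itself along its entire length,
    leaving `bend` unpaired nucleotides in the middle."""
    leg, remainder = divmod(len(strand) - bend, 2)
    if leg < 1 or remainder:
        return False
    # Base i of the first leg must pair with base i counted from the far end.
    return all(PAIR.get(strand[i]) == strand[-1 - i] for i in range(leg))


def filter_copies(copies):
    """Keep only the copies that fold into a faultless hairpin."""
    return [c for c in copies if is_full_hairpin(c)]


correct = "GCGGC" + "GCG" + "GCCGC"   # leg, bend, complementary leg
faulty = "GCCGC" + "GCG" + "GCCGC"    # one replication error in the first leg

print(filter_copies([correct, faulty, correct]))
```

Only the faultless copies pass, so a population that is repeatedly sent through such a filter does not accumulate replication errors in its legs.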
In the proposed molecular model the hairpin strands are
attached to the collector strand by base-pairing of triplets
of complementary nucleotide pairs, as shown schematically (Fig. 10). The model exhibits an astonishingly precise
fit between neighboring hairpin strands (Fig. 11 and 12).
Fig. 10. Details of base pairing in aggregate being formed. Note attachment
of hairpin strands to collector strand by triplets of complementary bases.
Fig. 12. Details of the excellent fit between the different hairpin strands in a
picket fence-like aggregate and between the base triplets at the hairpin bends
and complementary triplets along the collector strand (inset). No equally
good fits seem to be possible with hairpin strands in the usual right-handed
helical conformation.
Fig. 11. Spatial arrangement of strands shown in Fig. 10. Hairpin ‘‘legs’’ are
twisted into double helices and outlines are as expected for van der Waals
contacts.
Equally precise is the fit among the stacked bases in the
triplets of complementary base pairs just mentioned. The
excellent fit shown in the figures appears to be ruined in
models in which the hairpin strands are attached by more
than three bases to complementary bases on the collector
strand. On the other hand, at least three nucleotides are
needed to form the 180° bend of a hairpin. Finally, there is
an excellent fit between the hairpin strand of opposite
orientation at the beginning of the aggregate and the first
hairpin strand adjoining it. All fits appear to require that
the 3’5’-strand directions are as indicated in Figure 10.
An important feature of this model is that both (+)- and
( - )-strands can be used as components. This feature permits great economy in the use of strand material and in the
Fig. 13. (+)- and (-)-hairpin strands. Middle: Upon completion of replication, with bases paired. Left: (+)-Strand in hairpin conformation. Right:
(-)-Strand in hairpin conformation.
assembly of aggregates. Except for the bases in the middle,
which are complementary to each other, the (+)- and (-)-strands are identical (Fig. 13).
The nucleotide in the first position of the triplet at the
hairpin bend could, for example, be C in (+)- as well as
(-)-strands; in the third position it then has to be G in
both strand types. At the corresponding first positions of
the collector strand there would then always have to be G
and at the third positions C. The midpositions can be randomly occupied by G or C. If a midposition of the collector strand contains G, then a (+)-strand can attach itself,
and in the other situation a (-)-strand. In this way, a simple “reading frame” would be established. A “word” on
the collector strand would always begin with G and end
with C.
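The reading-frame rule just described can be sketched in a few lines of Python. This is only an illustration under the stated G/C-only assumption; the function name `read_collector` and the example strand are hypothetical and not part of the model as published.

```python
def read_collector(strand: str) -> list[str]:
    """Split a G/C collector strand into G-N-C words and report,
    for each word, which hairpin type could attach to it."""
    if len(strand) % 3 != 0:
        raise ValueError("strand length must be a multiple of three")
    attached = []
    for i in range(0, len(strand), 3):
        word = strand[i:i + 3]
        # A "word" must begin with G and end with C; the midposition is G or C.
        if word[0] != "G" or word[2] != "C" or word[1] not in ("G", "C"):
            raise ValueError(f"{word!r} at position {i} breaks the G-N-C reading frame")
        # Midposition G attracts a (+)-strand, midposition C a (-)-strand.
        attached.append("+" if word[1] == "G" else "-")
    return attached


print(read_collector("GGCGCCGGC"))  # ['+', '-', '+']
```

A strand that is shifted by one base no longer parses, which is the sense in which the fixed first and third positions establish the frame.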
The bonding energies of base pairing are insufficient at
room temperature to establish stable bonding between two
triplets of bases. It is therefore reasonable to imagine that
a hairpin strand is first attached only very loosely to the
collector strand. Each newly acquired hairpin strand can
then move back and forth along the collector strand until it
reaches the growth region of the aggregate, where its attachment is stabilized by close fit to the hairpin strand preceding it on the aggregate. The model therefore permits the
postulated one-dimensional diffusion along the collector
strand. The new hairpin strand is firmly incorporated if its
triplet contains the appropriate bases that must be complementary to the corresponding bases on the collector
strand; it is discarded if this is not the case. Each newly incorporated hairpin strand provides additional stabilization
to the portion of the aggregate that already exists. If this
mechanism is to function as described, the binding energy
for the lateral interactions between a newly incorporated
hairpin strand and the one preceding it must be approximately equal to the binding energy of its base triplet and
the corresponding triplet on the collector strand. If this energy were smaller, there would be no firm incorporation at
room temperature. If it were larger, any hairpin strand
would be firmly incorporated regardless of whether its
base triplet were complementary to the corresponding triplet on the collector strand or not.
The direction in which the triplets are read along the
collector strand is the same 5’3’-direction in which today’s
genetic blueprint, the strand of messenger ribonucleic acid,
is read. Similarly, the triplets of the hairpin strands of the
aggregate are in the opposite 3’5’-direction, as are the anticodon triplets of today’s transfer ribonucleic acids. It is
quite striking that the excellent fit of the hairpin strands in
the aggregate is just as one would wish from the model,
and it is therefore easy to imagine that the collector strand
is the primordial form of today’s messenger ribonucleic
acid. The hairpin strands would be the primordial forms of
the adapter or transfer ribonucleic acid molecules mentioned at the beginning. As mentioned earlier, they would
contain about the right number of nucleotides. The aggregates would represent the primordial forms of the translation apparatus.
The experimental realization of such aggregates is
of considerable importance, and the search for conditions (high ionic strength or suitable solvent) that would
favor the folding of strands of GC ribonucleic acid into the
hairpin conformation of our model, the Arnott-Rich-Dickerson left-handed double helix, would be of interest[*]. The
conditions (strand length and medium) must be chosen in
such a way that stabilization by the lateral fit of neighboring hairpin strands would be of exactly the correct magnitude. Under these circumstances formation of the described aggregates is predicted[**]. It can also be imagined
that the interlocking of just two hairpin strands and their
attachment to the collector strand results in an aggregate
of sufficient stability to facilitate further aggregation. The
nucleation center at the end of the collector strand would
then be unnecessary and could be omitted. Such systems
would also represent fruitful research targets.
The view that the collector strand is the primordial form
of the carrier of genetic information is supported by sequence analyses of the DNA of viruses, procaryotes, and
eucaryotes by Shepherd[40]. The clear periodic correlation
found indicates that the reading frame PuNPy (Pu = purine, like G; Py = pyrimidine, like C; N = purine or pyrimidine) originally existed, vestiges of which are still in
evidence.
The concept that the hairpin strands are the primordial
forms of transfer ribonucleic acids is supported by the
most recent studies of Eigen and Winkler[41], who have
found a great similarity in the sequences of the different
transfer nucleic acids they compared. These findings led
them to an ancestral sequence that is therefore derived
from empirical data. It exhibits a certain symmetry in the
sense that it permits a strand with this sequence to convolute into a hairpin conformation (or into the cloverleaf
conformation considered by Eigen). This result makes the
model so far described even more attractive, but the question still remains as to how the machinery described could
undergo the astonishing transformation into an apparatus
capable of translating nucleic acid sequences into amino
acid sequences or proteins.
[*] Recently, J. H. van de Sande and T. M. Jovin of the Max-Planck-Institut für Biophysikalische Chemie, Göttingen, have found evidence of
a conformational change in GC ribonucleic acids that suggests a
transition to a left-handed helix, as postulated. In an aqueous solution of
GC-RNA containing NaClO4 and 20% ethanol, a concentration increase of
the salt from 4.8 mol liter⁻¹ to 6 mol liter⁻¹ causes a change in the sign of
the circular dichroism (CD) maximum at 284 nm (from + to -), while the
maximum is shifted to 294 nm. The CD spectrum of GC-RNA at the higher
salt concentration is very similar to the spectrum of the left-handed Arnott-Rich-Dickerson form of GC-DNA. The change in conformation is associated
with a marked increase in tendency towards aggregation, just as would be expected on the basis of our model (personal communication). In the methylated polynucleotide poly(dG-m5dC) the transition into the left-handed
helix conformation can be induced at a Mg2+ concentration three orders of
magnitude lower than that required for the unmethylated polymer, i.e. close
to usual physiological conditions (M. Behe and G. Felsenfeld, Proc. Natl.
Acad. Sci. USA 78, 1619 (1981)).
[**] Current views of possible DNA and RNA double helix conformations
and their dependence on the nucleotide sequences are still in considerable
flux and further stable conformations may well be found (cf. e.g. the DNA
conformation recently proposed by Hopkins [39]). What is of importance to
our model is the close fit of neighboring hairpin strands, and it makes no difference whether a good fit can be achieved by left-handed or right-handed
double helices or by yet another conformation.
8. Catalytic Activity of Aggregates
Let us proceed to develop the model further and see
where our considerations lead. How could the aggregates
consisting of collector strand and hairpin strands develop
catalytic properties for use in protein synthesis?
It was noted earlier that errors in the nucleotide sequence of a hairpin strand can affect its conformation and
in this way prevent it from becoming part of an aggregate (cf.
Fig. 9). However, not all errors are important in this regard, only those that affect lateral contacts between neighboring strands in an aggregate. Errors at the hairpin ends
are of little or no consequence with regard to aggregation,
and these ends can easily open up when replication errors
disturb the pairing of the first and last two bases (Fig. 14).
Fig. 15. Catalysis of polypeptide formation. Top: Hairpin ends exhibit an affinity towards activated amino acids that, once attached, can form polypeptide bonds. Bottom: The completed polymer is released.
Fig. 14. Hairpin strands in which the last few bases at the strand ends are not
complementary and for this reason are not paired. The open hairpin ends do
not affect the precision with which the hairpin strands fit together in the aggregate.
In fact, even in the earliest hairpin strands these ends
could have remained open: Open ends would only be of
help in permitting the start of replication at a strand end
(cf. Fig. 6b), and would impart selective advantages upon
such strands by preventing the confusion that would arise
if strand replication began simultaneously in several locations.
We now postulate that amino acids, linked to appropriate activating groups, became attached to these open
ends. The existence of amino acids on the primordial planet can readily be assumed, because they are easily obtained
in simulations of prebiotic conditions (Section 3). Once attached to the hairpin ends of an aggregate these amino
acids are assumed to readily form mutual polypeptide
bonds, being held in close proximity by the hairpin ends
and satisfying other steric conditions. The polypeptide
chains formed are then released and the entire process repeated (Fig. 15).
In our model, polypeptides consisting of glycine and alanine, the two amino acids that are produced in the largest
amounts in prebiotic simulations, bestow great advantages
upon forms producing them: Being hydrophobic and capable of agglomeration they formed impediments in pore
channels and in this way slowed down diffusion, making
possible the colonization of pores with wider channels
Fig. 16. Polypeptides serving as impediments to diffusion in pore channels.
than before (Fig. 16). At a later stage, they coalesced into
some sort of “cellular” envelopes that again furthered the
propagation of forms producing them (Fig. 17). The envelopes presumably acted as nets, allowing mononucleotides
to enter and preventing nucleic acid strands from leaving[*]. This requirement can be fulfilled by agglomerated
polypeptides, but not by lipids, which therefore are expected to be useful and to become cell membrane constituents only at the time a specific semipermeability is required[**].
[*] The significant function of a polymer envelope as a barrier against diffusion has already been recognized by Oparin [42], who investigated coacervates. Coacervates form a pre-existing structure, as is the case for the porous
rocks in the present model. Note, however, that in our model the polymer envelopes have the function of liberating the evolving system from the pre-existing porous structure. Their position in the logical framework of our model
is thus completely different from that of the coacervates in Oparin’s picture.
If one assumes (together with Oparin) that polypeptide envelopes were available even at the beginning, then no selective advantages would have accrued
in aggregates that might have accidentally arisen with catalytic properties towards the formation of envelope components. No selective pressure in the
direction of envelope-producing systems would have existed in such a situation.
[**] The polypeptide envelope could also be endowed with certain catalytic
properties towards the hydrolysis of nucleic acid strands, accelerating the degradation of erroneous copies that are not included in aggregates. The liberated mononucleotides would be available for strand synthesis and selective
advantages would be gained.
Fig. 17. Polypeptides serving as “cellular” envelopes.
Note that the formation of stable structures such as impediments and envelopes requires monomers, such as amino acids, with properties that differ from those of the
monomers that make up the strands. The reason is that
these structures must remain intact during the environmental changes that drive the replication of aggregates and
strands.
Any change that permits a liberation from the region of
small pores, increases the possibilities for multiplication
because the changed form does not face competition in the
new region. This effect, the advantage of the more complex
form in the colonization of new domains, leads to continued evolution in the direction of more complex forms. The
increase in refinement of evolving forms is a necessary
consequence of the variety and the multi-faceted wealth of
environmental conditions. Without such variety, no selective gradients would have existed and there would have
been no evolution.
9. The Translation Apparatus
The evolution of polypeptides, that serve as impediments and later as envelopes of increasing complexity,
leads to another important development. To see this, it
should be remembered that both (+)- and (-)-hairpin
strands, both of which are always present, can be built into
the aggregate side by side. The two strand types are identical except that they have complementary bases in the midposition and in corresponding positions at the ends of the
strands (Figs. 13 and 18). If the ends of a (+)-strand are
occupied by the bases G, for example, then there must be
bases C at the ends of a (-)-strand. The two kinds of
strands could then have different affinities for two kinds of
activated amino acids. At the same time, the two kinds of
strands have different bases in the midposition in the middle of their anticodon triplets. In this way an automatic
correlation would exist between amino acids and bases in
the middle of the anticodon triplet, and as a consequence
of this, a correlation between the sequence of nucleotides
on the collector strand and that of amino acids in the polypeptide. The hairpin strands would serve the function of
adapter molecules.
A possible explanation of how a specific linkage of an
amino acid to a nucleotide could have arisen is illustrated
for (+)-strands at the very bottom of Figure 18. The amino
acid a1 is activated by a purine nucleotide (G) (cf. Scheme
1, step 1). Intercalation and complementary base pairing
would then mediate specific linkage and place the amino
acid in such a position that reaction with the 2’-OH group
of the ribose of the terminal nucleotide of the strand could
occur. Our molecular model shows that the whole sequence of steps described would be sterically quite plausible, permitting reaction with the 2’- as well as the 3’-OH
group of the ribose in question. This would correspond to
the same linkage that is found in today’s charged tRNA.
Amino acid a2 could be activated in a corresponding way
by a pyrimidine nucleotide (C) and then linked to the terminus of a (-)-strand. A specific enrichment in activated
amino acids in special locations of the prebiotic planet
can readily be imagined[*].
Fig. 18. (+)- and (-)-hairpin strands as adapters for amino acids a1 and a2. Note that the two strands are identical except for the nucleotides in the middle and
the ends of each strand. Top (left and right): Attachment of amino acids in highly schematic view. Bottom: possible mode of attachment of an amino acid a1 that
has been activated by a purine nucleotide to the end of a (+)-strand.
The nucleotide sequence on the collector strand changes
slowly with time, and periodically a sequence can occur
that corresponds to a polypeptide exhibiting traces of enzymatic properties, which acts as a primitive “replicase”,
i.e. as an enzyme that accelerates replication and reduces
the number of replication errors. This primitive replicase
would have to decrease the frequency of these errors sufficiently, so that the information coding for it would not be
lost during the number of generations required to fix it by
selection. Without replication, improvements of this kind
would be “forgotten” by the form within a few generations.
A quantitative estimate shows that these conditions can
be achieved even by quite weak enzymatic activity. For example, if the survival rate is increased by just 10% by the
advent of a “replicase” consisting of ten amino acids, a reduction of the replication error rate per base from 1/100 to
1/300 is sufficient for fixation by selection (see [*I, Section
18.1.4.3).
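To give a feeling for the orders of magnitude involved, the following Python sketch compares the chance of copying a coding stretch without any error at the two error rates quoted above. It does not reproduce the selection estimate of the cited section; the assumptions (independent errors at every base, and a message of 30 nucleotides, i.e. one triplet for each of the ten amino acids) are made here for illustration only, and the function name is hypothetical.

```python
def error_free_copy_probability(error_rate_per_base: float, length: int) -> float:
    """Probability that all `length` bases are copied correctly,
    assuming independent errors with the same rate at every base."""
    return (1.0 - error_rate_per_base) ** length


# Assumption: ten amino acids coded by ten triplets, i.e. 30 nucleotides.
MESSAGE_LENGTH = 30

p_without = error_free_copy_probability(1 / 100, MESSAGE_LENGTH)
p_with = error_free_copy_probability(1 / 300, MESSAGE_LENGTH)

print(f"error rate 1/100: {p_without:.3f}")  # about 0.740
print(f"error rate 1/300: {p_with:.3f}")     # about 0.905
```

Under these assumptions roughly one copy in four carries an error in the coding stretch at a rate of 1/100, against roughly one in ten at 1/300.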
It can be imagined that such replicase activity can be
achieved, for example, by a short polypeptide that fits into
a notch of a double helix which exists in the region where
the new strand is formed during strand replication, and
which slips along in the notch, remaining in the region of
replication as the new strand is lengthened[**]. Its presence
would stabilize the double helix conformation during replication and improve the contact between the two strands
in that region. Replication would be accelerated and there
would be fewer errors in base pairing during replication.
If the conditions for the formation of the aggregates described were known, attempts could be made to produce
the postulated polypeptides from activated glycine and alanine or perhaps from two other amino acids. Once the
circumstances of aggregation and template-directed strand
synthesis are understood, they could be used to formulate
an appropriate periodic temperature program that could
be used to direct the reproduction of aggregates, with a
view to the gradual appearance of a replicase.
[*] Aside from chemical differences that may cause one kind of amino acid
to be activated by a purine nucleotide and another by a pyrimidine nucleotide, physical differences between the activated molecules may also exist. It
is conceivable, for example, that there is little difference between the two
modes of activation of a given amino acid at the site at which activation takes
place, but that separation of different species occurs by a physical effect, e.g.
by differences in chromatographic properties. It can readily be imagined that
a mixture of activated amino acids is adsorbed on a substrate such as montmorillonite clay and then eluted by an aqueous solution of inorganic salts,
separate zones being formed with a1-G in one location, a1-C in another, etc.;
the zones serving as physically separated reservoirs of activated species. The
pores in which the crucial action takes place would then be in the vicinity of
suitable reservoirs, so that e.g. a1-C and a2-G would be able to trickle into
the porous region. It is easy to separate nucleotides in cationic as well as anionic exchange columns, using aqueous salt solutions as eluants [32].
[**] Investigations of specific binding of proteins on nucleic acids have shown
that a section in a polypeptide of, say, ten amino acids can form a two-stranded antiparallel β-sheet that fits into the groove of a double helix [43]. It
would be of great interest to search for Gly-Ala-polypeptides with, for example, ten amino acids that facilitate a template-directed polymerization of nucleic acid strands and act as primitive replicases in this fashion.
The attainment of a replicase represents a major breakthrough: the development of machinery for reading and
translating a code. It permits the evolution of systems with
further enzymes. Progress could now proceed at a rapid
pace. Additional bases and positions in the codon triplet
became incorporated into the code and the code translator
became more sophisticated. No difficulties arose in this regard in computer simulations in which codes first for two,
and later for up to six amino acids were allowed to
evolve[44]. Cooperation develops between the different
components inside the envelope, and the whole grows into
a functional cooperative. The evolution of ever more refined replicases permits the genetic transmission of ever
larger amounts of information between parent and
daughter systems.
This, in principle, is the answer to the question posed
originally of how a translation apparatus, that consists of
translation products, could arise. In our model a very simple process is decisive: the astonishingly precise fitting together of building blocks consisting of a nucleation unit, a
collector strand, and hairpin adapter strands.
10. Further Details of the Primordial Translation Process
Even the earliest concept of the present model of a
translation apparatus at first featured the middle position
of the nucleotide triplet used in the attachment of the collector strand as the only position employed for coding. If it
is further assumed that, in the beginning, only the bases G
and C were used in strand nucleotides, then the first and
third bases of the anticodon triplets mentioned would always be G and C, respectively, or reversed C and G (because of the symmetry properties of (+)- and (-)-hairpin
strands discussed earlier). These two positions would thus,
at first only have been used to fix the “reading frame” for
code triplets. The actual code words for amino acids
would then either have been CCG and CGG, or GGC and
GCC. Eigen and Schuster have opted for the second
possibility, since GGC codes for glycine in all biological
organisms, and GCC for alanine, the two most prominent
amino acids of prebiotic chemistry. A reading frame of the
form PuNPy is also favored by Shepherd’s results mentioned in Section 7[40]. After the invasion of the midposition by A and U, the triplets GAC and GUC became available. These triplets code for aspartic acid and valine, respectively, also among the most abundant amino acids in
simulations of prebiotic terrestrial conditions. Aspartic
acid is hydrophilic and it can well be imagined that with its
availability the development of polypeptides with enzymatic activity became possible.
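The four code words named in this section can be collected into a small lookup table. The Python sketch below is illustrative only: the dictionary `PRIMORDIAL_CODE`, the function `translate`, and the example strand are hypothetical names introduced here, while the codon assignments themselves are those of the present-day genetic code quoted in the text.

```python
# Code words of the form Pu-N-Py discussed above (G-N-C with N = G, C, A, U).
PRIMORDIAL_CODE = {
    "GGC": "Gly",  # glycine
    "GCC": "Ala",  # alanine
    "GAC": "Asp",  # aspartic acid, available once A enters the midposition
    "GUC": "Val",  # valine, available once U enters the midposition
}


def translate(collector: str) -> list[str]:
    """Read a collector strand triplet by triplet in the 5'3'-direction."""
    if len(collector) % 3 != 0:
        raise ValueError("strand length must be a multiple of three")
    peptide = []
    for i in range(0, len(collector), 3):
        word = collector[i:i + 3]
        if word not in PRIMORDIAL_CODE:
            raise ValueError(f"{word!r} is not a G-N-C code word")
        peptide.append(PRIMORDIAL_CODE[word])
    return peptide


print(translate("GGCGCCGACGUC"))  # ['Gly', 'Ala', 'Asp', 'Val']
```

Only the midposition carries information in this table; the flanking G and C merely fix the frame, as described for the earliest stage of the model.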
Crick, Brenner, Klug, and Pieczenik have developed a
model of an early primordial translation apparatus that
also consists of a strand of messenger RNA and adapters;
however, they have not indicated how such an apparatus
might have evolved. The difficulty that three base pairs are
insufficient for stable attachment of adapters to a messenger strand is overcome in their model by postulating that
five base pairs are used in the bonding and are, moreover,
involved in a flip-flop mechanism actuated by the state of
bonding into a polypeptide of the attached amino acids.
(This model was discussed and modified by Eigen and
Schuster.) There is no evidence that a similar mechanism
is involved in present-day ribosomes. The difficulty that
three base pairs are insufficient for stable attachment is
overcome in our model by the interlinkage of neighboring
adapters in the picket fence-like aggregate, an interlinkage
that contributes decisively to the stability of this cooperative system of close-fitting components (cf. Section 7).
11. Ancestral and Primordial Sequences
The important results of Eigen and Winkler[41]
on the sequence analysis of known tRNAs were used in Section 7
to support our model considerations. Their analysis established a probable common ancestral sequence of all transfer ribonucleic acids (Fig. 19). Two nucleotides U in positions 60 and 65 and one of the three nucleotides C in positions 61-63 were deleted here to improve the complementary base pairing in the hairpin conformation (Eigen and
Winkler preferred to delete the C triplet, just mentioned, as
having probably been added at a later time). Eigen and
Winkler[41] attempted to derive an even earlier primordial
sequence from this ancestral sequence by assuming that the
template coding for the very first polypeptide was identical
with the original adapter subunit, the primordial transfer
ribonucleic acid. They thus obtained their primordial sequence by changing the ancestral sequence in such a way
that along the entire strand, read in the 5’3’-direction, there
is a sequence of guanine-N-cytosine triplets (where N denotes any one of the four bases). In Figure 19, these triplets
are indicated by brackets; purines (G or A) are shown by
filled, pyrimidines (C or U) by open symbols (cf. Fig. 1).
Positions in which there is agreement between ancestral
and primordial sequences are marked by (+) symbols
while (-) symbols indicate disagreement. No distinction
was made between different purines or between different
pyrimidines.
Fig. 19. Ancestral and primordial tRNA sequences. a) Ancestral sequence derived by Eigen and Winkler from a consideration of known nucleotide sequences of tRNA, shown in hairpin conformation. Three nucleotides, in positions 60, 61 and 65, have been deleted. b) Primordial sequence of Eigen and
Winkler (after [22]). c) Primal tRNA according to Hopfield [48]. Nucleotide
sequence is that of Val-tRNA of E. coli. The primal nucleotide strand contains only half of the present-day tRNA nucleotide sequence, folded in a
manner that brings the 3’-terminus (charged here with an amino acid) into
proximity of the anticodon triplet.
If we only distinguish between purines and pyrimidines,
we find 27 agreements (at 15 “first” and 12 “third” locations) and 19 disagreements, a difference that is considered
to be relevant to Eigen’s model. In our model a more accidental sequence of the complementary nucleotides along
the “legs” of the hairpin is expected, so that the deviation
from the average value should be close to that expected for
a statistical fluctuation. For a random sequence, agreement
would be expected in 23 cases with a standard deviation of
$\pm\sqrt{46 \cdot \tfrac{1}{2} \cdot \tfrac{1}{2}} = \pm 3.4$. The value of 27, therefore, does not deviate
significantly from the average value. For a random distribution, the chance of finding the observed or a larger deviation from the mean is 30%.
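The figures quoted for the random case (mean 23, standard deviation 3.4, and a chance of about 30% for a deviation at least as large as the observed one) can be checked with a short calculation. The Python sketch below assumes, as the text does, that each of the 46 compared positions agrees with probability 1/2 independently of the others; using the exact binomial distribution for the tail is a choice made here and is not stated in the source.

```python
from math import comb, sqrt

N_POSITIONS = 46   # 27 agreements + 19 disagreements
P_AGREE = 0.5      # assumption: purine/pyrimidine agreement by chance
OBSERVED = 27

mean = N_POSITIONS * P_AGREE
std_dev = sqrt(N_POSITIONS * P_AGREE * (1 - P_AGREE))

# Two-sided tail: probability of a deviation from the mean at least
# as large as the observed one, summed over the exact binomial weights.
deviation = abs(OBSERVED - mean)
tail = sum(
    comb(N_POSITIONS, k)
    for k in range(N_POSITIONS + 1)
    if abs(k - mean) >= deviation
) / 2 ** N_POSITIONS

print(f"mean = {mean:.0f}, standard deviation = {std_dev:.1f}")
print(f"P(deviation >= {deviation:.0f}) = {tail:.2f}")
```

The observed excess of four agreements lies only slightly more than one standard deviation from the mean, which is why it carries little statistical weight.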
There might nevertheless be a preference for the sequence PuNPy (where Pu = purine base and Py = pyrimidine base). To investigate this possibility, the frequencies
with which the sequences PuNPy, PuNPu, PyNPy, and
PyNPu should occur in a random arrangement, in which
15 of the 23 “first” locations are occupied by Pu and 12
“third” locations by Py, are calculated. The probability of
the sequence PuNPy is then 15 × 12/23² and those of the
other sequences 15 × 11/23², 8 × 12/23², and 8 × 11/23², respectively.
The situation, for all of the 23 triplets, is therefore as
follows:
                     PuNPy    PuNPu    PyNPy    PyNPu
from Fig. 19a, b:      9        6        3        5
statistically:       8 ± 2    7 ± 2    4 ± 2    4 ± 2
Again there is no statistically significant difference between the experimental values and those expected in a random situation.
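The expected numbers in this comparison follow directly from the occupancies of the “first” and “third” locations. The Python sketch below recomputes them; treating the two locations as independent is the assumption stated in the text, whereas the binomial standard deviation and the variable names are additions made here for illustration.

```python
from math import sqrt

N_TRIPLETS = 23
first = {"Pu": 15, "Py": 8}    # occupancy of the "first" locations
third = {"Py": 12, "Pu": 11}   # occupancy of the "third" locations
observed = {"PuNPy": 9, "PuNPu": 6, "PyNPy": 3, "PyNPu": 5}

expected = {}
spread = {}
for a, n_first in first.items():
    for b, n_third in third.items():
        prob = (n_first / N_TRIPLETS) * (n_third / N_TRIPLETS)
        expected[f"{a}N{b}"] = N_TRIPLETS * prob
        # Binomial standard deviation for 23 independent triplets (assumption).
        spread[f"{a}N{b}"] = sqrt(N_TRIPLETS * prob * (1 - prob))

for name in observed:
    print(f"{name}: observed {observed[name]}, "
          f"expected {expected[name]:.1f} +/- {spread[name]:.1f}")
```

Every observed count lies within one standard deviation of its expected value, in line with the conclusion drawn in the text.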
A comparison of the sequences of 20 known tRNAs of
E. coli that code for 14 amino acids has led Hopfield[48]
to another model for primal tRNA. It rests on the assumption that the strand conformation was such that the anticodon and the amino acid of the charged primal tRNA were
in direct contact (Fig. 19c). The nucleotide strand of the
proposed model only contains the portion of today’s
tRNA that runs from its acceptor terminus (to which an
amino acid can be bonded) to the anticodon triplet. The
conformation of the proposed primal tRNA would therefore be different from that of its supposed descendant and
the nucleotides shown in juxtaposition in Figure 19c
would nowadays be quite distant from each other. Hopfield
derives support for his model from a statistical analysis of
the nucleotide pairs in positions 1a, 1b to 6a, 6b. If these
pairs were to exhibit more than a priori complementarity,
this would constitute evidence in favor of the posited model.
For a random occupation of the six pairs of positions
considered by the four kinds of nucleotides one would expect an average of 6 × 1/4 = 1.5 complementary base pairs;
while observed values run as high as 5 in the example
shown in Figure 19c, the average over the 20 tRNAs is 2.5.
While Hopfield argues from a number of considerations,
that the difference between the values of 2.5 and 1.5 is statistically significant, it appears that a comparison with a
random sequence of nucleotides is not pertinent, because
the 20 observed sequences are far from random, as can be
seen from columns 2 through 5 of Table 1 based on the cob
Table I : Analysis of 20 tRNAs of E. coli [a].
Position
A
Number of tRNAs with
C
U
G
Number of base pairs
observed
statistical
la
Ib
20
12
2
6
0
6
Za
2b
3
20
3
4
10
10
3a
3b
0
11
1
8
'
4a
4b
10
I
8
2
6
5a
3
5
14
Sb
3
2
0
10
6a
6b
2
I
11
0
I
7
16
7a
7b
2
3
10
5
3
5
8a
8b
4
4
10
0
3
4
3
12
9a
9b
0
3
I1
1
3
5
11
1Oa
lob
4
0
7
6
2
12
1 la
Ilb
2
2
13
16
5
1
12a
12b
0
3
6
4
9
2
20
3
3
2
6.3k2.1
8.1f2.2
lo
9.6k2.2
5.7 f 2.0
6
7.4 rf 2.2
5
7
I
S
6.8 f 2 . 1
5.1 k2.0
1.3fI.I
5
11
5.7
* 2.0
bly from the average value of 20/4 = 5. It is evident that the
details of these occupancies are related to different nucleotide functions in the positions in question. What are the
probabilities of finding complementary nucleotides in the
pairs of positions of interest? From Table I the probabilities of finding nucleotides A, C, U and G in position 6a are
2/20, 11/20, 0, and 7/20 respectively, while for 6b the
probabilities are 1/20,2/20, 1/20, and 16/20, respectively.
The probability of finding complementarity is the sum of
the probabilities for the base pairs AU, UA, GC, and
CG:
If an event has an a priori probability p =0.48, then the result from 20 attempts is approximately Np = 20 x 0.48 =
9.6, with a standard deviation k d
m = +- 2.2.
The expectation value for the number of Complementary
base pairs is therefore 9.6k2.2. The statistical value is
close to the observed value of ten base pair complementarities. Analogous conclusions apply to positions 5a and 5b
as well as 4a and 4b (columns 6 and 7 of Table 1). If there
had been base complementarity in primordial tRNA in the
region concerned, statistical values significantly higher
than those observed would be expected. This is not the
case"].
While Hopfield limited his statistical considerations to
the pairs of positions 1 to 6, it is of interest to include the
additional pairs 7 to 12. Again approximate agreement is
found between the numbers of observed complementary
base pairs and the numbers based on the statistical analysis
just described (Table 1, columns 6 and 7). There is, however, one exception: the pair of positions 12a, 12b, for which
the number of observed complementary pairs is eleven, almost double the statistical value. The reason for this discrepancy is that in eight of the eleven present-day tRNAs
involved, positions 12a and 12b are actually paired, forming the beginning of loop I. Only the pairs 7a, 7b to 11a,
11b are therefore significant in this context. If Hopfield's
model were correct, significantly more complementary
base pairs would be expected in these positions than corresponds to the calculated statistical values (Table 1, columns 6 and 7). In fact, for these five position pairs the
number of observed complementary pairs is even smaller
than would be expected for an equal a priori distribution of
the four kinds of nucleotides over the ten positions. From
Table 1 the total of observed complementary pairs is 23,
giving an average of 23/20 = 1.15 per sequence, while the expected average number of complementary base pairs is 5 × 1/4 = 1.25.
In summary, Hopfield's model cannot be supported by
means of the observed complementarities.
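The comparison for the five position pairs 7a, 7b to 11a, 11b can be checked in a few lines. This is a minimal sketch that assumes, as the text does, an equal a priori distribution of the four nucleotides and 20 sequences; all names are illustrative.

```python
from itertools import product

BASES = "ACUG"
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Fraction of the 16 equally likely base combinations that are complementary.
p_uniform = sum(
    1 for x, y in product(BASES, repeat=2) if COMPLEMENT[x] == y
) / len(BASES) ** 2

n_position_pairs = 5   # pairs 7a,7b to 11a,11b
observed_total = 23    # observed complementary pairs quoted in the text
n_sequences = 20

expected_per_sequence = n_position_pairs * p_uniform
observed_per_sequence = observed_total / n_sequences
print(f"expected {expected_per_sequence:.2f}, observed {observed_per_sequence:.2f}")
```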
[a] In Met-tRNA the base in position 10b is unknown, so that for the pair of
positions 10, the calculations include 19 rather than 20 cases. For purposes of
calculation the rare bases pseudouridine and 3-(3-amino-3-carboxypropyl)uridine were counted as U and A"-methylguanine as G.
lection of nucleotide sequences by Barrell and
Thus, the three nucleotides at the 3'-terminus are always
ACC, and most other values in the table differ considerably.
[*] The statistical test used here cannot be applied to the pairs of positions 1a
to 3b, because positions 1a, 2a and 3a are always occupied by the same respective nucleotides in the sequence ACC. If Hopfield's model were correct,
one would expect positions 1b to 3b to be preferentially occupied by the sequence UGG. However, in position 1b, for example, A is found twice as
often as U.
12. DNA Takes Over the Storage of
Genetic Information
The example of the emergence of a translation apparatus illustrates the method of procedure in our model. Further important steps are considered here in a more cursory
fashion. A question that is initially extremely puzzling is
how and when the translation apparatus described was restructured into the genetic apparatus of today’s biological
systems, which have a completely different organizational
structure. The methodological program used previously
leads, however, to the view that a restructuring of precisely
this kind almost certainly had to occur.
Systems containing translation devices invaded regions
of ever larger pores and, related to these events, increased
in complexity and sophistication. However, the achievable
degree of complexity is limited at this level by congestion.
That is, while (+)-collector strands contain the codes for
useful polypeptides, the corresponding (-)-strands are
either useless for the synthesis of polypeptides or, even
worse, effect the production of nonsense polypeptides that
add to the confusion inside the envelope and use up amino
acids. These complications increase with the sophistication
and variety of devices that synthesize different polypeptides. An individual can therefore manufacture only a limited number of different enzymes, and a quiescent phase of
evolution is thus reached.
A possible way out is the following: The “replicase” can
sustain minor changes because of erroneous base couplings, and eventually “replicases” with slightly different
properties can evolve in the same envelope. Let such an
enzyme E1 be one that favors, to a small degree, mononucleotides containing deoxyribose rather than ribose in the
synthesis of strands and E2 an enzyme that in turn favors
such deoxyribose-rich strands as templates in the synthesis
of strands rich in ribose. By chance let enzyme E2 be
slightly more efficient than E1 (Fig. 20). The E1-catalyzed
Fig. 20. Emergence of deoxyribose-rich (D-)-strands. Short deoxyribose-rich
(-)-strands (D-) are synthesized on ribose-rich (+)-strands (R+), catalyzed
by enzyme E1, while E2 catalyzes the synthesis of R+ on D-. Enzyme E2 is
more efficient than E1.
replication of a (+)-strand of RNA (denoted by R+) then
yields a deoxyribose-rich (-)-strand (denoted by D-) that
in turn can serve as template for the rapid production of a
number of copies R+, assisted by the efficient enzyme E2.
The strands R+ can convolute into hairpins and aggregate
into enzyme-synthesizing devices. Thus, a few D- strands
now produce many R+ strands. Directional selection occurs, favoring genetic machinery in which mechanisms for
replication and enzyme synthesis become separate. The
difficulty that arose through the accumulation of unproductive strands and of nonsense polypeptides is thus
solved by a reorganization of the system, and the appearance of a third sort of polymers rich in deoxyribose nucleotides.
Fig. 21. Organizational structure of the processes that permit the separation
of the machinery for replication and enzyme synthesis. a) D- becomes
longer and serves as template for short strands R1+, R2+, ...; α) schematic representation of the catalytic action of enzymes E1 and E2; β) strands R1+ and
R2+ serve as templates in the formation of D-, catalyzed by enzyme E1. The
stretch NNNN is formed without the benefit of a template; γ) formation of
R1+ with D- serving as template, catalyzed by enzyme E2. The NNNN portions of the D- strand are not replicated but serve as recognition regions
where replication is to begin and end. b) Enzyme E3 catalyzes the replication of D+ on D- strands and vice versa; α) schematic representation of
the catalytic action of the enzymes E1, E2, and E3; β) formation of D+ with
D- acting as template strand, catalyzed by enzyme E3. The portion
N’N’N’N’ on D+ is formed by replication of the analogous portion NNNN
on D-. c) Enzyme E1 is no longer needed; schematic representation of the
catalytic actions of the enzymes E2 and E3.
The cooperation of enzymes E1 and E2 permits, for a
while, growth of the functional cooperative by the emergence of additional enzymes. However, linked to the increase in complexity of this macromolecular society, there
is an increase in organizational problems, since the blueprints of the organisms are distributed over many R+
strands (R1+, R2+, ...). The difficulty can be overcome by
combining the complementary strands D1-, D2-, ... into
fewer longer strands and eventually into just one long D- strand (Fig. 21a, α) that then serves as template for new
strands R1+, R2+, ... . Apart from the decrease in organizational problems thus brought about, there is also less tendency for a long strand to escape by diffusion.
The process of lengthening the D- strands would be
aided by an enzyme acting as “ligase”. It can be imagined
that enzyme E1, slipping along as replication proceeds,
protrudes from its slot as the end of the strand is reached.
The protruding end may then assist in the occasional attachment of a second strand that happens to be nearby.
The result is a D- strand that is lengthened by the new
piece and contains the information of the two R+
strands, separated by a short piece of strand NNNN that
was polymerized template-free (Fig. 21a, β)[*].
Enzyme E2, involved in the synthesis of the short strands
R1+, R2+, ..., must contain recognition sites for the locations
along the D- strand at which replication of R+ strands
must start and end. Perhaps the short pieces along D- that
arose through non-template polymerization can act in this
special way, as recognition regions (Fig. 21a, γ).
A further necessary step is the modification of an earlier
“replicase” into an enzyme E3 (Fig. 21b, α) that catalyzes
the formation of D+ strands from D- templates and the
converse. Since only D- should be used as template for
producing strands R1+, R2+, ..., enzyme E2 should be able to
interact with the recognition sites of D- only, which
would therefore have to be special. They might, e.g., consist
of the sequence NNNN, to which the “non-recognition”
piece N’N’N’N’ on D+ would correspond, where N’ is the
nucleotide complementary to N (Fig. 21b, β).
Such a system of enzymes would confer great selective
advantages and initiate the evolution of machinery for
replication operating with increased efficiency and accuracy. The mechanism would lead to a situation in which
DNA strands became the carriers of information. E1 would
degenerate and E2 evolve into “transcriptase” (Fig. 21c).
(Since reverse transcriptase is endowed with the same function as E1, it is conceivable that it evolved from such a precursor.)
At an early stage, another apparatus must have evolved
that improved the contact between the collector strand and
the hairpin adapters, suppressing “reading” errors. This
surmise is strongly suggested by the fact that today’s ribosomes are the only functional cell elements (enzyme systems) that are largely composed of nucleic acids and may
therefore have evolved from such an apparatus.
[*] It is known that under certain conditions Qβ replicase can cause the template-free replication of strands [50].
At this stage all essential elements of the present-day genetic machinery, as known from molecular biology, have
been accounted for and their evolution appears to be a necessity.
13. Exchange of Genetic Information
As the systems considered grew ever more complex, the
information passed from one generation to the next increased. Let W be the probability that during replication
an error occurs in the choice of a new base, so that the
base incorporated into the daughter strand is not complementary to the corresponding base on the template strand.
This probability must consequently decrease as the number, N_total,
of monomers involved in information storage
and transfer increases, and it can be shown that the relation N_total W = 1 must approximately be satisfied (cf. [’],
Section 18.1.4.7). This condition theoretically assures that
there is a sufficient number of error-free copies in each
generation so the form does not become extinct. There is
also some experimental evidence for this approximate relationship. It has, for example, been found by Weissmann et al.
that W ≈ 1/3000 for Qβ bacteriophage, for which the
number of nucleotides is 4500.
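The relation can be made tangible with a small numerical sketch. It rests on an assumption that is not spelled out in the text: if errors at different positions are independent, the fraction of error-free copies of a strand of N_total monomers is (1 - W)^N_total, which is close to exp(-N_total W). The function name is illustrative; the Qβ figures are those quoted above.

```python
def error_free_fraction(n_total, w):
    """Fraction of copies without any error, assuming independent errors
    with probability w at each of n_total positions (an assumption here)."""
    return (1.0 - w) ** n_total


# Values quoted in the text for Qβ bacteriophage.
n_qbeta = 4500
w_qbeta = 1 / 3000

product_nw = n_qbeta * w_qbeta
fraction = error_free_fraction(n_qbeta, w_qbeta)
print(f"N_total * W = {product_nw:.2f}, error-free fraction = {fraction:.3f}")

# For comparison: exactly at N_total * W = 1 roughly a third of copies are error-free.
print(f"at N_total * W = 1: {error_free_fraction(1000, 1 / 1000):.3f}")
```

Under this assumption the error-free fraction falls off exponentially once N_total W grows well beyond 1, which is one way of reading why the product must stay of order unity.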
As N_total increases, W must decrease, e.g. by the appearance of improved “replicases”. However, a new difficulty
arises when N_total reaches approximately 10^6, caused by the requirement that there must be a certain minimum error frequency to permit the continuous
adaptation of the systems to an ever-changing environment. The required
approximate value of W is 10^-6.
Once the corresponding
approximate value of N_total = 10^6 has been reached, it becomes impossible to increase N_total further without violating one of these two limiting conditions (Fig. 22). A phase
of stagnation is reached in the model, and a new fundamental organizational change, a new breakthrough, is required to overcome this barrier.
Fig. 22. Conditions for W. The value of 10^6 on the horizontal axis represents
an upper limit for N_total. The gray region is in violation of the two boundary conditions described in the text.
In the model, this breakthrough consists in the exchange
of genetic information between individuals. An evolution
of sexual machinery that permits the mutual exchange of
pieces of strand takes place, i.e. there is recombination of
genetic material that may confer advantages upon the recipients.
The approximate requirement that W = 10^-6 is
thereby relaxed, and N_total can grow, as long as accidental
improvements in the replication machinery lower W to a
level at which the condition N_total W = 1 is again satisfied.
The gain in flexibility through recombination must, of
course, outweigh the loss caused by the decline in the replication error rate. An improvement of the mechanism for
the recombination of genetic material, a decrease in W,
and an increase in N_total are thus strongly coupled to each
other.
The stage of information under discussion corresponds
approximately to that of a bacterium, for which the total
number of nucleotides in the DNA lies between 3 × 10^6
and 6 × 10^6, only a fraction of which carry information
coding for proteins, i.e. constitute N_total. The portion of
DNA that carries such information is probably quite large
in bacteria, so that a value of about 10^6 for N_total would
seem justified. For E. coli it is found experimentally[52] that
W = 10^-8; indeed E. coli sometimes exchanges genetic material by conjugation.
The interchange of genetic material strongly accelerates
the accumulation of genetic information and therefore determines the direction of further evolution. At the stage
considered, a division into organisms that remained simple
and others that became more and more complex must have
occurred. Simplicity of organization (non-development of
sexual machinery or limitation to very primitive such machinery) yields selective advantage in some ecological
niches; the organisms there remained at the stage of procaryotes.
A decisive improvement in the sexual machinery presupposes an increase in complexity. The increase in sophistication caused organizational problems that could only be
solved by the subdivision of the cell interior, hence improving the regulation of molecular traffic. The cell architecture became more complicated, and eucaryotes evolved. In
regions in which the procurement of nutrients required the
development of more and more complicated devices, only
further structuring by multicellularity was left as a means
of escape into new living spaces.
The question of the emergence of primitive sexual mechanisms, a necessity in the model considered, must not be
confused with the question, commonly discussed by evolutionary ecologists, of the present-day selective forces maintaining sexual reproduction. Asexual reproduction must
have repeatedly arisen in different evolutionary lineages as
a derivation of primitive sexual reproduction.
14. Thermal Barriers to Information Storage and
Transfer
It is not possible, however, to increase N_total beyond a
certain limit, because new difficulties arise in the model.
The new barrier arises because errors caused by ever-present thermal fluctuations cannot be avoided, a fact that implies
the existence of a minimum value below which W
cannot be decreased.
This barrier can only be hurdled once machinery is
available that uses larger than molecularly dimensioned
symbols and in this manner permits the storage of much
greater quantities of information than is possible by the genetic apparatus. Such machinery is available in written
language, and more recently, in computers. Information in
artificial memories, such as instructions for the manufacture of some device, is transmitted over many generations
and modified and supplemented by material that is subject
to a selection process. The revolutionary breakthrough of
artificial storage systems is comparable to the revolution
associated with the separation of the machinery for replication from that for protein synthesis, during which DNA
became the receptacle of genetic information. As then, a
change in the system of information transmission led to a
huge increase in the information that could be transferred
from one generation to the next.
Fig. 23. Increased complexity of information storage led to increased independence from specific environments. [Diagram running from “Highly specific environment” to “Liberation from environment”. Stage labels: replicating strands; aggregates; obstructions and envelopes; primitive enzymes; sophisticated enzymes; recombination and sex; primitive conceptual structure; sophisticated conceptual structures; artificial memory. Side labels: first kind of polymer (RNA); second kind of polymer (polypeptides); third kind of polymer (DNA). Approximate values of W and N_total are marked at several stages.]
Some of the breakthrough phases described are summarized in Figure 23. This depicts, in our view, the main line of
evolution: a sequence of many minute steps that lead to
an ever-increasing liberation from an environment that
was, in the beginning, very special. This development is related to a large increase in complexity, of which N_total is a
measure, and to a decrease in the probability W of replication errors. At first only one kind of polymer, ribonucleic
acid, was of importance. Replication set in, followed by
the formation of aggregates. A second kind of polymer, the
polypeptides, became important. Envelopes developed
and a genetic translation apparatus evolved. A third kind
of polymer, deoxyribonucleic acid, permitted the complete reorganization of the system of information transfer.
A refined translation apparatus evolved and sexuality became important. Finally, artificial memory systems became
carriers of information.
While individual steps can always be replaced by others
that are somewhat different, without thereby destroying
the logical connections between the steps, the fundamental
changes in the system structure of the explored model
seem to be fixed necessities. Under appropriate circumstances, a system appears that is capable of learning (Fig.
24, step 1). The complexity increases but stagnation sets in,
because of the accumulation of replication errors. To overcome it, machinery for the discarding of erroneous copies
is needed and is achievable (step 2). The complexity of the
systems again increases, until stagnation occurs, because
they are tied to pre-existing compartmentation. To conquer
this barrier, machinery is needed and achievable that permits a containment of building blocks that is independent
of external structure (step 3). Given appropriate conditions
this machinery develops, by necessity, into machinery for
code translation and machinery that permits the preservation of information needed to form translation products
(step 4). This development in turn allows an increase in
complexity until stagnation sets in at a certain level, because of the accumulation of nonsense products. The conquest of this hurdle (step 5) requires machinery that permits the bypassing of meaningless production and can be
achieved by reorganization of the systems. Further evolution leads to increasingly complex, and thus ever more delicate systems, ending again in stagnation because of inadequate adaptability of the systems. The barrier can again
be hurdled by a fundamental reorganization of the systems
(step 6). This change permits the achievement of a level of
complexity that cannot be raised further because of thermal noise, until another fundamental restructuring of the
organization permits a conquest of this barrier (step 7).
Fig. 24. Organizational framework of the evolution process. Logical requirements (boxes) and their realizations. [Legible labels: replication of short molecular strands; aggregation of favorably folded molecular strands; preservation of information needed to form translation products; envelope molecule acts as replicase; separation of machinery for replication and for translation; sexuality; artificial memories.]
15. Knowledge as a Measure of the Usefulness of
Information
The process of evolution can be described as a phenomenon that, at some place and at some time in the development
of any suitable planet (Fig. 25), starts with a bang. All
of a sudden a strand arises that can replicate, thereby
creating a system that adapts itself again and again to a
changing environment, in many steps of replication and selection. This process of evolution proceeds in jumps:
Nothing of importance happens for long periods, until a
reorganization of the system ushers in a rapid new development. In a corresponding way there are discontinuous
increases in the contents of the message of which the systems are carriers. An important function that is a measure
of and increases with the degree of the evolution of the
systems will be termed the knowledge of the systems. We
roughly define this function as the information (measured
in bits or yes-no decisions) contained in the totality of the
blueprints that had to be rejected by necessity until the
stage of evolution considered was reached (see [53] and [’],
Section 18.1.5). Knowledge is a measure of the usefulness
of the information accumulated during the course of evolution. (Shannon's well-known measure for information[54]
represents the amount of information, the number of bits,
and not its usefulness.)
Fig. 25. Emergence of life. Learning systems suddenly appear, without prior
traces of this new quality. Knowledge of evolving systems increases discontinuously. [Plot of knowledge versus time; marked on the time axis are the appearance of a suitable planet and the appearance of systems capable of adaptation to the environment.]
16. Comparison with Another Approach
Our model differs in a cardinal point from an often-presented viewpoint (adhered to, for example, by Prigogine[55],
Eigen[56], and others["]) in
which it is asserted that a
model for the origin of life can be described in terms of the
spontaneous formation of structure in a solution far from
equilibrium, to which suitable reactants are supplied and
reaction products removed in a steady-state situation.
In Eigen's view a translation apparatus, that is, an apparatus in which tRNA molecules serving as adapters can attach themselves to a
strand of messenger RNA, can spontaneously appear in a suitable solution of nucleotides, the result being the synthesis of polypeptides that can act as “replicase”. Here, this apparatus must form before an effective mechanism for the removal of replication errors has
become available. According to Eigen and Schuster (see [4’], Section XV therein), such a filtering mechanism is only
formed later by the cyclic coupling of two or more reaction
cycles. For example, in the case of two cycles 1 and 2 that
produce the replicases R1 and R2, respectively, cycle 1 is
catalyzed by R2 and cycle 2 by R1, so that cooperation between the two cycles exists in this way. The resulting hypercycle then dominates the solution.
In our opinion, the fundamental enigma is the mechanism that caused the emergence of a translation apparatus.
The degree of intricacy is such that a device of this kind
could only have developed after the evolution of an effective mechanism for the removal of replication errors.
The only conceivable error-filtering mechanism (rejection of erroneous copies in the formation of an aggregate)
requires pre-existing external spatial and temporal structure as sine qua non for the coming together of convoluted
strands-for their aggregation, for the disassemblage of aggregates and replication of aggregate components, and for
the correct reconstruction of functionally new, and at times
improved, aggregates.
Our conception is in conflict with the view that a fundamental event in the origin of life was the spontaneous appearance of structure, caused by an inner instability (an
occurrence that is the basis of many natural phenomena).
Instead, we consider the question of how life originated
primarily as a problem of finding the logical framework
and organizational structure of evolutionary processes: By
what principles can systems emerge that are capable of
learning? What fundamental barriers confront such systems that can learn once they exist? What fundamental
possibilities exist to overcome these barriers? Secondly, we
consider it as a problem of physical chemistry: What fundamental possibilities satisfying the organizational requirements exist in physical chemical models, that is, by remaining within the framework of physical and chemical
laws? Given such a starting position, questions about the
thermodynamic conditions for the appearance of dissipative structures in a homogeneous solution, in a stationary
state far removed from equilibrium, do not arise. Instead,
pre-existing spatial and temporal structure is recognized as
a fundamental requirement, and therefore a homogeneous,
stationary system cannot serve as a suitable starting point.
Thus, one does not ask for the conditions for the existence
of a cyclic reaction sequence that must be satisfied everywhere in the solution, but for conditions that must prevail
in a specific location so that aggregates consisting of a
small number of macromolecules as components can appear and multiply; such aggregates have the qualities of a
simplest translation apparatus, and can develop, by mutation and selection, ever more complex capabilities in their
interaction with a multifaceted environment.
With this initial position, the appearance of a translation
apparatus constitutes the decisive breakthrough: A mechanism for the production of ever more complex enzymes
has become available. A cyclic coupling of systems that
produce replicases contributes nothing at this level. Cooperation arose earlier in our model, with the formation of
aggregates, while in Eigen’s approach the first appearance
of cooperation is the mutual interaction of cycles that produce replicases. As we have seen, a certain cyclic coupling
between replicase-producing systems becomes important,
in our model, at a later stage of evolution when deoxyribonucleic acid began to play a role. The type of coupling developing at that time is different from the one involved in
Eigen’s hypercycle.
According to Eigen, hypercycles are the only possible
way of surmounting the informational crisis that arises
through the proliferation of replication errors, because
they simultaneously satisfy the following three conditions[~~]:
1) Each replicative unit must selectively maintain its information content in competition with its own error distribution.
2) Competition between replicative units that belong to
the same functional cooperative must cease to operate.
3) The functional unit as an entity must be capable of
competition against alternative units.
These three conditions are, however, also satisfied by
the systems that we have considered, i.e. molecular
strands that form aggregates (condition 1, because erroneous copies are rejected by the aggregate; condition 2, because of cooperation among the components of the aggregate; and condition 3, because of the special survival properties of the aggregate as a whole). On the other hand, this
system does not constitute a hypercycle. In the case of a
single kind of strand (for example, functionally equivalent
(+)- and (-)-hairpin strands), autocatalytic replication of
a kind of strand simply occurs, while aggregation serves to
increase its viability.
The cardinal problem concerning the origin of life is
how one can imagine the development of the simplest systems that could adapt to their environment, that is, systems consisting of a few macromolecules capable of mutual cooperation. An essential requirement for this process is the existence of a spatially and temporally structured environment, which is needed to keep the potential building
blocks of such systems from diffusing away from the region of importance and to drive the replication, assemblage, and disassemblage of the aggregates. The presence
of structure is thus the condition for the appearance of life
by molecular self-organization[*], that is, for the build-up
of the information content of a learning system. How such
temporal and spatial structure arose on the primordial
planet is a question to be answered by planetary science.
17. Answers to Frequent Questions and Questionable
Statements Concerning the Origin of Life
Questions and erroneous statements one often hears are
the following:
[*I In our usage of the concept of self-organization, we exclude extra-physical influences.
1) The geological time span available for the appearance of
even the simplest forms of life was insufficient, thus requiring seeding of the planet or extra-physical phenomena
to explain the emergence of life.
Evolution up to the period at which genetic machinery
appeared must have occurred rapidly, because of the large
error frequency that dominated the early stages. It was
most rapid at the beginning, until the frequency of errors
was enormously decreased by the formation of aggregates.
The development of a genetic apparatus again enormously
decreased the error frequency in the correct pairing of
bases. The time requirement for all steps up to the development of the genetic apparatus must therefore be small
compared to the time required for the instruction of the
about 1000 proteins of a bacterium. We can estimate the
latter time using the assumption that proteins are inserted
consecutively into the functional cooperative of the form
existing at the time, and through this the DNA strand is
lengthened by 10^3 nucleotides for each new protein.
The nucleotide sequence on the newly added piece of
strand changes by random errors in the matching of bases,
and in this way the protein adjusts to its appropriate function. For simplicity we assume that each protein is instructed by approximately 100 optimization steps, and that
between each of these steps the situation must be awaited
in which a random distribution of the occupancies of the
non-instructed locations of amino acids produces yet another step in the optimization. This implies 1/W = 10^6 generations per step (one base change has occurred on the average in each position during this period) or 10^6 × 10^2 generations per protein, and therefore a total of 10^6 × 10^2 = 10^8
generations. Following this, the DNA strand is again
lengthened by a piece of 10^3 nucleotides and the process
repeated. The 10^3 proteins therefore require about
10^8 × 10^3 = 10^11 generations. This amounts to approximately 10^8 years, assuming one day per generation, compared to the approximately 10^9 years available for the
process in geological history.
This estimate is made only to answer the
question posed and is not intended to give accurate information about the time needed for the evolution of a bacterium. It may well be an overestimate by an order of magnitude, because all proteins appear to consist of about 100
domains that were independently instructed and then developed further by gene duplication[59].
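The order-of-magnitude estimate above can be retraced step by step. The sketch below uses only the assumptions stated in the text (1/W = 10^6 generations per optimization step, about 100 steps per protein, about 1000 proteins, one day per generation); the variable names are illustrative, and the year length of 365.25 days is an added convention.

```python
# Assumptions taken from the text.
generations_per_step = 10**6      # 1/W with W = 10^-6
steps_per_protein = 100           # optimization steps per protein
n_proteins = 1000                 # proteins of a bacterium
days_per_generation = 1

generations_per_protein = generations_per_step * steps_per_protein
total_generations = generations_per_protein * n_proteins

# Conversion to years; 365.25 days per year is a convention chosen here.
years = total_generations * days_per_generation / 365.25

print(f"generations per protein: {generations_per_protein:.0e}")
print(f"total generations:       {total_generations:.0e}")
print(f"time required:           {years:.1e} years")
```

The result, a few times 10^8 years, is of the order quoted in the text and lies below the roughly 10^9 years available.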
2) The probability of the spontaneous emergence of the simplest systems that could develop into living organisms is
much too small for a physical explanation of this phenomenon.
Here, it is important to consider the probabilities of very
many minute and detailed steps. For a small number of
larger steps the probabilities soon become infinitesimally
small. This can even be seen in the simple case of the spontaneous appearance of the first self-replicating strand, consisting of monomers that became accidentally linked.
Each monomer must contain the correct sugar, correctly
linked to a nucleotide base and to a phosphate group.
The probability of this is about 1/100, or about
(1/100)^10 = 10^-20
for a strand of ten monomers. This value
is within acceptable limits, as indicated in the earlier discussion. However, for a strand of 50 members the corresponding
probability is (1/100)^50 = 10^-100, a value so small that it
would take an entire universe filled with strands, at a density of one strand per nm^3, for just one of them to
be expected to be accidentally correct. That is, for all practical purposes, the probability that such a strand could have
formed spontaneously is zero. Our considerations
imply that in general relatively large steps do not occur.
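How quickly the probability collapses with strand length can be seen from a short calculation. This sketch takes the per-monomer probability of about 1/100 from the text and treats the monomers as independent; exact fractions are used so that the very small numbers are not rounded, and the helper name is illustrative.

```python
from fractions import Fraction

# Probability that one monomer is correctly constituted (value from the text).
p_monomer = Fraction(1, 100)


def strand_probability(n_monomers):
    """Probability that all n independent monomers are correct."""
    return p_monomer ** n_monomers


for n in (10, 50):
    p = strand_probability(n)
    # The denominator is a power of ten; its exponent is the number of zeros.
    exponent = len(str(p.denominator)) - 1
    print(f"{n} monomers: probability = 10^-{exponent}")
```

A fivefold increase in length thus lowers the probability by eighty orders of magnitude, which illustrates why the model relies on many small steps.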
3) Biological systems behave in a holistic, purpose-oriented
fashion that is in conflict with the way in which systems
describable by physical laws behave.
In fact, no conflict exists with the laws of physics, because the described behavior is the result of the survival of
the systems considered in an environment in which survival may be difficult, while a large number of slightly less
suitable systems were rejected. It is the result of the well-known mechanism of evolution: multiplication, mutation,
and selection.
Replication errors are disadvantageous in most cases,
but lead in rare cases to an improvement of the survival
chances of the changed form. Those forms that are better
adapted to the surroundings remain. Biological evolution
leads in this way to an ever improved adaptation to the
surroundings. It represents a learning process that extends
over great numbers of generations. Adaptation and learning, however, are in essence holistic, purpose-directed
processes that are therefore seen to be derivable from physical laws.
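The cycle of multiplication, mutation, and selection can be illustrated with a toy simulation. This is only a sketch under strong assumptions that are not taken from the article: "adaptation" is measured as the number of positions at which a sequence matches an arbitrary target sequence, every strand is copied a fixed number of times with a fixed per-position error rate, and only the best-adapted strands survive each generation. All names and parameter values are hypothetical.

```python
import random

ALPHABET = "ACGU"
# Hypothetical "best adapted" sequence, chosen only for illustration.
TARGET = "ACGUACGUAC"


def fitness(seq):
    """Number of positions at which seq matches TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET))


def replicate(seq, error_rate, rng):
    """Copy seq; each position is replaced by a random letter with
    probability error_rate (the random letter may equal the original)."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < error_rate else ch
        for ch in seq
    )


def evolve(generations=200, population_size=50, copies=4,
           error_rate=0.02, seed=0):
    """Return the best fitness found in each generation."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in TARGET)
        for _ in range(population_size)
    ]
    history = []
    for _ in range(generations):
        # Multiplication with occasional copying errors (mutation).
        offspring = [
            replicate(s, error_rate, rng)
            for s in population
            for _ in range(copies)
        ]
        # Selection: parents and offspring compete; the less suitable
        # forms are rejected.
        pool = sorted(population + offspring, key=fitness, reverse=True)
        population = pool[:population_size]
        history.append(fitness(population[0]))
    return history


if __name__ == "__main__":
    best = evolve()
    print("best fitness, first generation:", best[0])
    print("best fitness, last generation: ", best[-1])
```

Because parents compete with their offspring in this sketch, the best fitness can never decrease from one generation to the next. Most copying errors lower the fitness and are rejected, while the rare improvements are retained, which is the behavior the paragraph describes.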
18. Concluding Remarks
The methodology followed in our approach to the origin
of life is intended to overcome the mental difficulties in
understanding this astonishing phenomenon on a physical
basis. The many small model steps that are considered
have not been devised with the hope of describing the actual path followed by nature, but rather with the intention
of understanding the fundamental aspects and difficulties
as tangibly as possible. It is therefore remarkable that, using this method, a successful description of the development of systems endowed with a genetic apparatus is possible, as a consequence of many small plausible steps that
follow each other with an almost inescapable inner logic.
Given sufficient time, all of the individual steps have probabilities near unity. Even more remarkable is that one of
the results of these considerations is a detailed description
of a molecular model of this apparatus. In many details,
including the possible hairpin conformation of the adapter
molecules, this model is identical with that described some
years ago[1], at a time when many of the experimental results used here were not yet known.
Received: February 17, 1981;
supplemented: April 3, 1981 [A 354 IE]
German version: Angew. Chem. 93, 495 (1981)
[1] H. Kuhn, Angew. Chem. 84, 837 (1972); Angew. Chem. Int. Ed. Engl. 11, 798 (1972).
[2] H. Kuhn, J. Waser in W. Hoppe, W. Lohmann, H. Markl, H. Ziegler: Biophysik - Ein Lehrbuch, 2nd Ed., Springer, Heidelberg 1981.
[3] H. Kuhn, D. Möbius, Angew. Chem. 83, 672 (1971); Angew. Chem. Int. Ed. Engl. 10, 620 (1971); D. Möbius, Acc. Chem. Res. 14, 63 (1981).
[4] S. L. Miller, H. C. Urey, J. Oró, J. Mol. Evol. 9, 59 (1976).
[5] G. Toupance, F. Raulin, R. Buvet, Origins Life 6, 83 (1975).
[6] A. W. Schwartz in E. K. Duursma, R. Dawson: Marine Organic Chemistry, Elsevier, Amsterdam, cited by A. Henderson-Sellers, A. W. Schwartz, Nature 287, 526 (1980).
[7] J. Oró, Nature 191, 1193 (1961).
[8] J. P. Ferris, J. E. Kuder, A. W. Catalano, Science 166, 765 (1969).
[9] J. P. Ferris, P. C. Joshi, E. H. Edelson, J. G. Lawless, J. Mol. Evol. 11, 293 (1978).
[10] J. P. Ferris, R. A. Sanchez, L. E. Orgel, J. Mol. Biol. 33, 693 (1968).
[11] N. W. Gabel, C. Ponnamperuma, Nature 216, 453 (1967).
[12] E. Anders, R. Hayatsu, M. H. Studier, Origins Life 5, 57 (1974).
[13] S. L. Miller, L. E. Orgel: The Origins of Life on the Earth, Prentice Hall, Englewood Cliffs 1974.
[14] W. D. Fuller, R. A. Sanchez, L. E. Orgel, J. Mol. Evol. 1, 249 (1972).
[15] L. E. Orgel, R. Lohrmann, Acc. Chem. Res. 7, 368 (1974).
[16] R. Lohrmann, L. E. Orgel, Nature 244, 418 (1973).
[17] H. L. Sleeper, L. E. Orgel, J. Mol. Evol. 12, 357 (1979).
[18] R. Lohrmann, L. E. Orgel, J. Mol. Evol. 12, 237 (1979).
[19] J. Ninio, L. E. Orgel, J. Mol. Evol. 12, 91 (1978).
[20] H. L. Sleeper, R. Lohrmann, L. E. Orgel, J. Mol. Evol. 13, 203 (1979).
[21] L. E. Orgel, unpublished, cited in [U]; R. Lohrmann, P. K. Bridson, L. E. Orgel, Science 208, 1464 (1980).
[22] P. Schuster in H. Gutfreund: Biochemical Evolution, Cambridge University Press, Cambridge 1980.
[23] M. Paecht-Horowitz, J. Berger, A. Katchalsky, Nature 228, 636 (1970).
[24] A. Katchalsky, Naturwissenschaften 60, 215 (1973).
[25] M. Paecht-Horowitz, J. Mol. Evol. 11, 101 (1978).
[26] K. Dose, H. Rauchfuss: Chemische Evolution und der Ursprung lebender Systeme, Wissenschaftliche Verlagsgesellschaft, Stuttgart 1975.
[27] R. W. Kaplan: Der Ursprung des Lebens, dtv/Thieme, Stuttgart, 2nd Ed. 1980.
[28] C. Ponnamperuma: Exobiology, North-Holland, Amsterdam 1972.
[29] K. A. Kvenvolden, J. G. Lawless, C. Ponnamperuma, Proc. Natl. Acad. Sci. USA 68, 486 (1971).
[30] R. A. Kerr, Science 210, 42 (1980).
[31] H. Wänke in: Evolution der Planetenatmosphären und des Lebens, 2. Deutsche Forschungsgemeinschaft - Kolloquium über Planetenforschung, 1979, p. 198.
[32] E. Heftmann: Chromatography, 2nd Ed., Reinhold Publ., New York 1967, p. 636ff.; E. Stahl: Dünnschicht-Chromatographie, 2nd Ed., Springer, Berlin 1967, p. 758ff.
[33] A. Rich et al., Science 179, 285 (1973).
[34] D. R. Kearns, R. G. Shulman, Acc. Chem. Res. 7, 33 (1974).
[35] A. H. Wang, G. J. Quigley, F. J. Kolpak, J. L. Crawford, J. H. van Boom, G. van der Marel, A. Rich, Nature 282, 680 (1979).
[36] H. Drew, T. Takano, S. Tanaka, K. Itakura, R. E. Dickerson, Nature 286, 567 (1980).
[37] S. Arnott, R. Chandrasekaran, D. L. Birdsall, A. G. W. Leslie, R. L. Ratliff, Nature 283, 743 (1980).
[38] F. M. Pohl, T. M. Jovin, J. Mol. Biol. 67, 375 (1972).
[39] D. J. Patel, L. L. Canuel, F. M. Pohl, Proc. Natl. Acad. Sci. USA 76, 2508 (1979); R. C. Hopkins, Science 211, 289 (1981).
[40] J. C. W. Shepherd, J. Mol. Evol., in press; Proc. Natl. Acad. Sci. USA 78, 1596 (1981).
[41] M. Eigen, Max-Planck-Gesellschaft, Jahrbuch 1979, Vandenhoeck & Ruprecht, Göttingen, p. 17; M. Eigen, R. Winkler, Naturwiss. 68, 217 (1980).
[42] A. I. Oparin: The Chemical Origin of Life, Charles C. Thomas, Springfield, Ill. 1964.
[43] C. W. Carter, J. Kraut, Proc. Natl. Acad. Sci. USA 71, 283 (1974); G. M. Church, J. L. Sussman, S. H. Kim, ibid. 74, 1458 (1977); W. F. Anderson, D. H. Ohlendorf, Y. Takeda, B. W. Matthews, Nature, in press. We are grateful to Dr. G. Eichele, of the Biozentrum der Universität Basel (Switzerland), for drawing our attention to these references.
[44] H. Kuhn, Ch. Kuhn, Origins Life 9, 137 (1978).
[45] M. Eigen, P. Schuster, Naturwissenschaften 64, 541 (1977); 65, 7, 341 (1978); The Hypercycle, A Principle of Natural Self-Organization, Springer, Berlin 1979.
[46] F. H. C. Crick, S. Brenner, A. Klug, G. Pieczenik, Origins Life 7, 389 (1976).
[47] A. Matzke, A. Barta, E. Küchler: On the Mechanism of Translocation: Relative Arrangement of tRNA and mRNA on the Ribosome, in press, cited in [22].
[48] J. J. Hopfield, Proc. Natl. Acad. Sci. USA 75, 4338 (1978).
[49] B. G. Barrell, B. F. C. Clark: Handbook of Nucleic Acid Sequences, Joynson-Bruwers Ltd., Oxford 1974.
[50] M. Sumper, R. Luce, Proc. Natl. Acad. Sci. USA 72, 162 (1975); B. Küppers, M. Sumper, ibid. 72, 2630 (1975).
[51] E. Domingo, R. A. Flavell, Ch. Weissmann, Gene 1, 3, 27 (1976).
[52] E. C. Cox, C. Yanofsky, Proc. Natl. Acad. Sci. USA 58, 1995 (1967).
[53] H. Kuhn, Ber. Bunsenges. Phys. Chem. 80, 1209 (1976); in H. Haken: Synergetics, a Workshop, Springer, Heidelberg 1977, p. 200.
[54] C. E. Shannon, W. Weaver: The Mathematical Theory of Communication, University of Illinois Press, Urbana 1949.
[55] G. Nicolis, I. Prigogine: Self-Organization in Nonequilibrium Systems, Wiley-Interscience, New York 1977, Chap. 7.
[56] M. Eigen, Naturwissenschaften 58, 465 (1971).
[57] R. Riedl: Biologie der Erkenntnis, 2nd Ed., Parey, Berlin 1980.
[58] M. Eigen, Angew. Chem. 93, 221 (1981); Angew. Chem. Int. Ed. Engl. 20, 233 (1981).
[59] G. E. Schulz, Angew. Chem. 93, 143 (1981); Angew. Chem. Int. Ed. Engl. 20, 143 (1981).