Discuss the similarities and differences between genetic
algorithms and natural evolution.
Greg Detre
Animal Behaviour IV
Richard Dawkins’ The Blind Watchmaker is largely
concerned with overturning the theistic view that the intricacy and variety of
the natural world lend support to the idea of a divine Creator. Its title pays
homage to Rev. William Paley (1743-1805), who argued that if we were to stumble
across a finely-wrought pocket-watch in a field, then we would rightly assume
that it was designed and built with skilful agency. In the same way, it seems
impossible to imagine that the wondrous breadth and adaptedness exhibited in
plants, animals and humanity could be the product of anything but a
hugely-powerful Designer and Builder, God. As one contemporary critic of Darwin
put it, it makes no sense whatsoever to replace the ‘Absolute Wisdom’ of the
Creator with the ‘Absolute Ignorance’ of evolution as the originating force
behind life. Yet the evidence in support of evolution has mounted to an almost
incontrovertible degree over the last century, and is now accepted throughout
the Western world (with the exception of a few Bible belt American states).
Computational modelling can
provide highly persuasive, if circumstantial, evidence to support a theory.
Craig Reynolds’ impressive ‘boids’, a virtual simulation of the flight of a
flock of birds, provides an excellent example. Although it was no longer
seriously believed that birds communicate telepathically to maintain the
highly-synchronised agility of an entire flock, Reynolds’ model banished any
lingering doubt. Drawing on his skills as animator and programmer,
he modelled the activity of a number of individual birds, each following local
rules, giving rise to a responsive mass movement that gave every impression of
centralised organisation. In this way, having a computational model whose
results match real-life data strongly suggests that we understand and have
incorporated the crucial mechanisms and computation going on in a complex
system.
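
For illustration, a minimal sketch in Python of the kind of local rules Reynolds used; the boid representation, the neighbourhood radius and the weights here are assumptions of mine, not Reynolds’ actual parameters:

    import math

    def neighbours(boid, flock, radius=50.0):
        # A boid only 'sees' flockmates within a local radius
        return [b for b in flock if b is not boid
                and math.dist(boid['pos'], b['pos']) < radius]

    def steer(boid, flock):
        # Three local rules: separation, alignment and cohesion (weights are guesses)
        near = neighbours(boid, flock)
        if not near:
            return boid['vel']
        sep = [sum(boid['pos'][i] - b['pos'][i] for b in near) for i in (0, 1)]
        ali = [sum(b['vel'][i] for b in near) / len(near) for i in (0, 1)]
        coh = [sum(b['pos'][i] for b in near) / len(near) - boid['pos'][i] for i in (0, 1)]
        return [boid['vel'][i]
                + 0.05 * sep[i]                      # steer away from crowding
                + 0.05 * (ali[i] - boid['vel'][i])   # match neighbours' heading
                + 0.01 * coh[i]                      # drift towards the local centre
                for i in (0, 1)]

    # Each boid is just {'pos': [x, y], 'vel': [vx, vy]}; updating every boid with
    # steer() each timestep produces flock-like motion with no central controller.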
This movement towards
biologically-inspired computation gains part of its impetus from trying to
understand by emulating. However, genetic algorithms are also attractive for
another, more pragmatic reason: Mother Nature has solved the problem of
survival in a difficult environment in countless and increasingly complex ways,
and we might be able to borrow her powerful techniques to apply to other
problems of our own. This sort of thinking is also behind the closely-related
approach of computational neuroscience, which uses biologically-plausible (or
even just ‘biologically-inspired’) models to try to fathom and replicate the
workings of the brain. Both
connectionism and evolutionary computation are massively parallel and adaptive,
able to learn, and even ‘creative’ compared with more traditional
algorithms. Both are attractive because they promise robust, generalisable
solutions that can be generated spontaneously out of incomplete training data
or in vast problem spaces.
Before considering genetic
algorithms, I will briefly describe the theory of evolution itself. Though
straightforward enough in principle, it engenders enormous complexity, both in
the natural world and as an idea.
Every living organism has a
genetic code (the genotype), instantiated chemically as DNA. This genetic code
acts like a recipe, in combination with the environment, for what the organism
grows into (the phenotype). If the organism is healthy, well-adapted to its
environment and sexually attractive to mates of its species, it will live long
enough to reproduce. Usually, this results in both parents contributing part of
their genetic code to their young, and the process continues. Of course, the
details vary greatly from species to species.
At the population level, this
gives rise to a group of organisms, all interacting, reproducing together and
affecting each other’s fitness. If a population splits into two gene pools
which don’t mingle for a period of time, sub-speciation occurs. The two populations
may become differently adapted; if a few organisms migrate from one population
to the other, their genes may proliferate through the new population.
A genetic algorithm is a computer
program designed to take advantage of mechanisms employed in evolution to
search a vast problem space very quickly and efficiently. According to Mitchell
(1998), ‘most methods called ‘GAs’ have at least the following elements in
common: populations of chromosomes; selection according to fitness; crossover
to produce new offspring; and random mutation of new offspring’. More strictly
then, a genetic algorithm is a biologically-inspired ‘parallel population-based
search [for solutions] with stochastic selection of many individuals,
stochastic crossover and mutation’. However, the term itself can be confusing,
since it can be used to mean the whole procedure of evolving a solution to a
problem, as well as being used for an individual evolved solution.
Using the evolutionary metaphor,
each organism can be thought of as a point in the search space of candidate
solutions. Together the various solutions being tried at a time comprise the
population, which evolves towards a maximum through the generations.
Like living organisms, genetic
algorithms have a genotype, usually comprised of a single chromosome. The
chromosome of a standard genetic algorithm (Holland 1975) is made up of a bit
string of length l, with one or more
binary bits making up a single gene (i.e. with alleles of 1 or 0). These bits
correspond to the nucleotides in natural genes, which have one of four bases
attached (labelled A, G, C or T).
In many basic genetic algorithms,
the genotype is fed directly into the fitness function as its parameters,
yielding a fitness value for that organism. A proportion (sometimes chosen
probabilistically) of the population is allowed to pass its genes on to the
next generation in some modified form, having been recombined and/or mutated.
This usually continues for a fixed number of generations, known as a run.
Experimenters may report a number of statistics, e.g. average fitness of the
population over a number of runs, highest fitness of any organism (at a given
generation), or rate of fitness improvement.
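
As a concrete illustration of this loop, here is a minimal sketch in Python; the toy ‘one-max’ fitness function, the parameter values and the fitness-proportionate selection scheme are my own assumptions rather than any particular published GA:

    import random

    POP_SIZE, CHROM_LEN, GENERATIONS = 50, 20, 100   # illustrative values
    MUTATION_RATE = 0.01

    def fitness(chrom):
        # Toy 'one-max' problem: fitness is simply the number of 1 bits
        return sum(chrom)

    def select(population):
        # Fitness-proportionate ('roulette wheel') selection; +1 keeps weights positive
        weights = [fitness(c) + 1 for c in population]
        return random.choices(population, weights=weights, k=1)[0]

    def crossover(a, b):
        # Single-point recombination
        point = random.randint(1, CHROM_LEN - 1)
        return a[:point] + b[point:]

    def mutate(chrom):
        # 'Copying errors': flip each bit with a small probability
        return [bit ^ 1 if random.random() < MUTATION_RATE else bit for bit in chrom]

    population = [[random.randint(0, 1) for _ in range(CHROM_LEN)]
                  for _ in range(POP_SIZE)]
    for generation in range(GENERATIONS):            # one run = GENERATIONS generations
        population = [mutate(crossover(select(population), select(population)))
                      for _ in range(POP_SIZE)]
        best = max(fitness(c) for c in population)   # a statistic one might report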
In all these ways, experimenters
have borrowed heavily from evolution, and indeed the terminology reflects the
metaphor. Even the crudest genetic algorithm can, in this way, demonstrate the
power of evolution. However, as in computational neuroscience, the temptation
is strong to try and improve on Mother Nature’s necessarily expedient mechanisms.
Alternatively, peculiarities of natural evolution which are not considered to
improve or speed up the evolutionary search are often discarded, altered or
simplified.
In computational neuroscience,
this has led to a schism between neural networks researchers and
connectionists. NN researchers argue that the brain is the only example we have
of advanced intelligence, and that the best means we have of achieving similar
results is to try and reproduce its functionality along the same lines. For
example, this restricts them to local learning rules like Hebbian learning,
where the weight at each synapse is modified on the basis of the activation of
neurons local to it. On the other hand, connectionists depend heavily on
algorithms like back-propagation, which could not feasibly be implemented in
the brain (since it either requires two-way passage of information, or for many
synapses to have information about a number of other distant and unconnected
synapses). Connectionists are prepared to employ immediately powerful,
mathematically-derived techniques in the hope of getting impressive results
faster. I feel strongly that connectionist techniques will ultimately prove
most successful, but not until we have learned all that we can from biological
plausibility. As long as the brain’s higher level functions are as mysterious
to us as they are now, emphasising biological plausibility will be by far the
most fruitful approach. Of course, there are definitely some areas where it’s
difficult to know whether we’re doing things differently or not, since we
really don’t know exactly what nature is doing.
It might be that genetic
algorithms are different to neural networks, and that we are in a better
position to simply take our inspiration from biology and benefit from improved
techniques immediately. The story of W. Daniel Hillis’ investigation of genetic
algorithms in sorting networks suggests, though, that we have not yet exhausted
the lessons we can learn from nature in this area either.
There is one other good reason for
paying close attention to natural evolution: although there is a great deal of
variation between species, the genetic mechanism of the different terrestrial
walks of life is remarkably universal. In fact, possessing DNA has actually
been proposed as a single, general criterion for being alive. Indeed, it would
probably work very well as a criterion for life on Earth. Unfortunately though,
it is probably too narrow a criterion for life in general, since it would
almost certainly rule out almost all extra-terrestrial life (no matter how
intelligent or life-like) on what we would probably have to regard as spurious,
carbon chauvinist grounds. Although both haploid and diploid organisms exist,
for instance, natural evolution follows fairly universal rules: an allele
alphabet of four particular bases (one different in RNA), multi-point
recombination, low mutation, double-helix structure of DNA etc. Given the size
and variety of the search space within which nature operates, it might simply
be that nature’s chosen parameters are amongst the very most robust, adaptable
and powerful, and we would do well to consider them first.
Given these reasons for placing a
good deal of trust in biological plausibility, I will consider why and how
experimenters have chosen to build their genetic algorithms differently from
natural evolution, grouping these differences broadly into:
the genetic level, including the genetic mechanism, coding and structure, the selection operator and reproduction;
the phenotype level, in terms of the fitness function and the ecology and environment; and
miscellaneous.
The differences between genetic
algorithms and natural evolution at the genetic level are probably largely
cosmetic, but mutation provides an example of where traditional biological
assumptions about genetic mechanisms have been challenged.
As mentioned above, genetic
algorithms standardly employ a binary bit string for a chromosome, rather than
the quaternary alphabet used in nature. It is difficult to see why this should
make a difference, since a 4-valued system can be trivially expressed in terms
of 2 bits. Computers prefer binary, while nature can save space with a bigger
alphabet. The only time it might matter is in recombination, where nature’s
quaternary symbol cannot be split apart, whereas two binary bits can. Furthermore,
some genetic algorithms use continuous alleles, which might prove different to
a discrete system. A category system, employing ‘A’, ‘G’, ‘C’ and ‘T’ has no
ranking or intervals by which to perform arithmetic with its alleles, i.e.
nature’s system is symbolic rather than numeric, which might give rise to a
different or even restricted set of low-level effects.
GAs sometimes employ Gray encoding rather than pure binary.
This is a simple transformation, which avoids changing more than one bit at a
time when incrementing. In binary, adding 1 to 0111 gives 1000, which can be
troublesome for GAs (especially during mutation, where a single bit-flip can
amount to a large change in value, and a small change in value can require
flipping many bits). A Gray code transformation for binary would give:
Gray Sequential
000 000
001 001
011 010
010 011
110 100
111 101
101 110
100 111
Where the alleles are treated as independent bits rather than as the digits of
a number, Gray encoding should make no difference.
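
For reference, a sketch of the standard binary-reflected Gray code transform in Python (the function names are mine):

    def binary_to_gray(n):
        # Binary-reflected Gray code: consecutive integers differ in exactly one bit
        return n ^ (n >> 1)

    def gray_to_binary(g):
        # Invert the transform by XOR-ing in all higher-order bits
        b = 0
        while g:
            b ^= g
            g >>= 1
        return b

    # Reproduces the table above, e.g. sequential 011 (3) -> Gray 010 and
    # sequential 100 (4) -> Gray 110: neighbouring values differ by one bit.
    for i in range(8):
        print(format(i, '03b'), '->', format(binary_to_gray(i), '03b'))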
Frequently, GAs have only a single
chromosome. In fact, researchers often refer to chromosomes, rather than
organisms. Secondary chromosomes are occasionally used to store dormant traits
or other data not directly related to the primary problem. For the most part,
GA chromosomes are one-dimensional strings, like those in nature, although
sometimes researchers experiment with multi-dimensional arrays, trees or hash
tables to better mirror the solution space.
Although we are most used to a
diploid genome, nature does not employ diploidy exclusively, especially where
only one parent contributes genetic code. Haploid genomes may be simpler, but
they appear less powerful, given that multi-cellular organisms tend to be diploid.
The recombination operator is
usually considered to be the most important transformation when creating the
genome to be passed on. In single point recombination, a single location along
each chromosome is chosen, with the first section being contributed from one
parent and the second section from the other. Multi-point recombination is used
in nature, where the two chromosomes are interleaved with each other at
multiple locations. Early Schema Theory (Holland 1975) was based on
single-point recombination, which probably limits non-local interaction between
genes (see below).
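
The two operators can be sketched as follows in Python (the number of crossover points is chosen arbitrarily for illustration):

    import random

    def single_point(a, b):
        # One cut: the head comes from one parent, the tail from the other
        cut = random.randint(1, len(a) - 1)
        return a[:cut] + b[cut:]

    def multi_point(a, b, n_points=3):
        # Interleave the parents, switching source chromosome at every crossover point
        cuts = sorted(random.sample(range(1, len(a)), n_points))
        child, parents, source, prev = [], (a, b), 0, 0
        for cut in cuts + [len(a)]:
            child.extend(parents[source][prev:cut])
            source, prev = 1 - source, cut
        return child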
Mutation, in the ‘blind’ sense
(Hart and Kammeyer, 1994) of random variations in bits seems to arise in nature
mainly out of copying errors. Until Holland tried a simulation to test the
effects on a GA without mutation, biologists had assumed that it played a
critical role in moving through the problem space. Indeed, Yaeger recounts that
he had not even realised he had forgotten to switch mutation on for the
first weeks of his PolyWorld investigation. Currently, recombination is
believed to play the bigger role, but Anastasoff (1999) has challenged this,
arguing that an evolving, dynamic level of mutation is extremely valuable in
helping populations respond to non-static fitness functions, whereas most test
functions until recently have been static. Holland’s suggestion that ‘mutation
provides an “insurance policy” against fixation’ seems plausible too.
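
A sketch of the two kinds of mutation just discussed: ‘blind’ bit-flipping at a fixed rate, and a self-adaptive variant in the spirit of Anastasoff (1999), in which the rate rides on the chromosome itself (the particular encoding of the rate below is my own assumption):

    import random

    def blind_mutation(chrom, rate=0.001):
        # Fixed-rate 'copying errors': flip each bit independently
        return [b ^ 1 if random.random() < rate else b for b in chrom]

    def self_adaptive_mutation(chrom):
        # Assume the last 8 bits encode the organism's own mutation rate (0 to 0.1);
        # those bits are themselves subject to mutation, so the rate can evolve
        rate = int(''.join(map(str, chrom[-8:])), 2) / 2550
        return [b ^ 1 if random.random() < rate else b for b in chrom]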
Besides recombination, there are a
number of other transformations that may occur during reproduction, including
inversion, gene doubling and deletion. GAs so far have made little use of these
methods.
As mentioned above, single point
recombination reduces the possibility of non-local interaction between genes.
Indeed, this was how the Building Block hypothesis grew up: ‘good solutions
tend to be made up of good building blocks – [contiguous] combinations of bit
values that confer higher fitness on the string in which they are present’
(Mitchell, 1998). Holland (1975) introduced schemas ‘to formalise the informal
notion of “building blocks”’. A schema is like a mask that can be overlaid onto
a bit string, with each position being a defined bit (fixed to 1 or 0) or an
undefined bit (wildcards, which can be any value), e.g. 1***1 is a schema for a
5-bit string which begins and ends with a 1.
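
A schema can be checked in a couple of lines; this Python sketch treats both schema and chromosome as strings:

    def matches(schema, chrom):
        # A chromosome is an instance of a schema if it agrees on every defined bit
        return all(s == '*' or s == c for s, c in zip(schema, chrom))

    matches('1***1', '10011')   # True: begins and ends with 1
    matches('1***1', '10010')   # False: the schema fixes the last bit to 1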
In fact, the theory could be
extended to include cross-chromosomal schemas. However, it only really works
with single-point crossovers. With multiple-point crossovers, it might be
possible to have non-local interaction, e.g. the bits at the beginning and end
of a chromosome both being part of a single, geographically distributed
building block, since it would then be possible for an even number of crossovers
to have occurred in between the two points.
Natural evolution employs a host
of other, more complicated transforms, known collectively as gene regulation.
These include translocation (where chunks of genetic code are moved from one
part of the chromosome to another), introns (which are inert, and do not code
for amino acids themselves), as well as lots of genes whose activity depends on
other genes, and which may not express themselves in a particular organism,
gender or generation, but are carried down (dormant) to resurface later.
Differences between GAs and
natural evolution are probably most obvious and significant at the phenotype
level.
Maturation (Hart and Kammeyer,
1994) is the translation from the genotype to the phenotype, i.e. the process
by which the genes express themselves as an organism within the environment
with a measurable fitness function. GAs have been applied to a host of mainly
mathematical problems, where it is easiest to simply translate the genotype
space directly into the phenotype space, using the genes as parameters in a
simple algorithmic fitness function. However, I think that this lack of a
phenotypic level, an extra disconnection between the genotype and its fitness
function, may limit the power and flexibility of these GAs.
Sometimes though, the genotype is
translated into a very different phenotype, e.g. when the genotype expresses
itself as a neural network. The way this is done varies. In some cases, the GA
evolves the actual synaptic weights themselves, which seems rather strange,
though possibly faster in some circumstances. Alternatively, the genotype can
be used to set the parameters and architecture of the network, e.g.
connectivity, thresholds or activation function, number of layers, number of
hidden neurons, learning coefficient, pattern depth (i.e. from binary to
continuous-valued) etc. In this case, the GA chooses the type of network it
thinks will be most suitable for learning a problem, and leaves the network to
train itself on the data. This is much closer to biological reality, and
employs the two most powerful biological computational techniques together,
reducing the need for human intervention, and taking the best from both worlds
when navigating through the search space. Ackley’s AL world provides some
particularly good examples of learned behaviour that becomes innate. This is
known as the Baldwin effect, a non-Lamarckian theory that postulates that
learned behaviour can change the fitness function, because it rewards physical
traits that the organism has learned to take advantage of, e.g. squirrels that
learn to fly between trees benefit from the gradual evolution of webbed toes,
whereas squirrels that do not learn this behaviour gain no such benefit. After
many generations, it is as though the flying squirrels have evolved to seem
born to fly.
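
As a sketch of the second approach, where the genotype sets the architecture rather than the weights, the following Python fragment decodes a bit string into network parameters; the field boundaries and ranges are invented purely for illustration:

    def decode(chrom):
        # Hypothetical decoding: successive gene fields set architectural choices
        n_hidden      = 1 + int(''.join(map(str, chrom[0:6])), 2)            # 1-64 hidden units
        n_layers      = 1 + int(''.join(map(str, chrom[6:8])), 2)            # 1-4 hidden layers
        learning_rate = int(''.join(map(str, chrom[8:16])), 2) / 255 * 0.5   # 0-0.5
        use_sigmoid   = bool(chrom[16])                                      # activation choice
        return {'hidden': n_hidden, 'layers': n_layers,
                'lr': learning_rate, 'sigmoid': use_sigmoid}

    # The decoded network is then trained on the data as usual, and its
    # post-training performance is used as the organism's fitness.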
The logical next step really
approaches biological reality, and has been partially attempted in some
simulations. If we can evolve brains, why can’t we evolve bodies? This requires
a physical environment - the more realistic the physics the better. In the
ultimate life-like simulation, researchers could tweak the parameters of the
Earth’s environment to realistically conceive what alien physiology might be
like, and how such organisms would fare on our planet. Karl Sims’ creatures are perhaps
the most visually life-like of any a-life simulations so far, waddling and
struggling with a very real sense of urgency and intention, using mismatched
limbs in a quite impressively co-ordinated fashion to propel themselves.
This brings GAs a step closer
towards an endogenous fitness function. Natural selection is an endogenous
fitness function – living organisms effectively select themselves through a
combination of viability and fertility. GAs employ artificial, decidedly
unnatural fitness functions, since the problems that we want them to solve are
often quite unnatural. However, there are some problems that we could express
in terms of the environment, thus using a GA to solve a problem whose fitness
function is endogenous, e.g. a path-finding algorithm, where the GA phenotype
organisms have to find their way through a maze to get to the food at the end,
or where GAs have to evolve to be good at the Prisoner’s Dilemma in order to
spend enough time outside prison to procreate.
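
A sketch of how an endogenous Prisoner’s Dilemma fitness might look: fitness is whatever an organism earns from playing the rest of the population, so it changes as the population changes. The three-gene strategy representation and the round count are assumptions for illustration:

    PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
              ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

    def play(strat_a, strat_b, rounds=50):
        # Each strategy is (first move, reply to cooperation, reply to defection)
        score_a = score_b = 0
        move_a, move_b = strat_a[0], strat_b[0]
        for _ in range(rounds):
            pay_a, pay_b = PAYOFF[(move_a, move_b)]
            score_a, score_b = score_a + pay_a, score_b + pay_b
            move_a, move_b = (strat_a[1] if move_b == 'C' else strat_a[2],
                              strat_b[1] if move_a == 'C' else strat_b[2])
        return score_a

    def endogenous_fitness(organism, population):
        # Fitness is earned from interactions, not handed down by a fixed equation
        return sum(play(organism, other) for other in population if other is not organism)

    tit_for_tat   = ('C', 'C', 'D')   # cooperate first, then copy the opponent
    always_defect = ('D', 'D', 'D')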
There is a further reason for
translating into an embodied phenotype of some description. Nature uses four
extremely powerful bootstrapping methods: evolution, connectionism, ontogenetic
growth and cultural inheritance. Like real organisms, GA phenotype organisms
could grow into adulthood, learning with neural network brains to imitate and
follow social conventions. For the moment, such projects are hugely expensive,
both computationally and in programmer man-hours, and probably
overstretch current understanding in developmental physiology and ethology.
There is a last reason for wanting
to express the phenotype in embodied terms. In nature, the genotype itself is
expressed as chemicals, as part of the phenotype, which means that the genetic
mechanism itself can be expressed and coded for within the genes. This effect
of genes affecting the genetic mechanism itself can be seen in an obvious way
in the Anastasoff (1999) paper, where the mutation level is itself coded for,
making the population potentially more responsive to changes in the fitness
function. By coding for the genetic mechanism within the phenotype translation,
the problem space becomes potentially boundless, and chicken-and-egg questions
arise about how the genetic mechanism came about in the first place.
Rather than fitness being evaluated for each organism in isolation,
independent of other organisms in the environment, fitness in nature
(i.e. survival value) depends crucially on
other members of the population, members of other populations, and other
species, whether predators, prey, parasites or hosts. Interactions with other
organisms that alter the fitness function, and whose fitness function is altered
in turn, force organisms to be flexible and inventive. When subgroups of
populations are restricted from intermingling, speciation within the different
groups occurs, with each sub-species being adaptive in slightly different ways.
When individuals from one group migrate to the other, stronger genes
proliferate, giving rise to more generalised solutions. Other such interactions
can lead to symbiosis, cooperation (e.g. Axelrod, 1984) and special kinship
relations. On the other hand, Hillis’ work on sorting networks, in which host
solution organisms are constantly tested by parasite test-case organisms, is a
brilliant example of how a more global optimum can be reached when the fitness
function becomes increasingly stringent. Most GAs neglect this extra
dimension (and complication), despite its tendency to speed up the search and
to shift organisms off local optima.
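
The asymmetry in Hillis’ scheme can be sketched very simply; host_solves is a hypothetical helper standing in for ‘this sorting network sorts this test case correctly’:

    def host_solves(host, case):
        # Hypothetical predicate: treat a host as a callable candidate sorter
        return host(case) == sorted(case)

    def host_fitness(host, parasites):
        # A host (candidate sorting network) scores by how many test cases it sorts
        return sum(1 for case in parasites if host_solves(host, case))

    def parasite_fitness(parasite, hosts):
        # A parasite (test case) scores by how many hosts it manages to defeat,
        # so the test set keeps converging on the hosts' remaining weaknesses
        return sum(1 for host in hosts if not host_solves(host, parasite))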
Although it may not be a big
factor, nature employs a continuous stream of births and deaths rather than a
series of discrete, non-overlapping generations. This can be partly remedied by
employing a Plus Strategy, where successful parents may be carried over into the
next generation alongside their offspring, as opposed to the Comma Strategy,
where each generation is spawned entirely afresh from the offspring of the
selected parents of the previous generation.
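
The two strategies differ only in which pool the next generation is drawn from; a sketch (re-using the fitness function from the earlier one-max example):

    def next_generation(parents, offspring, mu, strategy='plus'):
        # 'plus' (mu + lambda): parents compete with their offspring for survival;
        # 'comma' (mu, lambda): the next generation is drawn from the offspring alone
        pool = parents + offspring if strategy == 'plus' else offspring
        return sorted(pool, key=fitness, reverse=True)[:mu]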
There are a number of remaining
ways in which GAs differ from natural evolution.
There are those who would point to
a fundamental, ineliminable difference between organisms which have naturally
evolved and GA organisms. This sounds rather like the arguments against Strong
A-life, which revolve around the contention that even if it looks like life, is
as complex as life, and is even computationally indistinguishable from
something alive, if it lives in a computer, then it just isn’t life. To them,
we can quote Steen Rasmussen’s contentious pamphlet, “Aspects of Information,
Life, Reality and Physics”, distributed at the second A-life conference and
which began like this:
1. A universal computer is indeed universal and can emulate any process (Turing)
2. The essence of life is a process (von Neumann)
3. There exist criteria by which we are able to distinguish living from non-living things

Accepting (1), (2) and (3) implies the possibility of life in a computer.
All of the premises are
contentious to some degree, but the argument is still persuasive. If life
contains a non-computable element, a universal computer (in the sense that
Turing originally imagined it) will not be able to emulate it. Most scientists
would probably agree with the second premise, though most people that we are
trying to persuade might not. Most definitions of life do characterise it in
terms of processes (nutrition, respiration, excretion, movement, growth,
stimulus response, reproduction/self-replication, transduction in general,
complexity and increased orderliness), but again, these processes need to be
computable for the first premise to apply. The third premise is hazy – although
we can rank life-ness, we have trouble placing a barrier between alive and
not-alive, e.g. viruses, which seem a grey area. In summary though, dismissing
GA organisms out of hand is not an easily justified position. The functionalist assumptions
behind neural networks and genetic algorithms hold that it's the aspects of the
underlying computation rather than the physiological instantiation that
matters, and this is what is being modelled.
In the real world, quite
unexpected things happen. The dinosaurs weren’t adapted to, or adaptable enough
for, meteor impacts, and they died out. If there were a nuclear holocaust, there
would probably just be scorpions and cockroaches left. If great robustness is
important (e.g. the need for the solution to survive terrible connections or
reception), then incorporating such huge, fitness-affecting events can weed out
all but the most robust, as well as dislodging populations from local optima.
Of course, in a way, starting the entire simulation again with different
randomised genomes is a little bit like the Flood.
When GAs are modelled on serial
computers, indefinite expansion into genotypic space is limited by computer
time. Moreover, it takes a lot of programmer time to create the very complex,
harsh environments that are necessary to develop very tough, intelligent,
creative organisms, such as ourselves.
GAs usually employ fixed
population sizes. At the end of each generation, most of the population is
culled, and a new generation of the same size springs up to replace it. In the
real world, populations grow as large as resources can sustain, and they spread
geographically into sub-species.
As mentioned, the complexity of
effects that can develop is constrained, but with most fitness functions, this
should only really affect the likelihood of finding the very best global optima.
Mitchell provides an impressive
and prescient summary of potential areas to investigate in GA research. Many
have been incorporated here. Perhaps the most important and likely to emerge
will be:
fuller phenotypes – organisms will be more than parameters entered into a fitness function equation; as a-life techniques improve, our ability to integrate GAs, NNs and real physics simulations will improve, making GAs more commercially attractive (e.g. in the form of toys or web-based intelligent agents);
more complex genotype transformations and interactions – the mapping of the human genome may help us understand better what is happening at this level; and
increased complexity of genomes, environments, populations, numbers of organisms and generations etc. – with Moore’s Law, we can expect the size and difficulty of problem space that our GAs can traverse in reasonable time to continue to increase (especially with the huge increase in distributed, peer-to-peer computing that seems certain).
Notably, I think these future
advances will bring GAs closer to natural evolution, not break away from it.
At the moment though, the major
differences between GAs and natural evolution lie in the simplifications and
tweakings of most GA simulations. Functionally, GAs and natural evolution are
very similar. Given a computer the size of the Earth, I cannot see any
fundamental reason why we couldn’t construct a GA on the scale of a planet. Of
course, that is not to say that we could accurately reproduce or predict the
actual evolution on the Earth (since we would still be restricted by our
measurement of the initial conditions, our models of physics, and presumably,
by quantum mechanics). But we would expect very evolutionarily real effects to
emerge on a global scale, and if we were to kick-start single-cellular life, we
might expect (over a period of perhaps billions of years, and given a complex,
suitable environment) at least human-level intelligence to emerge.