Discuss the similarities and differences between genetic
algorithms and natural evolution.
Greg Detre
Animal Behaviour IV
Richard Dawkins’ The Blind Watchmaker is largely
concerned with overturning the theistic view that the intricacy and variety of
the natural world lend support to the idea of a divine Creator. Its title pays
homage to Rev. William Paley (1743-1805), who argued that if we were to stumble
across a finely-wrought pocket-watch in a field, then we would rightly assume
that it was designed and built with skilful agency. In the same way, it seems
impossible to imagine that the wondrous breadth and adaptedness exhibited in
plants, animals and humanity could be the product of anything but a
hugely-powerful Designer and Builder, God. As one contemporary critic of Darwin
put it, it makes no sense whatsoever to replace the ‘Absolute Wisdom’ of the
Creator with the ‘Absolute Ignorance’ of evolution as the originating force
behind life. Yet the evidence in support of evolution has mounted to an almost
incontrovertible degree over the last century, and is now accepted throughout
the Western world (with the exception of a few Bible belt American states).
Computational modelling can
provide highly persuasive, if circumstantial, evidence to support a theory.
Craig Reynolds’ impressive ‘boids’, a virtual simulation of the flight of a
flock of birds, provides an excellent example. Although it was no longer
seriously believed that birds communicate telepathically to maintain the
highly-synchronised agility of an entire flock, Reynolds’ model banished any
lingering doubt. Drawing on his skills as animator and programmer,
he modelled the activity of a number of individual birds, each following local
rules, giving rise to a responsive mass movement that gave every impression of
centralised organisation. In this way, having a computational model whose
results match real-life data strongly suggests that we understand and have
incorporated the crucial mechanisms and computation going on in a complex
system.
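
For illustration, a minimal sketch in Python of the kind of local rules Reynolds used; the boid representation, the neighbourhood radius and the weights here are assumptions of mine, not Reynolds’ actual parameters:

    import math

    def neighbours(boid, flock, radius=50.0):
        # A boid only 'sees' flockmates within a local radius
        return [b for b in flock if b is not boid
                and math.dist(boid['pos'], b['pos']) < radius]

    def steer(boid, flock):
        # Three local rules: separation, alignment and cohesion (weights are guesses)
        near = neighbours(boid, flock)
        if not near:
            return boid['vel']
        sep = [sum(boid['pos'][i] - b['pos'][i] for b in near) for i in (0, 1)]
        ali = [sum(b['vel'][i] for b in near) / len(near) for i in (0, 1)]
        coh = [sum(b['pos'][i] for b in near) / len(near) - boid['pos'][i] for i in (0, 1)]
        return [boid['vel'][i]
                + 0.05 * sep[i]                      # steer away from crowding
                + 0.05 * (ali[i] - boid['vel'][i])   # match neighbours' heading
                + 0.01 * coh[i]                      # drift towards the local centre
                for i in (0, 1)]

    # Each boid is just {'pos': [x, y], 'vel': [vx, vy]}; updating every boid with
    # steer() each timestep produces flock-like motion with no central controller.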
This movement towards
biologically-inspired computation gains part of its impetus from trying to
understand by emulating. However, genetic algorithms are also attractive for
another, more pragmatic reason: Mother Nature has solved the problem of
survival in a difficult environment in countless and increasingly complex ways,
and we might be able to borrow her powerful techniques to apply to other
problems of our own. This sort of thinking is also behind the closely-related
approach of computational neuroscience, which uses biologically-plausible (or
even just ‘biologically-inspired’) models to try to fathom and replicate the
workings of the brain. Both
connectionism and evolutionary computation are massively parallel and adaptive,
able to learn, and even ‘creative’ compared with more traditional
algorithms. Both are attractive because they promise robust, generalisable
solutions that can be generated spontaneously out of incomplete training data
or in vast problem spaces.
Before considering genetic
algorithms, I will briefly describe the theory of evolution itself. Though
straightforward enough in principle, it engenders enormous complexity, both in
the natural world and as an idea.
Every living organism has a
genetic code (the genotype), instantiated chemically as DNA. This genetic code
acts like a recipe, in combination with the environment, for what the organism
grows into (the phenotype). If the organism is healthy, well-adapted to its
environment and sexually attractive to mates of its species, it will live long
enough to reproduce. Usually, this results in both parents contributing part of
their genetic code to their young, and the process continues. Of course, the
details vary greatly from species to species.
At the population level, this
gives rise to a group of organisms, all interacting, reproducing together and
affecting each other’s fitness. If a population splits into two gene pools
which don’t mingle for a period of time, sub-speciation occurs. The two populations
may become differently adapted; if a few organisms migrate from one population
to the other, their genes may proliferate through the new population.
A genetic algorithm is a computer
program designed to take advantage of mechanisms employed in evolution to
search a vast problem space very quickly and efficiently. According to Mitchell
(1998), ‘most methods called ‘GAs’ have at least the following elements in
common: populations of chromosomes; selection according to fitness; crossover
to produce new offspring; and random mutation of new offspring’. More strictly
then, a genetic algorithm is a biologically-inspired ‘parallel population-based
search [for solutions] with stochastic selection of many individuals,
stochastic crossover and mutation’. However, the term itself can be confusing,
since it can be used to mean the whole procedure of evolving a solution to a
problem, as well as being used for an individual evolved solution.
Using the evolutionary metaphor,
each organism can be thought of as a point in the search space of candidate
solutions. Together the various solutions being tried at a time comprise the
population, which evolves towards a maximum through the generations.
Like living organisms, genetic
algorithms have a genotype, usually comprised of a single chromosome. The
chromosome of a standard genetic algorithm (Holland 1975) is made up of a bit
string of length l, with one or more
binary bits making up a single gene (i.e. with alleles of 1 or 0). These bits
correspond to the nucleotides in natural genes, which have one of four bases
attached (labelled A, G, C or T).
In many basic genetic algorithms,
the genotype is fed directly into the fitness function as its parameters,
yielding a fitness value for that organism. A proportion (sometimes chosen
probabilistically) of the population is allowed to pass its genes on to the
next generation in some modified form, having been recombined and/or mutated.
This usually continues for a fixed number of generations, known as a run.
Experimenters may report a number of statistics, e.g. average fitness of the
population over a number of runs, highest fitness of any organism (at a given
generation), or rate of fitness improvement.
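
As a concrete illustration of this loop, here is a minimal sketch in Python; the toy ‘one-max’ fitness function, the parameter values and the fitness-proportionate selection scheme are my own assumptions rather than any particular published GA:

    import random

    POP_SIZE, CHROM_LEN, GENERATIONS = 50, 20, 100   # illustrative values
    MUTATION_RATE = 0.01

    def fitness(chrom):
        # Toy 'one-max' problem: fitness is simply the number of 1 bits
        return sum(chrom)

    def select(population):
        # Fitness-proportionate ('roulette wheel') selection; +1 keeps weights positive
        weights = [fitness(c) + 1 for c in population]
        return random.choices(population, weights=weights, k=1)[0]

    def crossover(a, b):
        # Single-point recombination
        point = random.randint(1, CHROM_LEN - 1)
        return a[:point] + b[point:]

    def mutate(chrom):
        # 'Copying errors': flip each bit with a small probability
        return [bit ^ 1 if random.random() < MUTATION_RATE else bit for bit in chrom]

    population = [[random.randint(0, 1) for _ in range(CHROM_LEN)]
                  for _ in range(POP_SIZE)]
    for generation in range(GENERATIONS):            # one run = GENERATIONS generations
        population = [mutate(crossover(select(population), select(population)))
                      for _ in range(POP_SIZE)]
        best = max(fitness(c) for c in population)   # a statistic one might report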
In all these ways, experimenters
have borrowed heavily from evolution, and indeed the terminology reflects the
metaphor. Even the crudest genetic algorithm can, in this way, demonstrate the
power of evolution. However, as in computational neuroscience, the temptation
is strong to try and improve on Mother Nature’s necessarily expedient mechanisms.
Alternatively, peculiarities of natural evolution which are not considered to
improve or speed up the evolutionary search are often discarded, altered or
simplified.
In computational neuroscience,
this has led to a schism between neural networks researchers and
connectionists. NN researchers argue that the brain is the only example we have
of advanced intelligence, and that the best means we have of achieving similar
results is to try and reproduce its functionality along the same lines. For
example, this restricts them to local learning rules like Hebbian learning,
where the weight at each synapse is modified on the basis of the activation of
neurons local to it. On the other hand, connectionists depend heavily on
algorithms like back-propagation, which could not feasibly be implemented in
the brain (since it either requires two-way passage of information, or for many
synapses to have information about a number of other distant and unconnected
synapses). Connectionists are prepared to employ immediately powerful,
mathematically-derived techniques in the hope of getting impressive results
faster. I feel strongly that connectionist techniques will ultimately prove
most successful, but not until we have learned all that we can from biological
plausibility. As long as the brain’s higher level functions are as mysterious
to us as they are now, emphasising biological plausibility will be by far the
most fruitful approach. Of course, there are definitely some areas where it’s
difficult to know whether we’re doing things differently or not, since we
really don’t know exactly what nature is doing.
It might be that genetic
algorithms are different to neural networks, and that we are in a better
position to simply take our inspiration from biology and benefit from improved
techniques immediately. The story of W. Daniel Hillis’ investigation of genetic
algorithms in sorting networks suggests, though, that we have not yet exhausted
the lessons we can learn from nature in this area either.
There is one other good reason for
paying close attention to natural evolution: although there is a great deal of
variation between species, the genetic mechanism of the different terrestrial
walks of life is remarkably universal. In fact, possessing DNA has actually
been proposed as a single, general criterion for being alive. Indeed, it would
probably work very well as a criterion for life on Earth. Unfortunately though,
it is probably too narrow a criterion for life in general, since it would
almost certainly rule out almost all extra-terrestrial life (no matter how
intelligent or life-like) on what we would probably have to regard as spurious,
carbon chauvinist grounds. Although both haploid and diploid organisms exist,
for instance, natural evolution follows fairly universal rules: an allele
alphabet of four particular bases (one different in RNA), multi-point
recombination, low mutation, double-helix structure of DNA etc. Given the size
and variety of the search space within which nature operates, it might simply
be that nature’s chosen parameters are amongst the very most robust, adaptable
and powerful, and we would do well to consider them first.
Given these reasons for placing a
good deal of trust in biological plausibility, I will consider why and how
experimenters have chosen to build their genetic algorithms differently from
natural evolution, grouping these differences broadly into:
the genetic level, including the genetic mechanism, coding and structure, the selection operator and reproduction;
the phenotype level, in terms of the fitness function and the ecology and environment; and
miscellaneous.
The differences between genetic
algorithms and natural evolution at the genetic level are probably largely
cosmetic, but mutation provides an example of where traditional biological
assumptions about genetic mechanisms have been challenged.
As mentioned above, genetic
algorithms standardly employ a binary bit string for a chromosome, rather than
the quaternary alphabet used in nature. It is difficult to see why this should
make a difference, since a 4-valued system can be trivially expressed in terms
of 2 bits. Computers prefer binary, while nature can save space with a bigger
alphabet. The only time it might matter is in recombination, where nature’s
quaternary symbol cannot be split apart, whereas two binary bits can. Furthermore,
some genetic algorithms use continuous alleles, which might prove different to
a discrete system. A category system, employing ‘A’, ‘G’, ‘C’ and ‘T’ has no
ranking or intervals by which to perform arithmetic with its alleles, i.e.
nature’s system is symbolic rather than numeric, which might give rise to a
different or even restricted set of low-level effects.
GAs sometimes employ Gray encoding rather than pure binary.
This is a simple transformation, which avoids changing more than one bit at a
time when incrementing. In binary, adding 1 to 0111 gives 1000, which can be
troublesome for GAs (especially during mutation, where a single bit-flip can
amount to a large change in value, and a small change in value can require
flipping many bits). A Gray code transformation for binary would give:
Gray Sequential
000 000
001 001
011 010
010 011
110 100
111 101
101 110
100 111
Where the alleles are treated as independent bits rather than as the digits of
a number, Gray encoding should make no difference.
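
For reference, a sketch of the standard binary-reflected Gray code transform in Python (the function names are mine):

    def binary_to_gray(n):
        # Binary-reflected Gray code: consecutive integers differ in exactly one bit
        return n ^ (n >> 1)

    def gray_to_binary(g):
        # Invert the transform by XOR-ing in all higher-order bits
        b = 0
        while g:
            b ^= g
            g >>= 1
        return b

    # Reproduces the table above, e.g. sequential 011 (3) -> Gray 010 and
    # sequential 100 (4) -> Gray 110: neighbouring values differ by one bit.
    for i in range(8):
        print(format(i, '03b'), '->', format(binary_to_gray(i), '03b'))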
Frequently, GAs have only a single
chromosome. In fact, researchers often refer to chromosomes, rather than
organisms. Secondary chromosomes are occasionally used to store dormant traits
or other data not directly related to the primary problem. For the most part,
GA chromosomes are one-dimensional strings, like those in nature, although
sometimes researchers experiment with multi-dimensional arrays, trees or hash
tables to better mirror the solution space.
Although we are most used to a
diploid genome, nature does not employ diploidy exclusively, especially where
only one parent contributes genetic code. Haploid genomes may be simpler, but
they appear less powerful, given that multi-cellular organisms tend to be diploid.
The recombination operator is
usually considered to be the most important transformation when creating the
genome to be passed on. In single point recombination, a single location along
each chromosome is chosen, with the first section being contributed from one
parent and the second section from the other. Multi-point recombination is used
in nature, where the two chromosomes are interleaved with each other at
multiple locations. Early Schema Theory (Holland 1975) was based on
single-point recombination, which probably limits non-local interaction between
genes (see below).
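
The two operators can be sketched as follows in Python (the number of crossover points is chosen arbitrarily for illustration):

    import random

    def single_point(a, b):
        # One cut: the head comes from one parent, the tail from the other
        cut = random.randint(1, len(a) - 1)
        return a[:cut] + b[cut:]

    def multi_point(a, b, n_points=3):
        # Interleave the parents, switching source chromosome at every crossover point
        cuts = sorted(random.sample(range(1, len(a)), n_points))
        child, parents, source, prev = [], (a, b), 0, 0
        for cut in cuts + [len(a)]:
            child.extend(parents[source][prev:cut])
            source, prev = 1 - source, cut
        return child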
Mutation, in the ‘blind’ sense
(Hart and Kammeyer, 1994) of random variations in bits seems to arise in nature
mainly out of copying errors. Until Holland tried a simulation to test the
effects on a GA without mutation, biologists had assumed that it played a
critical role in moving through the problem space. Indeed, Yaeger recounts that
he had not even realised he had forgotten to switch mutation on for the
first weeks of his PolyWorld investigation. Currently, recombination is
believed to play the bigger role, but Anastasoff (1999) has challenged this,
arguing that an evolving, dynamic level of mutation is extremely valuable in
helping populations respond to non-static fitness functions, whereas most test
functions until recently have been static. Holland’s suggestion that ‘mutation
provides an “insurance policy” against fixation’ seems plausible too.
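
A sketch of the two kinds of mutation just discussed: ‘blind’ bit-flipping at a fixed rate, and a self-adaptive variant in the spirit of Anastasoff (1999), in which the rate rides on the chromosome itself (the particular encoding of the rate below is my own assumption):

    import random

    def blind_mutation(chrom, rate=0.001):
        # Fixed-rate 'copying errors': flip each bit independently
        return [b ^ 1 if random.random() < rate else b for b in chrom]

    def self_adaptive_mutation(chrom):
        # Assume the last 8 bits encode the organism's own mutation rate (0 to 0.1);
        # those bits are themselves subject to mutation, so the rate can evolve
        rate = int(''.join(map(str, chrom[-8:])), 2) / 2550
        return [b ^ 1 if random.random() < rate else b for b in chrom]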
Besides recombination, there are a
number of other transformations that may occur during reproduction, including
inversion, gene doubling and deletion. GAs so far have made little use of these
methods.
As mentioned above, single point
recombination reduces the possibility of non-local interaction between genes.
Indeed, this was how the Building Block hypothesis grew up: ‘good solutions
tend to be made up of good building blocks – [contiguous] combinations of bit
values that confer higher fitness on the string in which they are present’
(Mitchell, 1998). Holland (1975) introduced schemas ‘to formalise the informal
notion of “building blocks”’. A schema is like a mask that can be overlaid onto
a bit string, with each position being a defined bit (fixed to 1 or 0) or an
undefined bit (wildcards, which can be any value), e.g. 1***1 is a schema for a
5-bit string which begins and ends with a 1.
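
A schema can be checked in a couple of lines; this Python sketch treats both schema and chromosome as strings:

    def matches(schema, chrom):
        # A chromosome is an instance of a schema if it agrees on every defined bit
        return all(s == '*' or s == c for s, c in zip(schema, chrom))

    matches('1***1', '10011')   # True: begins and ends with 1
    matches('1***1', '10010')   # False: the schema fixes the last bit to 1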
In fact, the theory could be
extended to include cross-chromosomal schemas. However, it only really works
with single-point crossovers. With multiple-point crossovers, it might be
possible to have non-local interaction, e.g. the bits at the beginning and end
of a chromosome both being part of a single, geographically distributed
building block, since it would then be possible for an even number of crossovers
to have occurred in between the two points.
Natural evolution employs a host
of other, more complicated transforms, known collectively as gene regulation.
These include translocation (where chunks of genetic code are moved from one
part of the chromosome to another), introns (which are inert, and do not code
for amino acids themselves), as well as lots of genes whose activity depends on
other genes, and which may not express themselves in a particular organism,
gender or generation, but are carried down (dormant) to resurface later.
Differences between GAs and
natural evolution are probably most obvious and significant at the phenotype
level.
Maturation (Hart and Kammeyer,
1994) is the translation from the genotype to the phenotype, i.e. the process
by which the genes express themselves as an organism within the environment
with a measurable fitness function. GAs have been applied to a host of mainly
mathematical problems, where it is easiest to simply translate the genotype
space directly into the phenotype space, using the genes as parameters in a
simple algorithmic fitness function. However, I think that this lack of a
phenotypic level, an extra disconnection between the genotype and its fitness
function, may limit the power and flexibility of these GAs.
Sometimes though, the genotype is
translated into a very different phenotype, e.g. when the genotype expresses
itself as a neural network. The way this is done varies. In some cases, the GA
evolves the actual synaptic weights themselves, which seems rather strange,
though possibly faster in some circumstances. Alternatively, the genotype can
be used to set the parameters and architecture of the network, e.g.
connectivity, thresholds or activation function, number of layers, number of
hidden neurons, learning coefficient, pattern depth (i.e. from binary to
continuous-valued) etc. In this case, the GA chooses the type of network it
thinks will be most suitable for learning a problem, and leaves the network to
train itself on the data. This is much closer to biological reality, and
employs the two most powerful biological computational techniques together,
reducing the need for human intervention, and taking the best from both worlds
when navigating through the search space. Ackley’s AL world provides some
particularly good examples of learned behaviour that becomes innate. This is
known as the Baldwin effect, a non-Lamarckian theory that postulates that
learned behaviour can change the fitness function, because it rewards physical
traits that the organism has learned to take advantage of, e.g. squirrels that
learn to fly between trees benefit from the gradual evolution of webbed toes,
whereas squirrels that do not learn this behaviour gain no such benefit. After
many generations, it is as though the flying squirrels have evolved to seem
born to fly.
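
As a sketch of the second approach, where the genotype sets the architecture rather than the weights, the following Python fragment decodes a bit string into network parameters; the field boundaries and ranges are invented purely for illustration:

    def decode(chrom):
        # Hypothetical decoding: successive gene fields set architectural choices
        n_hidden      = 1 + int(''.join(map(str, chrom[0:6])), 2)            # 1-64 hidden units
        n_layers      = 1 + int(''.join(map(str, chrom[6:8])), 2)            # 1-4 hidden layers
        learning_rate = int(''.join(map(str, chrom[8:16])), 2) / 255 * 0.5   # 0-0.5
        use_sigmoid   = bool(chrom[16])                                      # activation choice
        return {'hidden': n_hidden, 'layers': n_layers,
                'lr': learning_rate, 'sigmoid': use_sigmoid}

    # The decoded network is then trained on the data as usual, and its
    # post-training performance is used as the organism's fitness.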
The logical next step really
approaches biological reality, and has been partially attempted in some
simulations. If we can evolve brains, why can’t we evolve bodies? This requires
a physical environment - the more realistic the physics the better. In the
ultimate life-like simulation, researchers could tweak the parameters of the
Earth’s environment to realistically conceive what alien physiology might be
like, and how such organisms would fare on our planet. Karl Sims’ creatures are perhaps
the most visually life-like of any a-life simulations so far, waddling and
struggling with a very real sense of urgency and intention, using mismatched
limbs in a quite impressively co-ordinated fashion to propel themselves.
This brings GAs a step closer
towards an endogenous fitness function. Natural selection is an endogenous
fitness function – living organisms effectively select themselves through a
combination of viability and fertility. GAs employ artificial, decidedly
unnatural fitness functions, since the problems that we want them to solve are
often quite unnatural. However, there are some problems that we could express
in terms of the environment, thus using a GA to solve a problem whose fitness
function is endogenous, e.g. a path-finding algorithm, where the GA phenotype
organisms have to find their way through a maze to get to the food at the end,
or where GAs have to evolve to be good at the Prisoner’s Dilemma in order to
spend enough time outside prison to procreate.
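
A sketch of how an endogenous Prisoner’s Dilemma fitness might look: fitness is whatever an organism earns from playing the rest of the population, so it changes as the population changes. The three-gene strategy representation and the round count are assumptions for illustration:

    PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
              ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

    def play(strat_a, strat_b, rounds=50):
        # Each strategy is (first move, reply to cooperation, reply to defection)
        score_a = score_b = 0
        move_a, move_b = strat_a[0], strat_b[0]
        for _ in range(rounds):
            pay_a, pay_b = PAYOFF[(move_a, move_b)]
            score_a, score_b = score_a + pay_a, score_b + pay_b
            move_a, move_b = (strat_a[1] if move_b == 'C' else strat_a[2],
                              strat_b[1] if move_a == 'C' else strat_b[2])
        return score_a

    def endogenous_fitness(organism, population):
        # Fitness is earned from interactions, not handed down by a fixed equation
        return sum(play(organism, other) for other in population if other is not organism)

    tit_for_tat   = ('C', 'C', 'D')   # cooperate first, then copy the opponent
    always_defect = ('D', 'D', 'D')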
There is a further reason for
translating into an embodied phenotype of some description. Nature uses four
extremely powerful bootstrapping methods: evolution, connectionism, ontogenetic
growth and cultural inheritance. Like real organisms, GA phenotype organisms
could grow into adulthood, learning with neural network brains to imitate and
follow social conventions. For the moment, such projects are hugely expensive,
both computationally and in programmer man-hours, and probably
overstretch current understanding in developmental physiology and ethology.
There is a last reason for wanting
to express the phenotype in embodied terms. In nature, the genotype itself is
expressed as chemicals, as part of the phenotype, which means that the genetic
mechanism itself can be expressed and coded for within the genes. This effect
of genes affecting the genetic mechanism itself can be seen in an obvious way
in the Anastasoff (1999) paper, where the mutation level is itself coded for,
making the population potentially more responsive to changes in the fitness
function. By coding for the genetic mechanism within the phenotype translation,
the problem space becomes potentially boundless, and chicken-and-egg questions
arise about how the genetic mechanism came about in the first place.
Rather than fitness being evaluated for each organism in isolation,
independent of other organisms in the environment, fitness in nature
(i.e. survival value) depends crucially on
other members of the population, members of other populations, and other
species, whether predators, prey, parasites or hosts. Interactions with other
organisms that alter the fitness function, and whose fitness function is altered
in turn, force organisms to be flexible and inventive. When subgroups of
populations are restricted from intermingling, speciation within the different
groups occurs, with each sub-species being adaptive in slightly different ways.
When individuals from one group migrate to the other, stronger genes
proliferate, giving rise to more generalised solutions. Other such interactions
can lead to symbiosis, cooperation (e.g. Axelrod, 1984) and special kinship
relations. On the other hand, Hillis’ work on sorting networks, in which host
solution organisms are constantly tested by parasite test-case organisms, is a
brilliant example of how a more global optimum can be reached when the fitness
function becomes increasingly stringent. Most GAs neglect this extra
dimension (and complication), despite its tendency to speed up the search and
to shift organisms off local optima.
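
The asymmetry in Hillis’ scheme can be sketched very simply; host_solves is a hypothetical helper standing in for ‘this sorting network sorts this test case correctly’:

    def host_solves(host, case):
        # Hypothetical predicate: treat a host as a callable candidate sorter
        return host(case) == sorted(case)

    def host_fitness(host, parasites):
        # A host (candidate sorting network) scores by how many test cases it sorts
        return sum(1 for case in parasites if host_solves(host, case))

    def parasite_fitness(parasite, hosts):
        # A parasite (test case) scores by how many hosts it manages to defeat,
        # so the test set keeps converging on the hosts' remaining weaknesses
        return sum(1 for host in hosts if not host_solves(host, parasite))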
Although it may not be a big
factor, nature employs a continuous stream of births and deaths rather than a
series of discrete, non-overlapping generations. This can be partly remedied by
employing a Plus Strategy, where successful parents may be carried over into the
next generation alongside their offspring, as opposed to the Comma Strategy,
where each generation is spawned entirely afresh from the offspring of the
selected parents of the previous generation.
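
The two strategies differ only in which pool the next generation is drawn from; a sketch (re-using the fitness function from the earlier one-max example):

    def next_generation(parents, offspring, mu, strategy='plus'):
        # 'plus' (mu + lambda): parents compete with their offspring for survival;
        # 'comma' (mu, lambda): the next generation is drawn from the offspring alone
        pool = parents + offspring if strategy == 'plus' else offspring
        return sorted(pool, key=fitness, reverse=True)[:mu]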
There are a number of remaining
ways in which GAs differ from natural evolution.
There are those who would point to
a fundamental, ineliminable difference between organisms which have naturally
evolved and GA organisms. This sounds rather like the arguments against Strong
A-life, which revolve around the contention that even if it looks like life, is
as complex as life, and is even computationally indistinguishable from
something alive, if it lives in a computer, then it just isn’t life. To them,
we can quote Steen Rasmussen’s contentious pamphlet, “Aspects of Information,
Life, Reality and Physics”, distributed at the second A-life conference and
which began like this:
1. A universal computer is indeed universal and can emulate any process (Turing)
2. The essence of life is a process (von Neumann)
3. There exist criteria by which we are able to distinguish living from non-living things

Accepting (1), (2) and (3) implies the possibility of life in a computer.
All of the premises are
contentious to some degree, but the argument is still persuasive. If life
contains a non-computable element, a universal computer (in the sense that
Turing originally imagined it) will not be able to emulate it. Most scientists
would probably agree with the second premise, though most people that we are
trying to persuade might not. Most definitions of life do characterise it in
terms of processes (nutrition, respiration, excretion, movement, growth,
stimulus response, reproduction/self-replication, transduction in general,
complexity and increased orderliness), but again, these processes need to be
computable for the first premise to apply. The third premise is hazy – although
we can rank life-ness, we have trouble placing a barrier between alive and
not-alive, e.g. viruses, which seem a grey area. In summary though, dismissing
GA organisms out of hand is not an easily justified position. The functionalist assumptions
behind neural networks and genetic algorithms hold that it's the aspects of the
underlying computation rather than the physiological instantiation that
matters, and this is what is being modelled.
In the real world, quite
unexpected things happen. The dinosaurs weren’t adapted to, or adaptable enough
for, meteor impacts, and they died out. If there were a nuclear holocaust, there
would probably just be scorpions and cockroaches left. If great robustness is
important (e.g. the need for the solution to survive terrible connections or
reception), then incorporating such huge, fitness-affecting events can weed out
all but the most robust, as well as dislodging populations from local optima.
Of course, in a way, starting the entire simulation again with different
randomised genomes is a little bit like the Flood.
When GAs are modelled on serial
computers, indefinite expansion into genotypic space is limited by computer
time. Moreover, it takes a lot of programmer time to create the very complex,
harsh environments that are necessary to develop very tough, intelligent,
creative organisms, such as ourselves.
GAs usually employ fixed
population sizes. At the end of each generation, most of the population is
culled, and a new generation of the same size springs up to replace it. In the
real world, populations grow as large as resources can sustain, and they spread
geographically into sub-species.
As mentioned, the complexity of
effects that can develop is constrained, but with most fitness functions, this
should only really affect the likelihood of finding the very best global optima.
Mitchell provides an impressive
and prescient summary of potential areas to investigate in GA research. Many
have been incorporated here. Perhaps the most important and likely to emerge
will be:
fuller phenotypes – organisms will be more than parameters entered into a fitness function equation; as a-life techniques improve, our ability to integrate GAs, NNs and real physics simulations will improve, making GAs more commercially attractive (e.g. in the form of toys or web-based intelligent agents);
more complex genotype transformations and interactions – the mapping of the human genome may help us understand better what is happening at this level; and
increased complexity of genomes, environments, populations, numbers of organisms and generations etc. – with Moore’s Law, we can expect the size and difficulty of problem space that our GAs can traverse in reasonable time to continue to increase (especially with the huge increase in distributed, peer-to-peer computing that seems certain).
Notably, I think these future
advances will bring GAs closer to natural evolution, not break away from it.
At the moment though, the major
differences between GAs and natural evolution lie in the simplifications and
tweakings of most GA simulations. Functionally, GAs and natural evolution are
very similar. Given a computer the size of the Earth, I cannot see any
fundamental reason why we couldn’t construct a GA on the scale of a planet. Of
course, that is not to say that we could accurately reproduce or predict the
actual evolution on the Earth (since we would still be restricted by our
measurement of the initial conditions, our models of physics, and presumably,
by quantum mechanics). But we would expect very evolutionarily real effects to
emerge on a global scale, and if we were to kick-start single-cellular life, we
might expect (over a period of perhaps billions of years, and given a complex,
suitable environment) at least human-level intelligence to emerge.