03-07-2014, 01:25 PM
GENETIC PROGRAMMING
GENETIC PROGRAMMING.pdf (Size: 1.35 MB / Downloads: 15)
INTRODUCTION
The goal of getting computers to automatically solve problems is central
to artificial intelligence, machine learning, and the broad area encompassed
by what Turing called “machine intelligence” (Turing, 1948,
1950). In his talk entitled AI: Where It Has Been and Where It Is Going,
machine learning pioneer Arthur Samuel stated the main goal of the
fields of machine learning and artificial intelligence:
[T]he aim [is] . . . to get machines to exhibit behavior, which if done by
humans, would be assumed to involve the use of intelligence. (Samuel,
1983)
Genetic programming is a systematic method for getting computers
to automatically solve a problem starting from a high-level statement of
what needs to be done. Genetic programming is a domain-independent
method that genetically breeds a population of computer programs to
solve a problem. Specifically, genetic programming iteratively transforms
a population of computer programs into a new generation of programs
by applying analogs of naturally occurring genetic operations.
This process is illustrated in Fig. 5.1.
The genetic operations include crossover (sexual recombination), mutation,
reproduction, gene duplication, and gene deletion. Analogs of
developmental processes are sometimes used to transform an embryo
into a fully developed structure. Genetic programming is an extension
PREPARATORY STEPS OF GENETIC PROGRAMMING
Genetic programming starts from a high-level statement of the requirements
of a problem and attempts to produce a computer program
that solves the problem.
The human user communicates the high-level statement of the problem
to the genetic programming algorithm by performing certain welldefined
preparatory steps.
The five major preparatory steps for the basic version of genetic programming
require the human user to specify
1 the set of terminals (e.g., the independent variables of the problem,
zero-argument functions, and random constants) for each branch
of the to-be-evolved program,
2 the set of primitive functions for each branch of the to-be-evolved
program,
3 the fitness measure (for explicitly or implicitly measuring the fitness
of individuals in the population),
4 certain parameters for controlling the run, and
5 the termination criterion and method for designating the result of
the run.
The first two preparatory steps specify the ingredients that are available
to create the computer programs. A run of genetic programming
Automatically Defined Functions
Human computer programmers organize sequences of reusable steps
into subroutines. They then repeatedly invoke the subroutines—typically
with different instantiations of the subroutine’s dummy variables (formal
parameters). Reuse eliminates the need to “reinvent the wheel” on
each occasion when a particular sequence of steps may be useful. Reuse
makes it possible to exploit a problem’s modularities, symmetries, and
regularities (and thereby potentially accelerate the problem-solving process).
Programmers commonly organize their subroutines into hierarchies.
The automatically defined function (ADF) is one of the mechanisms
by which genetic programming implements the parametrized reuse and
hierarchical invocation of evolved code. Each ADF resides in a separate
function-defining branch within the overall multi-part computer program
(see Fig. 5.3). When ADFs are being used, a program consists of one
(or more) function-defining branches (i.e. ADFs) as well as one or more
main result-producing branches. An ADF may possess zero, one, or more
dummy variables (formal parameters). The body of an ADF contains its
Automatically Defined Iterations, Loops, Recursions and Stores
Automatically defined iterations, automatically defined loops, and automatically
defined recursions provide means (in addition to ADFs) to
reuse code.
Automatically defined stores provide means to reuse the result of executing
code.
Automatically defined iterations, automatically defined loops, automatically
defined recursions, and automatically defined stores are described
in Genetic Programming III: Darwinian Invention and Problem
Solving (Koza et al., 1999a
Genetic Programming Problem Solver
The Genetic Programming Problem Solver (GPPS) is described in
Koza et al. (1999a, Part 4).
If GPPS is being used, the user is relieved of performing the first
and second preparatory steps (concerning the choice of the terminal set
and the function set). The function set for GPPS consists of the four
basic arithmetic functions (addition, subtraction, multiplication, and
division) and a conditional operator (i.e. functions found in virtually
every general-purpose digital computer that has ever been built). The
terminal set for GPPS consists of numerical constants and a set of input
terminals that are presented in the form of a vector.
By employing this generic function set and terminal set, GPPS reduces
the number of preparatory steps from five to three.
GPPS relies on the architecture-altering operations to dynamically
create, duplicate, and delete subroutines and loops during the run of genetic
programming. Additionally, in version 2.0 of GPPS, the architecturealtering
operations are used to dynamically create, duplicate, and delete
recursions and internal storage. Because the architecture of the evolving
program is automatically determined during the run, GPPS eliminates
the need for the user to specify in advance whether to employ subroutines,
loops, recursions, and internal storage in solving a given problem
Developmental Genetic Programming
Developmental genetic programming is used for problems of synthesizing
analog electrical circuits, as described in Part 5 of Koza et al.
(1999a). When developmental genetic programming is used, a complex
structure (such as an electrical circuit) is created from a simple initial
structure (the embryo).
HUMAN-COMPETITIVE RESULTS PRODUCED BY GENETIC PROGRAMMING
Samuel’s statement (quoted above) reflects the goal articulated by the
pioneers of the 1950s in the fields of artificial intelligence and machine
learning, namely to use computers to automatically produce human-like
results. Indeed, getting machines to produce human-like results is the
reason for the existence of the fields of artificial intelligence and machine
learning.
To make the notion of human-competitiveness more concrete, we say
that a result is “human-competitive” if it satisfies one or more of the
eight criteria in Table 5.1.
As can seen from Table 5.1, the eight criteria have the desirable attribute
of being at arms-length from the fields of artificial intelligence,
machine learning, and genetic programming. That is, a result cannot
acquire the rating of “human competitive” merely because it is endorsed
by researchers inside the specialized fields that are attempting to create
machine intelligence. Instead, a result produced by an automated
method must earn the rating of “human competitive” independent of
the fact that it was generated by an automated method.
Table 5.2 lists the 36 human-competitive instances (of which we are
aware) where genetic programming has produced human-competitive results.
Each entry in the table is accompanied by the criteria (from Table
5.1) that establish the basis for the claim of human-competitiveness.
There are now 23 instances where genetic programming has duplicated
the functionality of a previously patented invention, infringed a previously
patented invention, or created a patentable new invention (see criterion
A in Table 5.1). Specifically, there are 15 instances where genetic
programming has created an entity that either infringes or duplicates
the functionality of a previously patented twentieth-century invention,
six instances where genetic programming has done the same with respect
to an invention patented after January 1, 2000, and two instances
where genetic programming has created a patentable new invention. The
two new inventions are general-purpose controllers that outperform controllers
employing tuning rules that have been in widespread use in industry
for most of the twentieth century.
GENETIC PROGRAMMING THEORY
Genetic programming is a search technique that explores the space of
computer programs. As discussed above, the search for solutions to a
problem starts from a group of points (random programs) in this search
space. Those points that are of above average quality are then used to
generate a new generation of points through crossover, mutation, reproduction
and possibly other genetic operations. This process is repeated
over and over again until a termination criterion is satisfied.
If we could visualize this search, we would often find that initially the
population looks a bit like a cloud of randomly scattered points, but
that, generation after generation, this cloud changes shape and moves in
the search space following a well defined trajectory. Because genetic programming
is a stochastic search technique, in different runs we would
observe different trajectories. These, however, would very likely show
very clear regularities to our eye that could provide us with a deep understanding
of how the algorithm is searching the program space for the
CONCLUSIONS
In his seminal 1948 paper entitled Intelligent Machinery, Turing identified
three ways by which human-competitive machine intelligence might
be achieved. In connection with one of those ways, Turing (1948) said:
There is the genetical or evolutionary search by which a combination of
genes is looked for, the criterion being the survival value.
Turing did not specify how to conduct the “genetical or evolutionary
search” for machine intelligence. In particular, he did not mention
the idea of a population-based parallel search in conjunction with sexual
recombination (crossover) as described in John Holland’s 1975 book
Adaptation in Natural and Artificial Systems. However, in his 1950 paper
Computing Machinery and Intelligence, Turing did point out that
We cannot expect to find a good child-machine at the first attempt.
One must experiment with teaching one such machine and see how well
it learns. One can then try another and see if it is better or worse.
There is an obvious connection between this process and evolution, by
the identifications. . .
GENETIC PROGRAMMING xxxiii
Structure of the child machine = Hereditary material
Changes of the child machine = Mutations
Natural selection = Judgment of the experimenter.
That is, Turing perceived in 1948 and 1950 that one possibly productive
approach to machine intelligence would involve an evolutionary
process in which a description of a computer program (the hereditary
material) undergoes progressive modification (mutation) under the guidance
of natural selection (i.e. selective pressure in the form of what we
now call “fitness”).
Today, many decades later, we can see that indeed Turing was right.
Genetic programming has started fulfilling Turing’s dream by providing
us with a systematic method, based on Darwinian evolution, for getting
computers to automatically solve hard real-life problems. To do so, it
simply requires a high-level statement of what needs to be done (and
enough computing power).
Turing also understood the need to evaluate objectively the behavior
exhibited by machines, to avoid human biases when assessing their
intelligence. This led him to propose an imitation game, now know as
the Turing test for machine intelligence, whose goals are wonderfully
summarized by Arthur Samuel’s position statement quoted in the introduction
to this chapter.
At present, genetic programming is certainly not in a position to produce
computer programs that would pass the full Turing test for machine
intelligence, and it might not be ready for this immense task for centuries.
Nonetheless, thanks to the constant technological improvements
in genetic programming technology, in its theoretical foundations and
in computing power, genetic programming has been able to solve tens
of difficult problems with human-competitive results (see Table 5.2) in
the recent past. These are a small step towards fulfilling Turing and
Samuel’s dreams, but they are also early signs of things to come. It is,
indeed, arguable that in a few years’ time genetic programming will be
able to routinely and competently solve important problems for us in a
variety of specific domains of application, even when running on a single
personal computer, thereby becoming an essential collaborator for many
human activities. This, we believe, will be a remarkable step forward
towards achieving true, human-competitive machine intelligence.