An investigation of genetic algorithms and genetic programming

Date

1996-06

Type

Thesis

Abstract

Many regression techniques try to fit known models to sets of data. With these techniques we assume the functional form of the model and use analytic or statistical methods to find the values of any unknown model parameters. If the target model is thought to be sufficiently complex, these methods may fail to provide the desired results and alternatives have to be used. This is even more important if the underlying model is itself unknown. Genetic algorithms and genetic programming are two techniques that may help in the search for suitable models. Unfortunately, both techniques themselves have parameters that need to be specified, and there are no clear guidelines to aid such choices. A number of other implementation issues are also open questions, and in this thesis we look at a number of ways of implementing genetic algorithms and genetic programs in order to evaluate the alternatives. Simple target models are used throughout most of this work so that the effects of changes to the method's parameters can be monitored. We look at how population size, crossover probability and mutation rate affect the speed of convergence of the genetic algorithm to an acceptable model. One of the most difficult aspects of genetic programming is the meaning of the offspring produced by crossover or mutation. Some systems remove from the population any offspring that have no meaning. Others ensure that no such offspring can arise. In this work we look at what might happen if we always impose a meaning on all possible offspring. In the genetic programming part of this work we look at two representations of our models: the first is a fixed-length representation, whilst the second uses a tree to represent each member of the population. We also look at a number of fitness functions. The commonest such functions are based upon errors between the model and the data. For our fitness functions we also use the correlation coefficient between the model and the data. We found that a strategy that starts by using the correlation coefficient alone and then switches to a fitness that combines both correlation coefficient and error worked better.
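
The staged fitness strategy described in the abstract could be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not say how correlation and error are combined or when the switch happens, so the function names, the use of squared Pearson correlation, the mean squared error term, the `switch_generation` threshold and the `error_weight` mix are all hypothetical choices made for this example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # A constant sequence has no defined correlation; treat it as 0.
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mean_squared_error(predicted, observed):
    """Average squared difference between model output and data."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def staged_fitness(predicted, observed, generation,
                   switch_generation=20, error_weight=0.5):
    """Fitness in [0, 1]; higher is better.

    Assumption: before `switch_generation` only the squared correlation
    is used; afterwards it is mixed with an error term 1 / (1 + MSE).
    Both the threshold and the weighting are illustrative, not taken
    from the thesis.
    """
    correlation_score = pearson_r(predicted, observed) ** 2
    if generation < switch_generation:
        return correlation_score
    error_score = 1.0 / (1.0 + mean_squared_error(predicted, observed))
    return (1.0 - error_weight) * correlation_score + error_weight * error_score


# Hypothetical data: the candidate model has the right shape (perfectly
# correlated with the data) but the wrong scale and offset.
observed = [0.0, 1.0, 2.0, 3.0]
candidate = [1.0, 3.0, 5.0, 7.0]

early = staged_fitness(candidate, observed, generation=0)
late = staged_fitness(candidate, observed, generation=50)
exact = staged_fitness(observed, observed, generation=50)
```

In this sketch the early, correlation-only stage rewards a candidate that has the right shape even when its scale and offset are wrong, while the later combined stage also penalises the remaining error, so `late` is lower than `exact` for the mis-scaled candidate.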
