
Data classification using genetic programming.

dc.contributor.advisor: Pillay, Nelishia.
dc.contributor.author: Dufourq, Emmanuel.
dc.date.accessioned: 2016-09-26T07:43:40Z
dc.date.available: 2016-09-26T07:43:40Z
dc.date.created: 2015
dc.date.issued: 2015
dc.description: Master of Science in Computer Science. [en_US]
dc.description.abstract: Genetic programming (GP), a field of artificial intelligence, is an evolutionary algorithm which evolves a population of trees which represent programs. These programs are used to solve problems. This dissertation investigates the use of genetic programming for data classification. In machine learning, data classification is the process of allocating a class label to an instance of data. A classifier is created in order to perform these allocations. Several studies have investigated the use of GP to solve data classification problems. These studies have shown that GP is able to create classifiers with high classification accuracies. However, there are certain aspects which have not previously been investigated. Five areas were investigated in this dissertation. The first was an investigation into how discretisation could be incorporated into a GP algorithm. An adaptive discretisation algorithm was proposed, and outperformed certain existing methods. The second was a comparison of GP representations for binary data classification. The findings indicated that from the representations examined (arithmetic trees, decision trees, and logical trees), the decision trees performed the best. The third was to investigate the use of the encapsulation genetic operator and its effect on data classification. The findings revealed that an improvement in both training and test results was achieved when encapsulation was incorporated. The fourth was an investigative analysis of several hybridisations of a GP algorithm with a genetic algorithm in order to evolve a population of ensembles. Four methods were proposed and these methods outperformed certain existing GP and ensemble methods. Finally, the fifth area was to investigate an ensemble construction method for classification. In this approach GP evolved a single ensemble. The proposed method resulted in an improvement in training and test accuracy when compared to the standard GP algorithm.
The methods proposed in this dissertation were tested on publicly available data sets, and the results were statistically tested in order to determine the effectiveness of the proposed approaches. [en_US]
dc.identifier.uri: http://hdl.handle.net/10413/13386
dc.language.iso: en_ZA [en_US]
dc.subject: Big data--Classification. [en_US]
dc.subject: Genetic programming (Computer science) [en_US]
dc.subject: Theses--Computer science. [en_US]
dc.title: Data classification using genetic programming. [en_US]

Files

Original bundle

Name: Dufourq_Emmanuel_2015.pdf
Size: 1.94 MB
Format: Adobe Portable Document Format
Description: Thesis

License bundle

Name: license.txt
Size: 1.64 KB
Format: Item-specific license agreed upon to submission
Description: