Data classification using genetic programming.

Dufourq, Emmanuel.

dc.contributor.advisor	Pillay, Nelishia.
dc.contributor.author	Dufourq, Emmanuel.
dc.date.accessioned	2016-09-26T07:43:40Z
dc.date.available	2016-09-26T07:43:40Z
dc.date.created	2015
dc.date.issued	2015
dc.description	Master of Science in Computer Science.	en_US
dc.description.abstract	Genetic programming (GP), a field of artificial intelligence, is an evolutionary algorithm that evolves a population of trees representing programs. These programs are used to solve problems. This dissertation investigates the use of genetic programming for data classification. In machine learning, data classification is the process of allocating a class label to an instance of data. A classifier is created in order to perform these allocations. Several studies have investigated the use of GP to solve data classification problems. These studies have shown that GP is able to create classifiers with high classification accuracies. However, certain aspects have not previously been investigated. Five areas were investigated in this dissertation. The first was an investigation into how discretisation could be incorporated into a GP algorithm. An adaptive discretisation algorithm was proposed, and it outperformed certain existing methods. The second was a comparison of GP representations for binary data classification. The findings indicated that, of the representations examined (arithmetic trees, decision trees, and logical trees), decision trees performed best. The third was an investigation into the use of the encapsulation genetic operator and its effect on data classification. The findings revealed that an improvement in both training and test results was achieved when encapsulation was incorporated. The fourth was an analysis of several hybridisations of a GP algorithm with a genetic algorithm in order to evolve a population of ensembles. Four methods were proposed, and these methods outperformed certain existing GP and ensemble methods. Finally, the fifth area was an investigation into an ensemble construction method for classification. In this approach, GP evolved a single ensemble. The proposed method resulted in an improvement in training and test accuracy when compared to the standard GP algorithm.
The methods proposed in this dissertation were tested on publicly available data sets, and the results were statistically tested in order to determine the effectiveness of the proposed approaches.	en_US
dc.identifier.uri	http://hdl.handle.net/10413/13386
dc.language.iso	en_ZA	en_US
dc.subject	Big data--Classification.	en_US
dc.subject	Genetic programming (Computer science)	en_US
dc.subject	Theses--Computer science.	en_US
dc.title	Data classification using genetic programming.	en_US

Files

Original bundle

Name: Dufourq_Emmanuel_2015.pdf
Size: 1.94 MB
Format: Adobe Portable Document Format
Description: Thesis

Collections

Masters Degrees (Computer Science)