Show simple item record

dc.contributor.advisorAdewumi, Aderemi Oluyinka.
dc.creatorAndronicus, Ayobami Akinyelu.
dc.date.accessioned2019-02-21T13:06:31Z
dc.date.available2019-02-21T13:06:31Z
dc.date.created2014
dc.date.issued2014
dc.identifier.urihttp://hdl.handle.net/10413/16150
dc.descriptionMaster of Science in Computer Science. University of KwaZulu-Natal, Durban, 2014.en_US
dc.description.abstractElectronic fraud is one of the major challenges faced by the vast majority of online internet users today. Curbing this menace is not an easy task, primarily because of the rapid rate at which fraudsters change their mode of attack. Many techniques have been proposed in the academic literature to handle e-fraud. Some of them include: blacklist, whitelist, and machine learning (ML) based techniques. Among all these techniques, ML-based techniques have proven to be the most efficient, because of their ability to detect new fraudulent attacks as they appear.There are three commonly perpetrated electronic frauds, namely: email spam, phishing and network intrusion. Among these three, more financial loss has been incurred owing to phishing attacks. This research investigates and reports the use of MLand Nature Inspired technique in the domain of phishing detection, with the foremost objective of developing a dynamic and robust phishing email classifier with improved classification accuracy and reduced processing time.Two approaches to phishing email detection are proposed, and two email classifiers are developed based on the proposed approaches. In the first approach, a random forest algorithm is used to construct decision trees,which are,in turn,used for email classification. The second approach introduced a novel MLmethod that hybridizes firefly algorithm (FFA) and support vector machine (SVM). The hybridized method consists of three major stages: feature extraction phase, hyper-parameter selection phase and email classification phase. In the feature extraction phase, the feature vectors of all the features described in Section 3.6 are extracted and saved in a file for easy access.In the second stage, a novel hyper-parameter search algorithm, developed in this research, is used to generate exponentially growing sequence of paired C and Gamma (γ) values. FFA is then used to optimize the generated SVM hyper-parameters and to also find the best hyper-parameter pair. Finally, in the third phase, SVM is used to carry out the classification. This new approach addresses the problem of hyper-parameter optimization in SVM, and in turn, improves the classification speed and accuracy of SVM. Using two publicly available email datasets, some experiments are performed to evaluate the performance of the two proposed phishing email detection techniques. During the evaluation of each approach, a set of features (well suited for phishing detection) are extracted from the training dataset and used to constructthe classifiers. Thereafter, the trained classifiers are evaluated on the test dataset. The evaluations produced very good results. The RF-based classifier yielded a classification accuracy of 99.70%, a FP rate of 0.06% and a FN rate of 2.50%. Also, the hybridized classifier (known as FFA_SVM) produced a classification accuracy of 99.99%, a FP rate of 0.01% and a FN rate of 0.00%.en_US
dc.language.isoen_ZAen_US
dc.subjectTheses Computer science.en_US
dc.subject.otherPhising.en_US
dc.subject.otherEmail phising detection.en_US
dc.subject.otherMachine learning algorithms.en_US
dc.titleImproved techniques for phishing email detection based on random forest and firefly-based support vector machine learning algorithms.en_US
dc.typeThesisen_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record