Application of artificial intelligence for detecting derived viruses.
Asiru, Omotayo Fausat.
MetadataShow full item record
A lot of new viruses are being created each and every day. However, some of these viruses are not completely new per se. Most of the supposedly ‘new’ viruses are not necessarily created from scratch with completely new (something novel that has never been seen before) mechanisms. For example, some of these viruses just change their forms and come up with new signatures to avoid detection. Hence, such viruses cannot be argued to be new. This research refers to such as derived viruses. Just like new viruses, we argue that derived viruses are hard to detect with current scanning-detection methods. Many virus detection methods exist in the literature, but very few address the detection of derived viruses. Hence, the ultimate research question that this study aims to answer is; how might we improve the detection rate of derived computer viruses? The proposed system integrates a mutation engine together with a neural network to detect derived viruses. Derived viruses come from existing viruses that change their forms. They do so by adding some irrelevant instructions that will not alter the intended purpose of the virus. A mutation engine is used to group existing virus signatures based on their similarities. The engine then creates derivatives of groups of signatures. This is done up until the third generation (of mutations). The existing virus signatures and the created derivatives are both used to train the neural network. The derived signatures that are not used for the training are used to determine the effectiveness of the neural network. Ten experiments were conducted on each of the three derived virus generations. The first generation showed the highest derived virus detection rate compared to the other two generations. The second generation also showed a slightly higher detection rate than the third generation which has the least detection rate. Experimental results show that the proposed model can detect derived viruses with an average accuracy detection rate of 80% (This includes a 91% success rate on first generation, 83% success rate on second generation and 65% success rate on third generation). The results further show that the correlation between the original virus signature and its derivatives decreases with the generations. This means that after many generations of a virus changing form, its variants will no longer look like the original. Instead the variants look like a completely new virus even though the variants and the original virus will always have the same behaviour and operational characteristics with similar effects.