Doctoral Degrees (Computer Science)
Permanent URI for this collection: https://hdl.handle.net/10413/7113
Browsing Doctoral Degrees (Computer Science) by SDG "SDG4"
Item Deep learning for brain tumor segmentation and survival prediction. (2024) Magadza, Tirivangani Batanai Hendrix Takura; Viriri, Serestina.
A brain tumor is an abnormal growth of cells in the brain that multiply uncontrollably. Deaths due to brain tumors have increased over the past few decades. Early diagnosis of brain tumors is essential to improving treatment possibilities and increasing the survival rate of patients. The life expectancy of patients with glioblastoma multiforme (GBM), the most malignant glioma, under the current standard of care is, on average, 14 months after diagnosis despite aggressive surgery, radiation, and chemotherapy. Despite considerable efforts in brain tumor segmentation research, patient diagnosis remains poor. Accurate segmentation of pathological regions may significantly impact treatment decisions, planning, and outcome monitoring. However, the large spatial and structural variability among brain tumors makes automatic segmentation an open challenge that warrants further research. While several methods automatically segment brain tumors, deep learning methods are becoming widespread in medical imaging due to their resounding performance. However, the boost in performance comes at the cost of high computational complexity. Therefore, to improve the adoption rate of computer-assisted diagnosis in clinical setups, especially in developing countries, there is a need for models that are more efficient in computation and memory. In this research, using few computational resources, we explore various techniques to develop deep learning models that accurately segment the different glioma sub-regions, namely the enhancing tumor, the tumor core, and the whole tumor.
We quantitatively evaluate the performance of our proposed models against state-of-the-art methods using magnetic resonance imaging (MRI) datasets provided by the Brain Tumor Segmentation (BraTS) Challenge. Lastly, we use segmentation labels produced by the segmentation task and MRI multimodal data to extract appropriate imaging/radiomic features to train a deep learning model for overall patient survival prediction.

Item Deep learning framework for speech emotion classification. (2024) Akinpelu, Samson Adebisi; Viriri, Serestina.
A robust deep learning-based approach for the recognition and classification of speech emotion is proposed in this research work. Emotion recognition and classification occupy a conspicuous position in human-computer interaction (HCI) and, by extension, help determine the reasons and justification for human action. Emotion plays a critical role in decision-making as well. Distinguishing among the various emotions (angry, sad, happy, neutral, disgust, fear, and surprise) present in speech signals has, however, been a long-term challenge. Existing deep learning techniques are limited by the complexity of features in human speech (sequential data); the limitations include insufficient labelled datasets, noise and environmental factors, cross-cultural and linguistic differences, speaker variability, and temporal dynamics. There is also a heavy reliance on extensive parameter tuning, often across millions of parameters, before the model can learn the emotional features necessary for emotion classification, which often results in computational complexity, over-fitting, and poor generalization. This thesis presents an innovative deep learning framework for the recognition and classification of speech emotions. The deep learning techniques currently in use for speech emotion classification are exhaustively and analytically reviewed in this thesis.
This research models various approaches and architectures based on deep learning to build a framework that is dependable and efficient for classifying emotions from speech signals. This research proposes a deep transfer learning model that addresses the shortcomings of inadequate training datasets for the classification of speech emotions. The research also models advanced deep transfer learning in conjunction with a feature selection algorithm to obtain more accurate classification results. Speech emotion classification is further enhanced by combining regularized feature selection (RFS) techniques with attention-based networks, yielding a significant improvement in the emotion recognition results. The problem of misclassification of emotion is alleviated through the selection of salient features that are relevant to emotion classification from speech signals. By combining regularized feature selection with attention-based mechanisms, the model can better understand emotional complexities and outperform conventional machine learning emotion detection algorithms. The proposed approach is very resilient to background noise and cultural differences, which makes it suitable for real-world applications. Having investigated the reasons behind the enormous computing resources required by many deep learning-based methods, the research proposes a lightweight deep learning approach that can be deployed on low-memory devices for speech emotion classification. A redesigned VGGNet with an overall model size of 7.94 MB is utilized, combined with the best-performing classifier (Random Forest). Extensive experiments and comparisons with other deep learning models (DenseNet, MobileNet, InceptionNet, and ResNet) over three publicly available speech emotion datasets show that the proposed lightweight model improves the performance of emotion classification with minimal parameter size.
The research further devises a new method that minimizes computational complexity using a vision transformer (ViT) network for speech emotion classification. The mel-spectrogram input is fed into the ViT model, which captures spatial dependencies and high-level features from speech signals that are suitable indicators of emotional states. Finally, the research proposes a novel transformer model based on shifted windows for efficient classification of speech emotion on bilingual datasets. Because this method promotes feature reuse, it needs fewer parameters and works well with smaller datasets. The proposed model was evaluated using over 3000 speech emotion samples from the publicly available TESS, EMODB, EMOVO, and bilingual TESS-EMOVO datasets. The results showed 98.0%, 98.7%, and 97.0% accuracy, F1-score, and precision, respectively, across the 7 classes of emotion.

Item Exploration of ear biometrics with deep learning. (2024) Booysens, Aimee Anne; Viriri, Serestina.
Biometrics is the recognition of a person using biometric characteristics, which may be physiological or behavioural, for identification. Numerous models have been proposed to distinguish biometric traits used in multiple applications, such as forensic investigations and security systems. With the COVID-19 pandemic, facial recognition systems failed due to users wearing masks; however, human ear recognition proved more suitable because the ear remains visible. This thesis explores efficient deep learning-based models for accurate ear biometrics recognition. The ears were extracted and identified from 2D profiles and facial images, focusing on both left and right ears. Numerous datasets were used, in particular the BEAR, EarVN1.0, IIT, ITWE and AWE databases.
Many machine learning techniques were explored, such as Naïve Bayes, Decision Tree, and K-Nearest Neighbor, along with innovative deep learning techniques: Transformer Network Architecture, Lightweight Deep Learning with Model Compression, and EfficientNet. The experimental results showed that the Transformer Network achieved a high accuracy of 92.60% and 92.56% at 50 and 90 epochs, respectively. The proposed ReducedFireNet Model reduces the input size and increases computation time, but it detects more robust ear features. The EfficientNet variant B8 achieved a classification accuracy of 98.45%. These results surpass those reported in other works, the highest of which is 98.00%. The overall results showed that deep learning models can improve ear biometrics recognition when both ears are computed.

Item Forest image classification based on deep learning and ontologies. (2024) Kwenda, Clopas; Gwetu, Mandlenkosi Victor; Fonou-Dombeu, Jean Vincent.
Forests contribute abundantly to natural resources and provide a wide range of environmental, socio-cultural, and economic benefits. Classifications of forest vegetation offer a practical method for categorising information about patterns of forest vegetation. This information is required to successfully plan for land use, map landscapes, and preserve natural habitats. Remote sensing technology has provided high spatio-temporal resolution images with many spectral bands, which make conducting research in forestry easier. In that regard, artificial intelligence technologies are used to assess forest damage. The field of remote sensing research is constantly adapting to leverage newly developed computational algorithms and increased computing power. Both the theory and the practice of remote sensing have significantly changed as a result of recent technological advancements, such as the creation of new sensors and improvements in data accessibility.
Data-driven methods, including supervised classifiers (such as Random Forests) and deep learning classifiers, are gaining much importance in processing big Earth observation data due to their accuracy in creating observable images. Though deep learning models produce satisfactory results, researchers find it difficult to understand how they make predictions because their complicated network structures make them black boxes. Moreover, when inductive inference from data learning is taken into consideration, data-driven methods are less efficient in working with symbolic information. In data-driven techniques, the specialized knowledge that environmental scientists use to evaluate images obtained through remote sensing is typically disregarded. This limitation presents a significant obstacle for end users of Earth observation applications who are accustomed to working with symbolic information, such as ecologists, agronomists, and other related professionals. This study advocates for the incorporation of ontologies in forest image classification owing to their ability to represent domain expert knowledge. The future of remote sensing science should be supported by knowledge representation techniques such as ontologies. The study presents a methodological framework that integrates deep learning techniques and ontologies with the aim of enhancing domain expert confidence as well as increasing the accuracy of forest image classification. In addressing this challenge, the study followed these systematic steps: (i) a critical review of existing methods for forest image classification; (ii) a critical analysis of appropriate methods for forest image classification; (iii) development of a state-of-the-art model for forest image segmentation; (iv) design of a hybrid deep learning and machine learning model for forest image classification; and (v) a state-of-the-art ontological framework for forest image classification.
The ontological framework was flexible enough to capture the expression of domain expert knowledge. The ontological state-of-the-art model performed well, achieving a classification accuracy of 96% with a Root Mean Square Error of 0.532. The model can also be used in the fruit industry and supermarkets to classify fruits into their respective categories. It can also potentially be used to classify trees with respect to their species. As a way of enhancing domain experts' confidence in deep learning models, the study recommended the adoption of explainable artificial intelligence (XAI) methods because they unpack the process by which deep learning models reach their decisions. The study also recommended the adoption of high-resolution networks (HRNets) as an alternative to traditional deep learning models, because they can convert low-resolution representations to high-resolution ones, have efficient block structures developed according to new standards, and excel at feature extraction.

Item Pancreatic cancer survival prediction using Deep learning techniques. (2023) Bakasa, Wilson; Viriri, Serestina.
Abstract available in PDF.