Multi-level parallelization for accurate and fast medical image retrieval.

Chikamai, Keith Sasala.

Files

Chikamai_Keith_S_2016.pdf (1.62 MB)

Date

2016

Authors

Chikamai, Keith Sasala.

Abstract

Breast cancer is the most prevalent form of cancer diagnosed in women. Mammograms offer the best option for detecting the disease early, which allows early treatment and, by implication, a favorable prognosis. Content-based Medical Image Retrieval (CBMIR) is gaining research attention as a Computer Aided Diagnosis (CAD) approach for breast cancer diagnosis. Such systems retrieve mammogram images that are pathologically similar to a given query example, and these images support the diagnostic decision by serving as references. In most cases, the query is of the form “return k images similar to the specified query image”. Similarity in the Content-based Image Retrieval (CBIR) context is based on the content of images rather than on text or keywords. The essence of CBIR systems is to enable indexing of pictorial content in databases and to eliminate the drawbacks of manual annotation. CBMIR is a relatively young technology that is yet to gain widespread use. One major challenge for CBMIR systems is bridging the “semantic gap” in the description of image content, that is, the discord between the notion of similarity held by humans and the one computed by CBMIR systems. Concerns about low accuracy inhibit the full adoption of CBMIR systems into regular practice, and research has focused on improving their accuracy. Nonetheless, the area remains an open problem. As a contribution towards improving the accuracy of CBMIR for mammogram images, this work proposes a novel feature modeling technique for CBMIR systems based on classifier scores and standard statistical calculations on those scores. A set of gradient-based filters is first used to highlight possible calcification objects; an entropy-based thresholding technique is then used to segment the calcifications from the background.
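The abstract does not say which entropy-based thresholding method is used. As an illustration only, the sketch below assumes Kapur's maximum-entropy criterion, which picks the grey level that maximizes the summed entropies of the background and foreground classes. The function names `kapur_threshold` and `segment`, and the toy 16-level histogram, are hypothetical and not taken from the thesis.

```python
import math


def kapur_threshold(hist):
    """Return the grey level t that maximizes H(background) + H(foreground).

    Assumption: Kapur's maximum-entropy criterion stands in for the
    unspecified entropy-based technique mentioned in the abstract.
    Background is bins [0..t], foreground is bins [t+1..].
    """
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")
    probs = [h / total for h in hist]

    best_t, best_score = 0, float("-inf")
    for t in range(len(probs) - 1):
        w0 = sum(probs[: t + 1])      # probability mass of the background
        w1 = sum(probs[t + 1:])       # probability mass of the foreground
        if w0 <= 0.0 or w1 <= 0.0:
            continue                  # one class is empty, skip this split
        h0 = -sum((p / w0) * math.log(p / w0) for p in probs[: t + 1] if p > 0)
        h1 = -sum((p / w1) * math.log(p / w1) for p in probs[t + 1:] if p > 0)
        if h0 + h1 > best_score:
            best_t, best_score = t, h0 + h1
    return best_t


def segment(pixels, threshold):
    """Binary mask: 1 where the filter response exceeds the threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in pixels]


# Toy filter response with 16 grey levels: dark background, bright spots.
toy_hist = [10, 10, 10, 10, 0, 0, 0, 0, 0, 0, 0, 0, 5, 5, 5, 5]
toy_image = [[0, 1, 14], [2, 15, 3]]
t = kapur_threshold(toy_hist)
mask = segment(toy_image, t)
```

In practice the histogram would come from the combined filter response rather than from a hand-written list, and the resulting mask would be passed on to feature extraction.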
Experimental results show that the proposed model achieves a 100% detection rate, which demonstrates the effectiveness of combining the likelihood maps from various filters in detecting calcification objects. Feature extraction considers established textural and geometric features, which are calculated from the detected calcification objects; these are then used to generate secondary features using Support Vector Machine and Quadratic Discriminant Analysis classifiers. The model is validated through a range of benchmarks, and is shown to perform competitively in comparison to similar works. Specifically, it scores 95%, 82%, 78%, and 98% on the accuracy, positive predictive value, sensitivity, and specificity benchmarks, respectively. Parallel computing is applied to the task of feature extraction to show its viability in reducing the cost of extracting features. This research considers two technologies for implementation: distributed computing using the Message Passing Interface (MPI) and multicore computing using OpenMP threads. Both technologies divide tasks so that the computational burden is shared and the overall time cost is reduced. Communication cost is one penalty incurred by parallel systems and a significant design target where the efficiency of parallel models is concerned. This research focuses on mitigating the communication overhead to increase the efficacy of parallel computation; it proposes an adaptive task assignment model, dependent on network bandwidth, for the parallel extraction of features. Experimental results report speedup values of between 4.7x and 10.4x, and efficiency values of between 0.11 and 0.62. Both the speedup and the efficiency increase with the database size. The proposed adaptive assignment of tasks improves the speedup and efficiency of the parallel model.
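The benchmark quantities quoted above have standard definitions, sketched here for reference. The abstract does not state how the thesis computed them, so the conventional formulas are assumed: speedup as serial time over parallel time, and efficiency as speedup per worker. All counts and timings in the example are made-up illustrative values, not results from the thesis, and the function names are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics (assumed definitions)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),            # positive predictive value
        "sensitivity": tp / (tp + fn),    # true positive rate
        "specificity": tn / (tn + fp),    # true negative rate
    }


def speedup(serial_time, parallel_time):
    """How many times faster the parallel run is than the serial run."""
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, workers):
    """Speedup per worker; 1.0 would mean ideal linear scaling."""
    return speedup(serial_time, parallel_time) / workers


# Illustrative numbers only (not taken from the thesis).
metrics = classification_metrics(tp=8, fp=2, tn=88, fn=2)
s = speedup(serial_time=120.0, parallel_time=12.0)
e = efficiency(serial_time=120.0, parallel_time=12.0, workers=16)
```

Under these definitions, efficiency falls below 1.0 whenever communication or idle time keeps the workers from scaling linearly, which is the overhead the adaptive task assignment is meant to reduce.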
All experiments are based on the Mammographic Image Analysis Society (MIAS) database, a publicly available database that has been widely used in related works. The results achieved for both the mammogram pathology-based retrieval model and its computational efficiency met the objectives set for the research. In the domain of breast cancer applications, the models proposed in this work should contribute to improving the retrieval results of computer aided diagnosis/detection systems, where applicable. The improved accuracy should lead to higher acceptance of such systems by radiologists, which would enhance the quality of diagnosis both by reducing the decision-making time and by improving the accuracy of the entire diagnostic process.