Masters Degrees (Statistics)

Permanent URI for this collection: https://hdl.handle.net/10413/7127

Classification of banking clients according to their loan default status using machine learning algorithms.
(2022) Reddy, Suveshnee.; Chifurira, Retius.; Zewotir, Temesgen Tenaw.
Loan lending has become crucial for both individuals and companies. For lending institutions, although profitable, it can be very risky due to clients defaulting on their loan agreements. Credit risk assessment is a critical process carried out by most lending institutions; it reduces the possibility of lending to clients who will default on their loan repayments, but it does not eliminate the problem. Thus, a collections process which aims to retrieve unpaid debt is also necessary. With South Africa facing another recession, which was only worsened by the lockdown during the COVID-19 pandemic, lending institutions can expect an increase in the number of loan defaulters. To counter this increase, changes will have to be made to their policies and processes. Changes can be made to either the loan application procedures (e.g. credit risk assessment, affordability assessment et cetera) or the post-disbursal procedures (e.g. collections processes). The aim of this study is to predict whether a client will default on his/her loan, using machine learning algorithms, in order to enhance the collection process of the financial institution under study, where default is defined as missing at least three payments in the first 12 months of the loan being granted. The logistic regression model, decision tree, random forest, support vector machine, Naïve Bayes classifier, k-nearest neighbours algorithm and the artificial neural network were fitted to the balanced dataset. In the researcher’s analysis, loan data from a South African financial institution were used for the period August 2019 to December 2019. Variables related to a client’s demographics, income, expenses and debt, as well as loan information, were included in the dataset. Exploratory data analysis (EDA) was utilised in order to analyse the dataset and summarise its main characteristics.
To reduce the dimensionality of the dataset, two techniques were used, namely principal component analysis (PCA), which is also used to correct the data for multicollinearity, and feature selection (i.e., recursive feature elimination). Each model was fitted to the dataset using these two techniques, and the confusion matrix and metrics such as balanced accuracy, true positive ratio, true negative ratio, AUC score and the Gini coefficient were used to evaluate the different models in order to determine which model performed the best and was most suited to this application problem. The results show that when using the PCA approach, the random forest model, which obtained a balanced accuracy score, true positive ratio and AUC score of 0.69, 0.74 and 0.74, respectively, performed the best. The random forest model also performed the best when using the feature selection technique, obtaining a balanced accuracy score, true positive ratio and AUC score of 0.69, 0.74 and 0.75, respectively. When comparing the random forest model using PCA to the random forest model using feature selection, the results showed only a marginal difference in each performance metric analysed. The random forest model using PCA utilised 48 variables, whereas the random forest model using feature selection utilised only 18 variables and thus seemed to be more suitable for the classification problem under study. The results of this study are expected to benefit analysts and data scientists in financial institutions who would like to identify robust machine learning algorithms for classifying defaulting clients. This study is also of significance to policy makers who would want to identify the risk factors associated with loan defaulting clients.
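As a rough illustration of how the evaluation metrics named in this abstract relate to one another, the sketch below derives the true positive ratio, true negative ratio and balanced accuracy from confusion-matrix counts, and converts an AUC score to a Gini coefficient. The function names and the example counts are hypothetical, and the conversion Gini = 2 × AUC − 1 is the common credit-scoring convention; the abstract does not state which definition the thesis used.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (true positive ratio, true negative ratio, balanced accuracy).

    tp, fn, tn, fp are confusion-matrix counts, with "positive" taken
    to mean a defaulting client (an assumption for this sketch).
    """
    tpr = tp / (tp + fn)  # share of actual defaulters correctly flagged
    tnr = tn / (tn + fp)  # share of non-defaulters correctly cleared
    balanced_accuracy = (tpr + tnr) / 2
    return tpr, tnr, balanced_accuracy


def gini_from_auc(auc):
    """Gini coefficient under the convention Gini = 2 * AUC - 1 (assumed)."""
    return 2 * auc - 1


# Hypothetical counts, chosen only for illustration; not data from the thesis.
tpr, tnr, bal_acc = classification_metrics(tp=74, fn=26, tn=64, fp=36)
gini = gini_from_auc(0.75)
```

Under this convention, the reported AUC of 0.75 for the feature selection model would correspond to a Gini coefficient of 0.50.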
Estimation of the value at risk using a long-memory GARCH application to JSE Indices.
(2020) Khumalo, Moses Bhekinhlahla.; Chinhamu, Knowledge.; Chifurira, Retius.
Financial data are characterized by stylized facts, which make it difficult to model financial assets if they are not taken into account. In that case, the implementation of accurate risk management tools such as value at risk (VaR), which is crucial in the management of market risk, becomes a futile exercise. This study aims to compare the performance of the long-memory GARCH-type models with heavy-tailed innovations in estimating the value at risk of the All Share Index, the Mining Index, and the Banking Index. This was achieved by investigating the empirical properties of the JSE Indices and fitting the FIGARCH, HYGARCH, and FIAPARCH models with the Student’s t-distribution (STD), skewed Student’s t-distribution (SSTD), and generalized error distribution (GED). The study further estimates VaR for the short and long trading positions at the 95th, 99th, and 99.7th quantiles, and backtests the results. The main findings indicate that the JSE All Share Index returns are best captured by the FIGARCH-SSTD model, whereas for the JSE Mining Index returns the most robust model is the FIAPARCH-STD model. For the JSE Banking Index returns, the FIAPARCH-STD model is predominantly appropriate at most of the VaR levels. The findings of the study provide a solution to both risk practitioners and asset managers for better understanding the behaviour of the financial indices’ returns. Finally, this can assist the role players in fastidiously managing risks and assets’ returns.
Modelling South Africa's market risk using the APARCH model and heavy-tailed distributions.
(2016) Ilupeju, Yetunde Elizabeth.; Chifurira, Retius.; Chinhamu, Knowledge.; Murray, Michael.
Estimating Value-at-Risk (VaR) of stock returns, especially from emerging economies, has recently attracted the attention of both academics and risk managers. This is mainly because stock returns are relatively more volatile than their historical trend. VaR and other risk management tools, such as expected shortfall (conditional VaR), are highly dependent on an appropriate set of underlying distributional assumptions being made. Thus, identifying a distribution that best captures all aspects of financial returns is of great interest to both academics and risk managers. As a result, this study compares the relative performance of GARCH-type models combined with heavy-tailed distributions, namely the Skew Student t distribution, Pearson Type IV distribution (PIVD), Generalized Pareto distribution (GPD), Generalized Extreme Value distribution (GEVD), and stable distribution, in estimating the Value-at-Risk of South African All Share Index (ALSI) returns. Model adequacy is checked through the backtesting procedure. The Kupiec likelihood ratio test is used for backtesting. The proposed models are able to capture volatility clustering (conditional heteroskedasticity), the asymmetric effect (leverage effect) and heavy-tailedness in the returns. The advantage of the proposed models lies in their ability to capture volatility clustering and the leverage effect on the returns through the GARCH framework and at the same time model their heavy-tailed behaviour through the heavy-tailed distribution. The main findings indicate that the APARCH model combined with these heavy-tailed distributions performed well in modelling the South African market’s risk at both the long and short positions. It was also found that, when compared in terms of their predictive ability, the APARCH model combined with the PIVD and the APARCH model combined with the GPD give a better VaR estimation for the short position, while the APARCH model combined with the stable distribution gives the better VaR estimation for the long position.
Thus, the APARCH model combined with a heavy-tailed distribution provides a good alternative for modelling stock returns. The outcomes of this research are expected to be of salient value to financial analysts, portfolio managers, risk managers and financial market researchers, thereby giving a better understanding of the South African market.
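For readers unfamiliar with the Kupiec likelihood ratio test used for backtesting in this abstract, a minimal sketch of its standard unconditional-coverage form follows. It compares the observed number of VaR violations with the number expected at the chosen VaR level and refers the statistic to a chi-square distribution with one degree of freedom. The function and parameter names are hypothetical, and nothing here is taken from the thesis itself.

```python
import math


def _xlogy(x, y):
    """Return x * log(y), treating 0 * log(0) as 0."""
    return 0.0 if x == 0 else x * math.log(y)


def kupiec_pof(num_obs, num_violations, var_level):
    """Kupiec proportion-of-failures test.

    num_obs: number of backtested days.
    num_violations: days on which the loss exceeded the VaR forecast.
    var_level: VaR confidence level, e.g. 0.99.
    Returns (likelihood ratio statistic, p-value).
    """
    p = 1.0 - var_level                # expected violation probability
    pi_hat = num_violations / num_obs  # observed violation rate
    hits, misses = num_violations, num_obs - num_violations

    # Log-likelihood under the null (rate p) and the alternative (rate pi_hat).
    log_null = _xlogy(misses, 1.0 - p) + _xlogy(hits, p)
    log_alt = _xlogy(misses, 1.0 - pi_hat) + _xlogy(hits, pi_hat)

    # Guard against tiny negative values caused by floating-point rounding.
    lr = max(-2.0 * (log_null - log_alt), 0.0)

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


# Hypothetical backtest: 1000 days, 30 violations of a 99% VaR.
lr_stat, p_val = kupiec_pof(num_obs=1000, num_violations=30, var_level=0.99)
```

A small p-value suggests the violation rate differs from the nominal level, so the VaR model would be rejected at that level. In the hypothetical call above, 30 violations against roughly 10 expected leads to rejection at the 5% significance level.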
Multivariate elliptically contoured stable distributions with applications to BRICS financial data.
(2016) Naradh, Kimera.; Chinhamu, Knowledge.; Hammujuddy, Mohammad Jahvaid.; Chifurira, Retius.
Brazil, Russia, India, China and South Africa (BRICS) are regarded as the five major emerging economies where all members are a part of a select group of developing industrialized countries. In the financial industry, various models are used for the description and analysis of financial trends. One of these models is the family of stable distributions which takes into account the skewness and heavy tails that are frequent in financial data. The main objective of this study is to investigate the fit of stable distributions for exchange rates of each of the BRICS countries against the U.S. Dollar in both the univariate and multivariate cases. The data set consists of exchange rate data from the period January 2011 to January 2016. Nolan's S0-parameterization stable distribution was fitted using the maximum likelihood method in the univariate case and in a fitted stable model where a GARCH(1,1) filter was applied to the returns (Stable-GARCH(1,1)). The Kolmogorov-Smirnov test and the Anderson-Darling test show that stable distributions adequately fit the returns of BRICS financial data. Value-at-Risk (VaR) calculations and VaR in-sample backtesting using the Kupiec likelihood ratio test and the Christoffersen's conditional coverage test were applied as per the International Basel Regulatory where the robustness of each model describing the financial data was evaluated. Thereafter, we proceeded to fit bivariate elliptical stable models using the Rachev-Xin-Cheng method after visualizing the scatterplot matrix of BRICS countries. This study validates the usefulness of stable distributions for modelling BRICS financial data.
