Flexible statistical modeling of deaths by diarrhoea in South Africa.
dc.contributor.advisor | Ramroop, Shaun. | |
dc.contributor.advisor | Mwambi, Henry G. | |
dc.contributor.author | Mbona, Sizwe Vincent. | |
dc.date.accessioned | 2013-12-17T11:00:43Z | |
dc.date.available | 2013-12-17T11:00:43Z | |
dc.date.created | 2013 | |
dc.date.issued | 2013 | |
dc.description | Thesis (M.Sc.)-University of KwaZulu-Natal, Pietermaritzburg, 2013. | en |
dc.description.abstract | The purpose of this study is to investigate and understand data that are grouped into categories. Various statistical methods for categorical binary responses were studied to investigate the causes of death from diarrhoea in South Africa. Data collected included death type, sex, marital status, province of birth, province of death, place of death, province of residence, education status, smoking status and pregnancy status. The objective of this thesis is to identify which of these explanatory variables are most strongly associated with death from diarrhoea in South Africa. To achieve this objective, different techniques for analysing sample survey data are investigated. These include bar graphs and several statistical methods, namely logistic regression, survey logistic regression, the generalised linear model, the generalised linear mixed model and the generalised additive model. In the selection of the fixed effects, bar graphs and individual profile graphs of the response variable are used. A logistic regression model is used to identify which of the explanatory variables are most strongly associated with death from diarrhoea. The statistical analyses are conducted in SAS (Statistical Analysis System). Hosmer and Lemeshow (2000) propose a statistic that they show, through simulation, is distributed as chi-square when there is no replication in any of the subpopulations. Owing to the similarity with the Hosmer and Lemeshow test for logistic regression, Parzen and Lipsitz (1999) suggest using 10 risk score groups. Nevertheless, based on simulation results, May and Hosmer (2004) show that, for all samples or samples with a large percentage of censored observations, the test rejects the null hypothesis too often. They suggest that the number of groups be chosen such that G=integer of {maximum of 12 and minimum of 10}. Lemeshow et al. (2004) state that the observations are first sorted in increasing order of their estimated event probability. | en |
dc.identifier.uri | http://hdl.handle.net/10413/10239 | |
dc.language.iso | en_ZA | en |
dc.subject | Statistics--Mathematics. | en |
dc.subject | Mathematical statistics. | en |
dc.subject | Statistics--Data processing. | en |
dc.subject | Linear models (Statistics) | en |
dc.subject | Theses--Statistics and actuarial science. | en |
dc.title | Flexible statistical modeling of deaths by diarrhoea in South Africa. | en |
dc.type | Thesis | en |