
    The management of missing categorical data : comparison of multiple imputation and subset correspondence analysis.

    Date
    2015
    Author
    Hendry, Gillian Margaret.
    Abstract
    Missing data is a common problem in research, and the manner in which this ‘missingness’ is managed is crucial to the validity of analysis outcomes. This study illustrates the use of two diverse methods for handling missing categorical data in particular. These methods are applied to a dataset that was intended to identify relationships between asthma severity in children and environmental, behavioural, genetic and socio-economic factors. This dataset suffered from substantial missingness. The first method involved the application of two approaches to multiple imputation, each adopting a different distributional specification. A practical challenge, previously undocumented, was encountered in the application of multiple imputation: interactions that were still to be identified for inclusion in the analysis model were also needed for the imputation model. This study found that by imputing a single set of complete data using the expectation-maximization (EM) algorithm for covariance matrices, it was possible to identify relevant interactions for inclusion in the imputation model. The second method illustrated the application of correspondence analysis to a subset of the data that includes only the measured data categories. The application of subset correspondence analysis (s-CA) to incomplete data, as well as its sensitivity to the type of missingness, has not been well documented, if at all. There is also no evidence of research in which interactions have been added to an analysis with s-CA. In this study its use, both with and without interactions, was illustrated, and the results, when compared with those from the multiple imputation approach, were found to be similar and complementary. A simulation study found that s-CA performed well with any type of missingness, provided that the amount of missingness was less than 30% on any variable with incomplete data. Across all analyses, the relationships found between asthma severity and the factors were consistent with known relationships, thus providing confirmation of the reliability of the methods.
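
To make the idea of analysing "only the measured data categories" concrete, the following is a minimal sketch of subset correspondence analysis on a toy indicator matrix. It is not taken from the thesis: the records, the "NA" label and the function names `indicator_matrix` and `subset_ca` are all hypothetical, and the sketch assumes the usual formulation of s-CA in which row and column masses are computed from the full table and only the chosen columns of the standardized residuals are decomposed.

```python
import numpy as np

def indicator_matrix(records, missing="NA"):
    """One-hot encode categorical records; None becomes its own 'missing' category."""
    n_vars = len(records[0])
    labels = []
    for j in range(n_vars):
        cats = sorted({missing if rec[j] is None else rec[j] for rec in records})
        labels += [(j, cat) for cat in cats]
    index = {lab: k for k, lab in enumerate(labels)}
    Z = np.zeros((len(records), len(labels)))
    for i, rec in enumerate(records):
        for j, value in enumerate(rec):
            Z[i, index[(j, missing if value is None else value)]] = 1.0
    return Z, labels

def subset_ca(N, cols):
    """Subset CA of columns `cols`, keeping the masses of the full table."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                       # correspondence matrix
    r = P.sum(axis=1)                     # row masses (full table)
    c = P.sum(axis=0)                     # column masses (full table)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
    U, sv, Vt = np.linalg.svd(S[:, cols], full_matrices=False)
    row_coords = (U * sv) / np.sqrt(r)[:, None]          # principal coordinates
    col_coords = (Vt.T * sv) / np.sqrt(c[cols])[:, None]
    return sv ** 2, row_coords, col_coords

# Hypothetical toy data: (asthma severity, exposure), with None marking a missing value.
records = [
    ("mild", "yes"), ("severe", "no"), ("mild", None),
    (None, "yes"), ("severe", "yes"), ("mild", "no"),
]
Z, labels = indicator_matrix(records)
measured = [k for k, (_, cat) in enumerate(labels) if cat != "NA"]
missing_cols = [k for k in range(len(labels)) if k not in measured]

inertia_measured, rows, cols = subset_ca(Z, measured)
inertia_missing, _, _ = subset_ca(Z, missing_cols)
inertia_full, _, _ = subset_ca(Z, list(range(len(labels))))
```

Because the masses come from the full table, the total inertia splits additively between the measured and the missing categories, so `inertia_measured.sum() + inertia_missing.sum()` should match `inertia_full.sum()` up to floating-point error. That decomposition is what allows the missing categories to be set aside without re-weighting the remaining ones.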
    URI
    http://hdl.handle.net/10413/15643
    Collections
    • Doctoral Degrees (Statistics)

     

     
