A perspective on incomplete data in longitudinal multi-arm clinical trials, with emphasis on pattern-mixture-model based methodology.
Date
2014
Abstract
Missing data are common in longitudinal clinical trials. Rubin described three different missing
data mechanisms based on the level of dependence between the missing data process and the
measurement process. These are missing completely at random (MCAR), missing at random
(MAR) and missing not at random (MNAR). Data are MCAR when the probability of dropout
is independent of both observed and unobserved data. Data are MAR when the probability of
data being missing does not depend on the unobserved data, conditional on the observed data.
When neither MCAR nor MAR is valid, data are MNAR.
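The three mechanisms can be written compactly in one common notation. This is a sketch and the symbols are assumptions introduced here, not taken from the thesis: Y = (Y_obs, Y_mis) splits the outcome vector into its observed and unobserved parts, R is the vector of missingness indicators, and psi collects the parameters of the missing data process.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \psi) = P(R \mid \psi) \\
\text{MAR:}\quad  & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \psi) = P(R \mid Y_{\mathrm{obs}}, \psi) \\
\text{MNAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \psi) \text{ depends on } Y_{\mathrm{mis}} \text{ even given } Y_{\mathrm{obs}}
\end{align*}
```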
The aim of this thesis is to discuss the statistical methodology required for analysing missing
outcome data and to provide valid statistical methods for the MCAR, MAR and MNAR scenarios.
This thesis does not focus on data analysis where covariate data are missing. Under MCAR,
complete-case and available-case analyses are valid. When data are MAR, multiple imputation,
likelihood-based models, inverse probability weighting and Bayesian models are valid. When
data are MNAR, pattern-mixture, selection and shared-parameter models are valid. These
methods are illustrated by an in-depth analysis of two data sets with missing data.
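The contrast between the MCAR and MAR cases can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not an analysis from the thesis: the function names, the two-visit design, the dropout model and all numeric settings are assumptions chosen for illustration. The MAR-valid estimator used here is a single regression-based prediction of the follow-up mean, a simple stand-in for the methods listed above, not multiple imputation itself.

```python
import numpy as np

def simulate(mechanism, n=20000, seed=0):
    """Simulate a baseline (always observed) and a follow-up outcome with dropout."""
    rng = np.random.default_rng(seed)
    y0 = rng.normal(0.0, 1.0, n)              # baseline measurement
    y1 = 0.8 * y0 + rng.normal(0.0, 0.6, n)   # follow-up; true mean is 0
    if mechanism == "MCAR":
        p_obs = np.full(n, 0.7)               # dropout unrelated to any data
    elif mechanism == "MAR":
        # dropout depends only on the observed baseline value
        p_obs = 1.0 / (1.0 + np.exp(-(0.8 + 1.5 * y0)))
    else:
        raise ValueError("mechanism must be 'MCAR' or 'MAR'")
    observed = rng.random(n) < p_obs
    return y0, y1, observed

def complete_case_mean(y1, observed):
    """Mean of the follow-up outcome among participants who did not drop out."""
    return y1[observed].mean()

def regression_mean(y0, y1, observed):
    """Fit y1 ~ y0 on completers, then average predictions over everyone."""
    slope, intercept = np.polyfit(y0[observed], y1[observed], 1)
    return (intercept + slope * y0).mean()

if __name__ == "__main__":
    for mechanism in ("MCAR", "MAR"):
        y0, y1, obs = simulate(mechanism)
        print(mechanism,
              "complete-case:", round(complete_case_mean(y1, obs), 3),
              "regression-based:", round(regression_mean(y0, y1, obs), 3))
```

In this setup the complete-case mean stays close to the true value of 0 under MCAR but drifts upward under MAR, because participants with higher baseline values are more likely to remain in the study. The regression-based estimate conditions on the baseline and so remains close to 0 in both cases.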
The first data set is the SAPiT trial, an open-label, randomised controlled trial in HIV-tuberculosis
co-infected patients. Patients were randomised to three arms, each initiating
antiretroviral therapy at a different time. CD4+ count, an indicator of HIV progression, was
measured at baseline and every 6 months for 24 months. The primary question was whether
CD4+ count trajectory over time differed for the three treatment arms. The assumption that
missing data are MCAR was not supported by the observed data. We performed a range of
sensitivity analyses under both MAR and MNAR assumptions.
The second data set is a placebo-controlled, randomised clinical trial conducted for 8 weeks to
determine the effectiveness of hypericum or sertraline in reducing depression, measured by the
Hamilton depression scale. The trial randomised 340 participants, with 28% lost to follow-up
before Week 8. We performed a sensitivity analysis under different assumptions about the
missing data process. The missing data mechanism was not MCAR. Under MAR assumptions,
some of the sensitivity analyses found no difference between either treatment arm and
placebo, while others found a significant difference between sertraline and placebo but not
between hypericum and placebo. This re-analysis contributed to the literature on the
effectiveness of St John’s Wort because it changed the conclusions of the original analysis.
Description
Ph. D. University of KwaZulu-Natal, Durban 2014.
Keywords
Missing observations (Statistics); Multivariate analysis; Information storage and retrieval systems--Statistics, Medical; Clinical trials--Reporting; Theses--Statistics.