The goal of this paper is to describe approaches for analyzing repeatedly measured data together with time-to-event endpoints, first separately and then within a single comprehensive model, emphasizing the efficiency of the latter approach. Data from the Johnston County Osteoarthritis Project (JoCo OA) are used as an example to investigate the relationship between change in repeatedly measured body mass index (BMI) and the time-to-event endpoint of incident worsening of radiographic knee OA, defined as an increase in Kellgren-Lawrence (K-L) grade in at least one knee over time.

First, we provide an overview of the methods for analyzing repeated measurements and time-to-event endpoints separately. Then, we describe traditional (Cox proportional hazards model, CoxPH) and emerging (joint model, JM) approaches allowing combined analysis of repeated measures with a time-to-event endpoint in the framework of a single statistical model. Finally, we apply the models to JoCo OA data, and interpret and compare the results from the different approaches.

Applications of JM (but not CoxPH) showed that the risk of worsening radiographic OA is higher when BMI is higher or increasing, thus illustrating the advantages of JM for analyzing such dynamic measures in a longitudinal study.

Joint models are preferable for simultaneous analyses of repeated measurement and time-to-event outcomes, particularly in a chronic disease context, where dependency between the time-to-event endpoint and the longitudinal trajectory of repeated measurements is inherent.

Longitudinal studies in which data are collected on participants over years or even decades have become increasingly popular in many epidemiological fields. Such studies enable the analysis of individual-level changes, represented by repeatedly measured variables, and relate the changing patterns to the development of conditions or diseases causing disability and death. Despite the advantages of having multiple time points, there are several challenges associated with longitudinal data analysis, including non-ignorable missing data and sparse examination times (

In addition, to monitor risk factors and health outcomes, these studies collect repeated measurements that can encompass different types of variables. Two of these, longitudinally measured variables (e.g., biomarkers, patient-reported outcome measures) and the time to occurrence of an event (e.g., joint replacement, death), are very common in epidemiological studies. These two types of data are often analyzed separately, without considering that longitudinal and survival processes are related (

Investigation of such longitudinal relationships between repeatedly measured variables and the event of interest can provide clinically relevant information about the likely course of disease in a given person. For example, to optimize treatment strategies in early rheumatoid arthritis (RA), it is important to understand the relationship between disease activity over time, represented by longitudinal DAS-28 measurements, and time to subsequent radiographic joint damage. To evaluate the impact of the longitudinal response trajectory on the time-to-event outcome of interest

The main goals of this paper are to (1) describe mainstream statistical approaches for the analysis of such data, (2) convince the reader of the advantages of joint analysis of longitudinal measures with time-to-event outcomes, and (3) demonstrate how to apply these methods to a real and relevant dataset, using data from the Johnston County Osteoarthritis Project (JoCo OA). First, we review the methods for analyzing time-to-event and repeated-measurement outcomes separately. Then we describe traditional (the Cox proportional hazards model) and emerging (joint model) approaches that allow combined analysis of repeated measures with time-to-event outcomes within a single, comprehensive statistical model. Finally, we apply the models to the JoCo OA data, and then interpret and compare the results from the different approaches.

The linear mixed effects model, or LMM, is a commonly used approach for analysis of repeated measurements (

When the main outcome under assessment is the length of the interval from the time origin until the occurrence of the event of interest (e.g., survival time until death), an appropriate methodology is required, as these data have unique properties that cannot be addressed with standard statistical procedures. First, methods based on the normal distribution are not applicable to survival times, because survival times tend to be positively skewed, which violates the normality assumption. Second, and even more important, the actual survival times are often censored at the end of a study, i.e., they are not observed for all individuals. The most common type of censoring, and the focus of this paper, is “right censoring”, which occurs when a participant has not experienced the event of interest by the end of their study follow-up.
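To make the role of right censoring concrete, the following is a minimal sketch of the Kaplan-Meier product-limit estimator, a standard non-parametric estimate of the survival function that is not part of the analyses reported in this paper. The function name `kaplan_meier` and the toy follow-up times are hypothetical and chosen only for illustration; they are not JoCo OA data.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times  : follow-up time for each participant
    events : 1 if the event was observed at that time,
             0 if the participant was right-censored at that time
    Returns a list of (event_time, survival_probability) pairs.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    events_at = Counter(t for t, e in zip(times, events) if e)
    leaving_at = Counter(times)  # events and censorings both leave the risk set

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving_at):
        d = events_at.get(t, 0)
        if d:
            # Participants censored at time t still count as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving_at[t]
    return curve


# Hypothetical toy data: years of follow-up and event indicators.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 0, 1, 1, 0]  # 0 = right-censored

for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t = {t}: S(t) = {s:.2f}")
```

Censored participants contribute to the risk set up to their last follow-up time but never trigger a drop in the estimated survival curve, which is how the estimator uses the partial information they provide.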

The Cox proportional hazards model (CoxPH) (

The CoxPH model can also be extended to incorporate important explanatory variables that do change over the follow-up time period (

However, the CoxPH with TVCs has limited ability to handle explanatory variables with fluctuation and measurement errors (

A joint model (JM) consists of two sub-models representing the dynamics of (a) the longitudinal sub-model and (b) the time-to-event sub-model, as reviewed elsewhere (

Although JMs are becoming increasingly popular in different epidemiologic fields such as oncology (

In our working example, we use repeatedly measured BMI, which is a useful indicator of obesity, to investigate the effect of the longitudinal trajectory of BMI on the time-to-event outcome of worsening K-L grade in the knee. We chose this relationship given that 1) obesity is one of the most important knee OA risk factors (

The data used in this paper were collected from non-Hispanic African American and Caucasian men and women enrolled in the JoCo OA, an ongoing, longitudinal, population-based prospective study with clearly defined and repeatedly measured radiographic OA, comorbidities, various biomarkers, and socio-demographic and physiological variables (

The counting process form of the CoxPH model (

We fitted several JMs using the R package JM (

In the CoxPH model with TVCs, higher BMI was associated with higher risk of worsening knee rOA (HR per 5 kg/m², 1.49; 95% CI, 1.42–1.55). We also found, counter-intuitively, that increasing BMI over time was negatively associated with worsening rOA; specifically, the risk decreased by 8% for each 5% increase in BMI over time (HR per 5%, 0.92; 95% CI, 0.89–0.95).
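The arithmetic behind these statements can be sketched as follows. The helper names `hr_to_percent_change` and `rescale_hr` are hypothetical, and the rescaling assumes that the covariate enters the model linearly on the log-hazard scale, so that a hazard ratio for a k-unit difference is the per-unit hazard ratio raised to the power k. Only the two hazard ratios quoted above are taken from the text.

```python
import math


def hr_to_percent_change(hr):
    """Express a hazard ratio as a percent change in the hazard."""
    return (hr - 1.0) * 100.0


def rescale_hr(hr, from_units, to_units):
    """Convert an HR reported per `from_units` of a covariate into an HR
    per `to_units`, assuming a log-linear effect of the covariate."""
    beta_per_unit = math.log(hr) / from_units
    return math.exp(beta_per_unit * to_units)


# Hazard ratios quoted in the text for the CoxPH model with TVCs.
hr_bmi_per_5_units = 1.49      # per 5 kg/m^2 of BMI
hr_change_per_5_percent = 0.92  # per 5% increase in BMI over time

# HR 0.92 corresponds to an 8% lower hazard.
print(round(hr_to_percent_change(hr_change_per_5_percent), 1))

# Implied HR per 1 kg/m^2 (illustrative back-calculation, not reported).
print(round(rescale_hr(hr_bmi_per_5_units, 5, 1), 3))
```

The per-unit value printed by the second call is a back-calculation under the stated assumption and should not be read as a result reported by the study.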

The results for JM analysis are shown in

As previously mentioned, the corresponding coefficients can be interpreted in terms of percentage change rather than absolute change (see

Joint modeling of longitudinal and time-to-event data remains an active and emerging area of statistical research. In this paper, we demonstrated the usefulness and interpretability of the JM approach in rheumatology using OA, which is the most common form of arthritis and a leading cause of disability among adults in the USA (

The JM approach can be applied to a very broad family of rheumatic and musculoskeletal diseases (RMDs) that affect people at almost any age. Application of JM to clinical questions in rheumatology may clarify why the course and the severity of symptoms of RMDs vary from patient to patient, and from time to time. In addition, these models provide a natural structure for dynamic individual predictions of longitudinal and time-to-event outcomes (

Importantly, JMs are also being increasingly used in clinical trials that are crucial to advancements in new drug therapies. In this setting, dropout is a common problem and raises concerns of non-ignorable missing data, in particular if a participant leaves the study due to an adverse reaction or a lack of effectiveness of the treatment. As mentioned above, ignoring the mechanism of missingness can cause bias in estimates in LMM. Perhaps most notably, in the JM framework, dropout time can be considered as a survival outcome, while a longitudinal sub-model can be used to obtain valid inferences with the correction for non-ignorable dropout. Several papers have suggested that JM of longitudinal data and time to dropout not only provide unbiased estimates (

JMs also have some important limitations. First, JMs are computationally intensive and time-consuming, which may pose logistical challenges for researchers working with large data sets. Second, as with any statistical model, LMM and CoxPH (the two sub-models of JM) rest on specific assumptions, which should be properly tested. This prerequisite step becomes even more critical when the models are used jointly and should not be ignored. Our aim in this manuscript is to provide an introduction to the JM approach that is accessible to a clinical audience not necessarily familiar with advanced topics in mixed effects modelling and time-to-event analysis; we emphasize that collaboration with statistical experts in these methods is important when applying JMs in practice.

In summary, the potential applications of JM in RMDs are underappreciated, even though these methods provide clear advantages over traditional approaches while incorporating the strengths of those approaches. Software is readily available to facilitate applications of JM to relevant research and clinical questions in a statistically rigorous and coherent fashion. We hope to stimulate interest in these models among RMD researchers, with increased benefits to society through their use.

Graphical representation of features of the Joint Model

Footnote:

• The blue solid line in the top panel shows the survival function.

• The blue diamonds in the bottom panel are individual BMI measurements observed at the baseline and three follow-ups.

• The blue dashed line in the bottom panel corresponds to the approximation of BMI trajectory in CoxPH model. A value of BMI, observed and recorded only at a specific time, is assumed to remain constant between two visits and may be associated with the risk for event until the next visit.

• The solid red line represents the approximation of the longitudinal trajectory in JM, where the risk for an event is associated with both the level of BMI and its change (red arrows).
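The step-function approximation used by the CoxPH model with TVCs (blue dashed line) corresponds to a data layout in which each participant contributes one row per interval between visits, with BMI held at its last observed value. A minimal sketch of that layout is given below; the function name `to_counting_process` and the visit data are hypothetical and are not taken from JoCo OA.

```python
def to_counting_process(visit_times, bmi_values, end_time, event):
    """Split one participant's follow-up into (start, stop, bmi, event) rows.

    visit_times : increasing times at which BMI was measured
    bmi_values  : BMI recorded at each visit
    end_time    : time of the event or of right censoring
    event       : True if the event occurred at end_time
    BMI is carried forward unchanged until the next visit.
    """
    rows = []
    for i, (start, bmi) in enumerate(zip(visit_times, bmi_values)):
        if start >= end_time:
            break  # visits at or after the end of follow-up add no interval
        has_next = i + 1 < len(visit_times)
        next_visit = visit_times[i + 1] if has_next else end_time
        stop = min(next_visit, end_time)
        # The event indicator is set only on the interval that ends follow-up.
        is_last = stop == end_time
        rows.append((start, stop, bmi, int(event and is_last)))
    return rows


# Hypothetical participant: baseline plus three follow-ups, event at year 12.
for row in to_counting_process([0, 5, 10, 15], [27.0, 28.5, 30.0, 31.0], 12, True):
    print(row)
```

In this layout any change in BMI between visits is invisible to the model, which is the limitation that the smooth red trajectory of the JM is intended to address.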

Three Joint Models for longitudinal BMI and/or longitudinal change in BMI with risk for incident worsening rOA of the knee fitted to JoCo OA data: Comparison under different parameterizations.

| | JM1 (BIC = 2604.2) | JM2 (BIC = 2688.7) | JM3 (BIC = 2602.8) |
|---|---|---|---|
| Gender: male versus female | −0.08 (0.06) | −0.09 (0.06) | −0.09 (0.06) |
| Age at baseline^{a} in years | | | |
| Log (BMI) | | | |
| Slope of log (BMI) | | | |

The numbers in the table represent the coefficients with standard errors from the time-to-event sub-model.

rOA, radiographic OA

BMI, concurrent value of Body Mass Index

In JM1, the survival process depends on the level of BMI at the same time point (concurrent level).

In JM2, the survival process depends on the slope of BMI at the same time point (concurrent slope).

In JM3, the survival process depends on the level of BMI and slope of BMI at the same time point.

BIC, Bayesian Information Criterion

^{a} Variable was standardized to have a mean of 0 and a standard deviation of 1
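The three parameterizations described in these footnotes can be written in one common form. The notation below is an assumption introduced here for illustration and is not taken from the paper: $m_i(t)$ denotes the subject-specific trajectory of log(BMI) estimated by the longitudinal sub-model, $m_i'(t)$ its slope, $w_i$ the baseline covariates (gender and age), and $h_0(t)$ the baseline hazard.

```latex
% Time-to-event sub-model with association parameters \alpha_1 (level)
% and \alpha_2 (slope); symbols are illustrative assumptions.
h_i(t) = h_0(t)\,\exp\!\left\{ \gamma^{\top} w_i
         + \alpha_1\, m_i(t) + \alpha_2\, m_i'(t) \right\}

% JM1: \alpha_2 = 0            (concurrent level only)
% JM2: \alpha_1 = 0            (concurrent slope only)
% JM3: \alpha_1, \alpha_2 free (concurrent level and slope)
```

Under this sketch, the rows "Log (BMI)" and "Slope of log (BMI)" in the table would correspond to $\alpha_1$ and $\alpha_2$, respectively.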

Three Joint Models for longitudinal BMI and/or longitudinal change in BMI with risk for incident worsening rOA of the knee fitted to JoCo OA data: Examples of clinical interpretation

| | HR for BMI | HR for BMI slope |
|---|---|---|
| JM1 | 1.39 [1.31; 1.48] | |
| JM2 | | 4.59 [2.14; 9.86] |
| JM3 | 1.37 [1.29; 1.46] | 2.29 [1.20; 4.36] |

BMI, concurrent value of Body Mass Index (logarithmically transformed)

In JM1, the survival process depends on the level of BMI at the same time point (concurrent level).

In JM2, the survival process depends on the slope of BMI at the same time point (concurrent slope).

In JM3, the survival process depends on the level of BMI and slope of BMI at the same time point.

HR, hazard ratio

HR for a difference of 25% in BMI at the same time point for the same individual

HR for increase of 10% versus increase of 5% at the same time point for the same individual
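Because BMI enters the models on the logarithmic scale, a hazard ratio for a percentage difference in BMI follows from the association coefficient by a simple identity. The sketch below assumes a natural-log transformation and uses the hypothetical names `alpha` (coefficient of log BMI in the time-to-event sub-model), `hr_for_pct_difference` and `alpha_from_hr`. A difference of p% shifts log(BMI) by log(1 + p/100), so the hazard ratio is (1 + p/100) raised to the power `alpha`.

```python
import math


def hr_for_pct_difference(alpha, pct_difference):
    """HR for a `pct_difference`% higher BMI when the covariate is ln(BMI)
    with association coefficient `alpha`."""
    return (1.0 + pct_difference / 100.0) ** alpha


def alpha_from_hr(hr, pct_difference):
    """Back-calculate the coefficient of ln(BMI) from an HR reported for a
    `pct_difference`% difference in BMI."""
    return math.log(hr) / math.log(1.0 + pct_difference / 100.0)


# Illustrative back-calculation from the JM1 row (HR 1.39 per 25% difference).
alpha_jm1 = alpha_from_hr(1.39, 25)

# The same association expressed for a 10% difference in BMI.
print(round(hr_for_pct_difference(alpha_jm1, 10), 2))
```

The back-calculated coefficient is only as precise as the rounded hazard ratio in the table and is shown to illustrate the change of scale, not as a reported estimate.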