Development and validation of a dynamic survival prediction model for patients with acute-on-chronic liver failure

Background & Aims
Acute-on-chronic liver failure (ACLF) is usually associated with a precipitating event and results in the failure of other organ systems and high short-term mortality. Current prediction models fail to adequately estimate prognosis and need for liver transplantation (LT) in ACLF. This study develops and validates a dynamic prediction model for patients with ACLF that uses both longitudinal and survival data.

Methods
Adult patients on the UNOS waitlist for LT between January 11, 2016 and December 31, 2019 were included. Repeated model for end-stage liver disease-sodium (MELD-Na) measurements were jointly modelled with Cox survival analysis to develop the ACLF joint model (ACLF-JM). Model validation was carried out using separate testing data with area under curve (AUC) and prediction errors. An online ACLF-JM tool was created for clinical application.

Results
In total, 30,533 patients were included. ACLF grade 1 to 3 was present in 16.4%, 10.4% and 6.2% of patients, respectively. The ACLF-JM predicted survival significantly (p <0.001) better than the MELD-Na score, both at baseline and during follow-up. For 28- and 90-day predictions, ACLF-JM AUCs ranged between 0.840-0.871 and 0.833-0.875, respectively. Compared to MELD-Na, AUCs and prediction errors were improved by 23.1%-62.0% and 5%-37.6%, respectively. Also, the ACLF-JM could have prioritized patients with relatively low MELD-Na scores but with a 4-fold higher rate of waiting list mortality.

Conclusions
The ACLF-JM dynamically predicts outcome based on current and past disease severity. Prediction performance is excellent over time, even in patients with ACLF-3. Therefore, the ACLF-JM could be used as a clinical tool in the evaluation of prognosis and treatment in patients with ACLF.

Lay summary
Acute-on-chronic liver failure (ACLF) progresses rapidly and often leads to death. Liver transplantation is used as a treatment and the sickest patients are treated first. In this study, we develop a model that predicts survival in ACLF and we show that the newly developed model performs better than the currently used model for ranking patients on the liver transplant waiting list.

To calculate ACLF-JM mortality predictions given individual patient data, please visit: https://predictionmodels.shinyapps.io/aclf-jm/

Hypothesis: In 30,533 adult LT candidates without ACLF (67%) or with ACLF-1 (16%), ACLF-2 (10%), or ACLF-3 (7%)

Introduction
Liver transplantation (LT) is a lifesaving treatment for patients with acute-on-chronic liver failure (ACLF). ACLF is characterized by an acute deterioration of liver function in patients with chronic liver disease, often triggered by a precipitating event. ACLF results in the failure of one or more organs and is associated with high short-term mortality. [1][2][3] The current model that prioritizes patients for LT, the model for end-stage liver disease-sodium (MELD-Na) score, 4,5 underestimates disease severity in ACLF. 6,7 This is because MELD-Na does not consider the temporal development of single or multiorgan failure(s) involving the 6 major organs/systems, i.e. liver, kidney, brain, coagulation, circulation, and respiration. This underestimation of predicted waitlist mortality results in lower access to transplantation for patients with ACLF. 7 Sundaram et al. showed that ACLF death and waiting list removal rate were highest in ACLF-3 patients with MELD-Na <25. 8 Given that 20.9% of UNOS LT candidates between 2005-2016 had a form of ACLF, 8 the overall impact of unequal transplantation access might be substantial.

The MELD-Na score uses one moment in time, i.e. the most recent measurement, to predict outcome. 4,5 It therefore ignores previous data that could be valuable for survival estimation. However, ACLF is a dynamic disease with a clinical course that can change within days, resulting in very different outcomes. 9,10 Thus, there is a need for prediction models that estimate ACLF survival based on disease development over time. 7 The Chronic Liver Failure-Consortium organ failure (CLIF-C OF) and CLIF-C ACLF scores were developed for this purpose and showed better performance than the MELD-Na score. 3
[...] and survival data. 11 It approximates changing disease severity over time and uses this for survival prediction. 12 Joint models (JMs) have shown superior predictive performance over Cox models. [12][13][14] However, they have not been applied to ACLF. We hypothesized that using disease development over time to dynamically predict prognosis could improve survival prediction in patients with ACLF. Much like a clinician, we aimed to use disease severity and its rate of change to predict outcome. We believe this is warranted in ACLF, because of the dynamic nature of the disease and the current underestimation of mortality by MELD-Na. 7,9,10 Therefore, we constructed and validated a multivariate prediction model for survival prediction in patients with ACLF: the ACLF-JM. We investigated the performance of the ACLF-JM for 28- and 90-day survival prediction in the United Network for Organ Sharing (UNOS) registry and compared its performance to the MELD-Na score. We also investigated whether the ACLF-JM could identify patients in whom MELD-Na underestimates mortality. For easy clinical application, an online ACLF-JM tool was developed for dynamic survival prediction in patients with ACLF.

Fig. 1. For 3 hypothetical patients A, B and C, the 20-day MELD-Na development is shown. After 20 days, patient A has a MELD-Na score of 30 and is thus prioritized by the current allocation system. However, the ACLF-JM uses both the estimated value (measured MELD-Na score) and slope (rate of change) at time=20 for survival prediction. Calculation of the HRs shows that the ACLF-JM gives patient C the greatest risk of death, because of the fast increase in MELD-Na scores (positive slope). See supplement 4 for the precise explanation and calculation. ACLF, acute-on-chronic liver failure; ACLF-JM, acute-on-chronic liver failure joint model; HR, hazard ratio; MELD-Na, model for end-stage liver disease-sodium.

Materials and methods
The TRIPOD statement was used for the development and validation of this multivariate prediction model. 15

Study population
Data on LT candidates were requested from the UNOS. We included adult (≥18 years) patients listed for a first LT between January 11, 2016 (after MELD-Na implementation) and December 31, 2019. We excluded candidates with acute liver failure and hepatocellular carcinoma at baseline. Data were used from first active listing until the earliest of patient death, transplantation, removal, or censoring on December 31, 2019. Death was defined both as death while listed and removal for being too sick to transplant. 8 If patients received exception points or a status 1 (i.e. high urgency status) after first listing, they were censored from that date. MELD-Na data were missing in 0.05%, so a complete-case analysis was performed. Missing values for the predictors life support dependency (variable CAN_LIFE_SUPPORT, 0.00009% missing) and spontaneous bacterial peritonitis (CAN_BACTERIA_PERIT, 0.005% missing) were set to 'no'.

[...] (Table S1). Next, a Cox proportional hazards model was constructed for waiting list mortality, using the same predictors as the mixed-effect model. Then, the ACLF-JM was constructed by joint-modelling the longitudinal (mixed-effect) and survival (Cox) model. 17 A key feature is that the ACLF-JM uses both the estimated MELD-Na value and the rate of change in MELD-Na (the slope of the decrease/increase) over time for survival prediction. For clarity, these concepts of value and slope are illustrated in Fig. 1.
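The value-and-slope idea described above can be written compactly. The block below is a sketch of the standard joint-model formulation with a combined value and slope association, not a confirmed statement of the exact ACLF-JM specification. All symbols are assumptions introduced here for illustration: $y_i(t)$ is the observed MELD-Na of patient $i$, $m_i(t)$ the underlying trajectory estimated by the mixed-effect model, $w_i$ the baseline predictors of the Cox model, and $h_0(t)$ the baseline hazard.

```latex
% Longitudinal (mixed-effect) submodel: observed MELD-Na = true trajectory + error
y_i(t) = m_i(t) + \varepsilon_i(t), \qquad
m_i(t) = x_i(t)^\top \beta + z_i(t)^\top b_i

% Survival (Cox) submodel linked to the current value and the current slope
h_i(t) = h_0(t)\,
  \exp\!\left\{ \gamma^\top w_i
              + \alpha_1\, m_i(t)
              + \alpha_2\, \frac{d\,m_i(t)}{dt} \right\}
```

Under this formulation, $\exp(\alpha_1)$ is the hazard ratio per 1-point higher current MELD-Na value and $\exp(\alpha_2)$ is the hazard ratio per 1-unit steeper slope, which is how two patients with the same current score can receive different predicted risks.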

Identification of ACLF
Validation of the ACLF-JM
Next, the prediction performance of the ACLF-JM was compared to the MELD-Na at various points in time in the separate testing data. Specifically, predictions were assessed at baseline and after a follow-up of 48 hours, 7 days and 14 days (similar to the validation study of the CLIF-C OF). 6 Outcomes were 28-day and 90-day survival. For both the ACLF-JM and MELD-Na Cox model, the area under the receiver-operating characteristic curve (AUC) and prediction errors were calculated and compared (see supplement 3 for detailed information). These measures and their 95% CIs and p values were calculated using the R package JM and bootstrapping. 17

ACLF-JM impact on the transplantation waiting list
Next, we assessed the possible effect of using the ACLF-JM instead of MELD-Na to estimate mortality and subsequently prioritize patients for LT. This was of interest, because patients with ACLF are likely underserved in the current LT allocation. 7 To assess possible differences in MELD-Na and ACLF-JM waitlist prioritization of patients, we followed patients from baseline until day 28. 6 Within this period, each time a liver graft was offered, patients were ranked twice from most to least ill based on their estimated survival without transplant. One ranking was made with the ACLF-JM predictions and one based on MELD-Na.
Thus, for each model, patients were ranked 2,636 times, i.e. the total number of available liver grafts within the first 28 days. After a liver graft offer, the transplanted patient was removed from the waiting list. We assumed that the highest ranked patients were transplanted, which is not necessarily true, and thus that the number of available transplants in the first 28 days represented the threshold for receiving transplantation. We then assessed which patients were prioritized by which model. After 28 days and 2,636 rankings, patients were stratified into 4 groups: those who were prioritized and possibly transplanted within 28 days according to both scores, those who were prioritized by either the ACLF-JM or the MELD-Na score (but not by both) and those who were not prioritized by either. We also assessed the characteristics of the differently prioritized patients, to see why they were prioritized differently.
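The ranking procedure described above can be sketched in a few lines of Python. This is a simplified illustration, not the study code: it assumes one fixed score per patient, whereas the study re-ranked patients at every graft offer using updated predictions. The function names, patient identifiers and scores are hypothetical.

```python
def prioritized(scores, n_grafts):
    """Return the ids of patients who would receive the n_grafts offers
    when each offer goes to the highest-scoring patient still waiting."""
    waiting = dict(scores)          # copy, so the caller's data is untouched
    chosen = set()
    for _ in range(min(n_grafts, len(waiting))):
        top = max(waiting, key=waiting.get)   # most ill remaining patient
        chosen.add(top)
        del waiting[top]            # transplanted patient leaves the list
    return chosen


def stratify(jm_risk, meld_na, n_grafts):
    """Split patients into the 4 groups described in the text."""
    by_jm = prioritized(jm_risk, n_grafts)
    by_meld = prioritized(meld_na, n_grafts)
    everyone = set(jm_risk)
    return {
        "both": by_jm & by_meld,
        "jm_only": by_jm - by_meld,
        "meld_only": by_meld - by_jm,
        "neither": everyone - by_jm - by_meld,
    }


# Hypothetical example: 4 patients, 2 grafts.
jm_risk = {"p1": 0.40, "p2": 0.35, "p3": 0.10, "p4": 0.05}  # predicted mortality
meld_na = {"p1": 35, "p2": 20, "p3": 30, "p4": 15}          # MELD-Na scores
groups = stratify(jm_risk, meld_na, n_grafts=2)
print(groups)
```

In this toy example, patient p2 has a low MELD-Na score but a high predicted mortality and therefore falls in the group prioritized only by the joint-model ranking.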

Clinical application of the ACLF-JM
Lastly, an online version of the ACLF-JM was created (https://predictionmodels.shinyapps.io/aclf-jm/), which allows clinicians to assess ACLF-JM survival predictions for their individual patient(s). Plots can be created from these dynamic predictions, to show the updated survival estimate for every newly available measurement during follow-up. For an instruction manual, see supplements 1 and 2. All statistical analyses were performed using R v4.0.0 (R Foundation for Statistical Computing, Vienna, Austria).

Study population
In total, we included 30,533 patients with 249,030 measurements. The ACLF-JM estimates the MELD-Na value and slope at a given timepoint and calculates the hazard ratio of death. For each MELD-Na point increase, the risk of death at 1 year increases by 15% (95% CI 14-16). For every 1-point increase in slope, i.e. acceleration of disease increase, the mortality risk increases by 2% (95% CI 1-2). Of course, in clinical practice, disease severity often changes more rapidly, especially for patients with ACLF. A more intuitive illustration of the effect of MELD-Na value and slope is provided in Fig. 1, where 3 hypothetical cases are shown. The example calculation (details in supplement 4) shows that considering the rate of change (slope) in disease severity adds important information. Considering both MELD-Na value and slope would give priority to patient C (MELD-Na score 20, accelerating disease severity), whereas using the current MELD-Na-based allocation would prioritize patient A (MELD-Na 30, stable disease).
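As a rough illustration of how the two reported hazard ratios combine, the sketch below assumes a log-linear hazard in which the value and slope effects multiply. The function name, the example inputs and the units of slope are assumptions made for illustration. It does not reproduce the calculation in supplement 4 or the patients in Fig. 1.

```python
import math

# Hazard ratios reported in the text: 1.15 per MELD-Na point (value)
# and 1.02 per 1-point increase in slope.
HR_VALUE = 1.15
HR_SLOPE = 1.02


def relative_hazard(delta_value, delta_slope):
    """Hazard relative to a reference patient, assuming the two effects
    are additive on the log-hazard scale (an assumption of this sketch)."""
    log_hr = (delta_value * math.log(HR_VALUE)
              + delta_slope * math.log(HR_SLOPE))
    return math.exp(log_hr)


# Hypothetical comparison: MELD-Na 5 points higher and slope 10 units steeper
# than the reference patient.
print(round(relative_hazard(5, 10), 2))
```

Because both terms enter the exponent, a lower current score can in principle be offset by a sufficiently steep slope. How steep depends on the unit in which the slope is expressed, which is not specified here.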

Model validation
The ACLF-JM prediction performance was validated in separate testing data. Table 2 shows the 28- and 90-day prediction performance of the ACLF-JM and MELD-Na, stratified for patients with and without ACLF, at baseline and during follow-up. For all time points and studied outcomes, the JM performance was significantly better than MELD-Na. At baseline in patients with ACLF, the ACLF-JM AUC was 0.875 (95% CI 0.840-0.909) and MELD-Na AUC was 0.780 (95% CI 0.737-0.823). During follow-up, AUCs of both models declined to 0.833 (0.799-0.868) and 0.719 (0.677-0.761) respectively, which is still excellent for the ACLF-JM and respectable for the MELD-Na (also see Fig. S2A and S3). Fig. 2 shows that with increasing ACLF grade, JM performance remains significantly better than the declining MELD-Na (also see Table S3 and Fig. S3). The performance of the ACLF-JM was particularly good for 90-day mortality prediction in patients with ACLF grade 3, with AUCs ranging from 0.841 to 0.853, contrasting with the MELD-Na AUCs of between 0.613 and 0.693. AUCs for MELD-Na were (almost) equal when predicting 28-day mortality in patients with ACLF-3, ranging from 0.497 to 0.605. Importantly, the ACLF-JM also better estimated risks, i.e. is better calibrated, than the MELD-Na (Fig. S2B). With increasing ACLF grade, prediction errors were improved up to 37.6% (Fig. S3B). An accurate model is important for clinical decision making, because decisions are often based on risks. 18
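For readers less familiar with the AUC, the sketch below shows one simple way to compute it with a bootstrap confidence interval. It is a simplified stand-in, not the study method: it treats the outcome as a plain binary label and ignores censoring, whereas the study used time-dependent measures from the R package JM. The function names and example data are hypothetical.

```python
import random


def auc(scores, died):
    """Probability that a randomly chosen patient who died has a higher
    score than a randomly chosen survivor (ties count as one half)."""
    cases = [s for s, d in zip(scores, died) if d]
    controls = [s for s, d in zip(scores, died) if not d]
    if not cases or not controls:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def bootstrap_ci(scores, died, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    auc(scores, died)               # fail early if only one outcome class
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # resample patients
        d = [died[i] for i in idx]
        if 0 < sum(d) < n:          # skip resamples with a single class
            stats.append(auc([scores[i] for i in idx], d))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical predicted risks and observed 28-day outcomes (1 = died).
risk = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1]
died = [1, 1, 0, 0, 0, 0, 1, 0]
print(auc(risk, died), bootstrap_ci(risk, died))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking of patients who died above those who survived.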

ACLF-JM impact on the transplantation waiting list
To study the difference in survival prediction and subsequent allocation priority between the ACLF-JM and the MELD-Na, patients were followed up for the first 28 days. In total, 2,636 transplants were performed within this period. Fig. 3 shows the correlation plot between MELD-Na scores and ACLF-JM mortality estimates after 28 days of waiting list follow-up. For 2,186 patients (in green), transplantation priority was given according to both the ACLF-JM and MELD-Na, as estimated mortality without LT was highest. More interestingly, 450 patients (in blue) could possibly have been prioritized by the ACLF-JM, but not by MELD-Na. Importantly, although these patients had lower median MELD-Na scores, they also had 4-fold higher 28-day mortality rates, i.e. 13.1% vs. 3.1% (Table 3). Compared to the 450 MELD-Na-prioritized patients (orange), ACLF-JM-prioritized patients were older, more often female, had lower ACLF-1 rates, more NASH, less alcohol-induced liver disease and were more often dependent on life support. After 28 days, 190 patients were delisted due to increased disease severity. In these patients, the survival prediction AUCs (95% CI) for the ACLF-JM and MELD-Na score were 88.0 (85.1-90.9) and 82.5 (79.0-85.9), respectively (Fig. S6).

Clinical application of the ACLF-JM
After constructing and validating the ACLF-JM in this large cohort, an online application was developed, which allows clinicians to easily calculate individual patient survival probabilities based on the ACLF-JM. Available at: https://predictionmodels.shinyapps.io/aclf-jm/. Excel files with repeated MELD-Na measurements can be uploaded into this tool, to generate dynamic survival predictions during follow-up. The ACLF-JM simulates individual patient data to calculate personalized predictions. See supplement 1 for precise instructions for the data upload and supplement 2 for a step-by-step manual.

Discussion
In this study, we developed and validated the ACLF-JM prediction model, to estimate survival of patients with ACLF. We report several important findings. First, both current and past disease severity and its rate of change are strongly associated with survival in ACLF. Second, by using these data, the ACLF-JM gives excellent prediction performance, even in ACLF-3, and significantly outperforms MELD-Na. Third, the ACLF-JM could have prioritized patients with low median MELD-Na scores, i.e., not identified by MELD-Na, but with 4-fold higher mortality rates than MELD-Na-prioritized patients. Fourth, the ACLF-JM can be clinically applied online to estimate and visualize patient-specific survival, which can be updated with every new measurement.

ACLF disease severity is dynamic and can change rapidly. During the first week, disease severity changes for most patients, resulting in different survival outcomes. 9,10 The current liver allocation system does not consider change, as it uses only the most recent measurement for survival prediction and ignores previous data. Moreover, survival is estimated based on the MELD-Na score, which ignores relevant factors for ACLF and therefore underestimates mortality. 7,8 Hernaez et al. showed that mortality was higher than expected in low MELD-Na score patients. They also showed that, despite their high(er) ACLF grade, these low MELD-Na patients were often not considered for LT. 7 Interestingly, Hernaez et al. stated that "Future research should also focus on developing and validating prognostic scores that incorporate dynamic changes in patients clinical course", i.e. the goal of this study. Sundaram et al. showed that ACLF death and removal rate did not correlate well with the MELD-Na score, as mortality rates were highest in ACLF-3 patients with MELD-Na <25. 8 In this study, ACLF was present in 33.3% (Table 1) of the patients.
As a result, the MELD-Na underestimation of ACLF disease severity could be substantial, which possibly leads to unequal treatment access and excess mortality. 7

Therefore, the ACLF-JM was developed to predict ACLF patient survival based on disease development over time. The model provides several important improvements over the MELD-Na score (Table S4). 19 Most importantly, predictions are based on all available previous data and update for every new measurement. 20 Predictions should update based on accumulating evidence, because ACLF is a dynamic disease. The ACLF-JM can handle a varying number of measurements per patient and varying time between measurements, both of which are likely in waiting list data. At minimum, 1 measurement is required per patient to give a survival prediction. With more available measurements over time, increasingly accurate estimates can be made. The ACLF-JM also considers both the value of disease severity and the rate at which disease severity is changing (Fig. 1). It uses more nuanced aspects of ACLF disease development to predict survival. Thus, like a clinician, past and current disease developments are used to estimate patient prognosis. Updating prognosis is important in ACLF, as disease severity can increase rapidly and non-linearly (e.g. exponentially). 1,3 ACLF-JM survival predictions could therefore be used to aid clinical decision making for patients with ACLF on the waiting list for LT, as current models result in unequal transplantation access and post-LT survival rates. 8,10,16 Furthermore, in this cohort, we showed that ACLF-JM prioritization identified patients with low MELD-Na scores, but high mortality (Table 3). Mortality is underestimated in these patients and subsequently they receive a lower priority for LT. Since patients with ACLF benefit from fast LT, 16 use of the ACLF-JM for the evaluation of prognosis could perhaps help to resolve the underestimation of waiting list mortality in patients with ACLF. 7

The ACLF-JM showed excellent performance for the prediction of short-term survival at baseline and with increasing follow-up. Increasing ACLF grade did not lead to a decrease in predictive accuracy. This is important, because risk of death and need for LT should be reliably estimated in the sickest patients. Our data showed that both the ACLF-JM and MELD-Na AUCs declined with increasing follow-up. This is likely due to population changes, i.e. the sickest patients die or are transplanted first and fewer patients remain with increasing follow-up. 21 Also, with increasing disease severity, generally a shorter follow-up period is available. The ACLF-JM approximation of disease does not depend on the number of measurements per patient, because it estimates disease over time as a continuous trajectory (Fig. S4). This is important, because frequency of measurement confounded previous (Cox-based) survival predictions for patients in need of LT. 22 The ACLF-JM performed comparably to, or sometimes even better than, the CLIF-C OF score. 6 This could possibly indicate that ACLF-JM performance was adequate for clinical application. Because the UNOS registry does not contain data on white blood cell counts, CLIF-C ACLF scoring was not possible in this study. ACLF-JM performance could however be externally validated in the cohorts used to construct the CLIF-C scores. 6

The differences in waiting list prioritization between the ACLF-JM and MELD-Na were investigated for the first 28 days. 6 The results of this prioritization naturally depend on the chosen time period and we did not represent the complex reality of liver allocation. However, the goal was to illustrate how the ACLF-JM prioritized differently from the MELD-Na, because of its inherent use of disease development and rate of change over time. After training and ascertaining excellent performance, an online ACLF-JM tool was created for clinical use.
Especially in ACLF, both the patient and treating clinician benefit from patient-specific modelling, which shifts the focus of prediction from the population to the individual patient level. Jalan et al. already stated that there is a need for models that "update on a daily basis providing additional prognostic information", and that "currently, no validated evidence-based tools guide the decision-making". 6 The ACLF-JM meets these demands and more, with excellent performance leading to personalized prediction, readily available online for any clinician.
A limitation is that longitudinal MELD-Na measurements are not the ideal basis for modelling ACLF disease development, as they can underestimate ACLF disease severity. 7 Ideally, longitudinal CLIF-C ACLF score data would be available in the UNOS registry, but currently missing leucocyte counts prevent CLIF-C ACLF scoring. Further information on lactate levels and bacterial infection would be valuable to register for LT candidates. 23 The MELD-Na was one of the few consistently available longitudinal measurements, which allowed for analysis on a large scale and comparison to previous studies. The retrospective analysis of large databases also has several disadvantages. Misclassification of disease severity could introduce bias, e.g. subjective scoring of ascites and encephalopathy. Also, surrogate markers, suggested by authors of other large UNOS ACLF analyses, were used for ventilatory and circulatory failure. 6,8,10,16 For example, mechanical ventilation was used as a surrogate for respiratory failure. It is, however, entirely possible that a patient with respiratory failure did not receive mechanical ventilation, or vice versa. Despite these shortcomings, the ACLF-JM showed excellent performance with increasing disease severity (ACLF grade).
ACLF survival is dynamically predicted by the ACLF-JM prediction model, using both longitudinal and survival data. Updating prognosis on new measurements is important, as ACLF is a dynamic disease. The ACLF-JM prediction performance was excellent in this cohort, even in patients with ACLF-3. The ACLF-JM could therefore be used as a tool for the personalized evaluation of prognosis and clinical decision making in patients with ACLF.

Abbreviations
ACLF, acute-on-chronic liver failure; ACLF-JM, acute-on-chronic liver failure joint model; CLIF-C, Chronic Liver Failure-Consortium; INR, international normalized ratio for the prothrombin time; LT, liver transplantation; MELD-Na, model for end-stage liver disease-sodium; UNOS, United Network for Organ Sharing.

Financial support
The manuscript was not prepared by or funded in any part by a commercial organization. No financial support or grants were used for the preparation of this manuscript.