3 years ago

A Tutorial for Evaluating the Appropriateness of Cure Models

Geethanjalee Mudunkotuwa Durbadal Ghosh Subodh Selukar



Abstract

In survival analysis, traditional models assume that all individuals will eventually experience the event of interest. However, therapeutic advances have produced multiple clinical settings with potentially curative therapies, and in these settings some individuals may never experience the event. Statisticians have developed cure models as a methodology to address this challenge. Nevertheless, despite significant statistical advances in cure models, their adoption in biomedical applications has remained limited, and we hypothesize that this is due to a lack of guidance on the appropriate application of these models. Cure models require specific identifiability conditions for valid parameter estimation, and prior work has highlighted substantial problems arising from inappropriate application of these models. Existing tutorials on cure models focus on model implementation and either assume that a cure model is appropriate for the dataset at hand or provide only limited guidance on that question. This tutorial fills this gap by describing a systematic procedure that integrates clinical judgment, visual inspection of Kaplan-Meier curves, and quantitative evaluation.

One-sentence Summary

This tutorial addresses the limited biomedical adoption of cure models in survival analysis by presenting a systematic evaluation procedure that integrates clinical judgment, visual inspection of Kaplan-Meier curves, and quantitative evaluation to verify identifiability conditions and support valid parameter estimation, distinguishing the approach from prior implementation-focused guides that leave room for model misapplication.

Key Contributions

  • This tutorial presents a systematic evaluation framework to determine the statistical and clinical appropriateness of cure models for survival analyses involving potentially curative therapies and long-term survivorship.
  • The proposed procedure integrates expert clinical judgment, visual inspection of non-zero Kaplan-Meier curve plateaus, and formal quantitative testing to verify identifiability conditions prior to parameter estimation.
  • Empirical analyses demonstrate that cure model suitability depends on cohort risk profiles rather than follow-up duration alone, while also highlighting data quality degradation when clinical monitoring transitions from active trials to passive follow-up.

Introduction

Modern therapeutic advances have enabled long-term remission in several oncology populations, prompting statisticians to develop cure models that estimate a non-zero fraction of patients who will never experience the event of interest. These models improve long-term survival extrapolations and clinical trial planning, making them essential for accurate biomedical decision-making. However, cure models require strict identifiability conditions, particularly adequate follow-up duration to distinguish cured individuals from those still at risk. Prior methodological tutorials have focused heavily on model implementation while neglecting how to verify whether a dataset actually meets these prerequisites, which frequently leads to misapplication and unreliable estimates. The authors address this gap by introducing a systematic evaluation framework that integrates clinical judgment, visual inspection of Kaplan-Meier curves, and quantitative statistical checks. By validating this workflow with acute myeloid leukemia and hematopoietic cell transplantation data, they provide researchers with a practical guide to ensure cure models are only applied when biologically and statistically justified.

Dataset

Dataset Composition and Sources: The authors combine clinical trial data from the SWOG S1203 study with prospective and retrospective datasets from the Bone Marrow Transplantation & Cellular Therapy program at St. Jude Children's Research Hospital to evaluate cure model applicability in hematologic oncology.

Subset Details:

  • S1203 AML Trial: Focuses on the IA treatment arm for adult patients with previously untreated acute myeloid leukemia. Event-free survival is tracked from randomization, with a maximum follow-up exceeding 7.3 years.
  • St. Jude Prospective Cohorts: HAPNK1 includes 53 patients in complete remission and 19 with active disease from a phase 2 high-risk malignancy study. HAP2HCT includes 48 patients from phase 2 dose levels 3 to 4 of a phase 1/2 study.
  • St. Jude Retrospective Cohorts: HCTRETRO covers 106 patients receiving a second transplant and 13 receiving more than two. The Refractory at HCT cohort includes 129 patients with active disease at the time of transplant.

Data Usage and Modeling Approach: The authors use the cohorts to fit and evaluate cure survival models rather than for traditional machine learning training splits or mixture ratios. They apply a standardized assessment framework to test model appropriateness across diverse patient prognoses and follow-up protocols, calculating maximum follow-up duration in years and ranking model performance using Akaike Information Criterion values and RECeUS assessments.

Processing and Metadata Construction: Time-to-event metrics are standardized from the randomization date, and patients without observed events are right censored at their last contact. The authors compile cohort-level metadata that records sample sizes, median survival times, maximum follow-up in years, Kaplan-Meier estimates at the longest observation point, visual cure indicators, and the best-fitting model specification for each subgroup.
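The right-censoring step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the names `SurvivalRecord` and `build_record`, the 365.25-day year, and the example dates are all assumptions introduced here.

```python
from datetime import date
from typing import NamedTuple, Optional

DAYS_PER_YEAR = 365.25  # assumed convention for converting days to years


class SurvivalRecord(NamedTuple):
    time_years: float  # time from the origin to the event or to censoring
    event: bool        # True if the event was observed, False if right censored


def build_record(origin: date, last_contact: date,
                 event_date: Optional[date] = None) -> SurvivalRecord:
    """Turn raw dates into a (time, event) pair, censoring at last contact."""
    end = event_date if event_date is not None else last_contact
    if end < origin:
        raise ValueError("end date precedes the time origin")
    return SurvivalRecord((end - origin).days / DAYS_PER_YEAR,
                          event_date is not None)


# Hypothetical patients; the dates are invented for illustration only.
records = [
    # No event recorded: right censored at last contact.
    build_record(date(2015, 1, 1), date(2020, 1, 1)),
    # Event observed one year after the origin.
    build_record(date(2015, 1, 1), date(2016, 6, 1), event_date=date(2016, 1, 1)),
]

# Maximum follow-up is taken here as the largest observed time in the cohort.
max_follow_up_years = max(r.time_years for r in records)
```

The resulting `(time_years, event)` pairs are the usual input to a Kaplan-Meier estimator or a parametric survival fit.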

Method

The authors leverage a structured three-stage framework to assess the appropriateness of cure models for survival data. This process begins with clinical and biological evaluation, where expert judgment determines whether a cure model is biologically plausible and whether long-term survival without recurrence is expected. If both conditions are satisfied, the analysis proceeds to visual assessment using the Kaplan-Meier survival curve. This stage examines whether the survival curve exhibits a horizontal plateau at a level above zero, indicating a non-zero cure fraction, and whether late events are absent or rare, suggesting that susceptible individuals have largely experienced the event.
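To make the visual stage concrete, the sketch below computes a Kaplan-Meier curve in plain Python and reports a few numbers that a reader would otherwise judge by eye: the level of the curve in the tail, the time of the last observed event, and how much follow-up lies beyond it. The function names and the example data are hypothetical, and these summaries are only aids to inspection, not a formal criterion from the tutorial.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return [(event_time, S(t))] from the product-limit estimator."""
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # everyone leaving the risk set at t
    at_risk = len(times)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Subjects censored at t still count as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


def plateau_summary(times: Sequence[float],
                    events: Sequence[bool]) -> Dict[str, float]:
    """Summarize the tail of the curve to support visual inspection."""
    curve = kaplan_meier(times, events)
    last_event_time, tail_level = curve[-1] if curve else (0.0, 1.0)
    return {
        "tail_level": tail_level,
        "last_event_time": last_event_time,
        "plateau_length": max(times) - last_event_time,
        "censored_after_last_event": sum(
            1 for t, e in zip(times, events) if not e and t > last_event_time
        ),
    }


# Hypothetical cohort: three early events, then five long-term censored subjects.
times = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
events = [True, True, True, False, False, False, False, False]
summary = plateau_summary(times, events)
```

A tail level well above zero, together with a long stretch of follow-up and several censored subjects after the last event, is the pattern the visual stage looks for. A curve that is still dropping near the maximum follow-up time would argue against proceeding.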

As shown in the figure below, if the visual evidence supports a cure model, the process advances to a quantitative assessment. This final stage evaluates whether the data provide strong evidence for both a sufficient follow-up duration and a non-negligible cure fraction. The assessment involves fitting both a cure model and a standard non-cure model using the same parametric distribution, such as Weibull, and comparing them via Akaike Information Criterion (AIC). A cure model is selected if it yields a lower AIC. Subsequently, the estimated cure fraction $\theta$ and the ratio $\hat{r}$, which reflects the proportion of uncured individuals still at risk at the maximum follow-up time, are computed. The model is deemed appropriate only if $\theta > 0.025$ and $\hat{r} < 0.05$, ensuring both a clinically meaningful cure fraction and sufficient follow-up to observe the tail of the survival distribution. This framework integrates clinical insight, graphical evidence, and statistical inference to justify the use of cure models.
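The threshold check can be illustrated with a short Python sketch. It assumes a mixture cure model, in which population survival is the cure fraction plus the uncured share times the survival of the uncured, with a Weibull distribution for the uncured. It also assumes the ratio is the survival of the uncured divided by population survival at the maximum follow-up time. The function names and parameter values are hypothetical; only the 0.025 and 0.05 thresholds come from the text.

```python
import math
from typing import Dict, Union


def weibull_survival(t: float, shape: float, scale: float) -> float:
    """Weibull survival function S_u(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def receus_check(cure_fraction: float, shape: float, scale: float, t_max: float,
                 cure_threshold: float = 0.025,
                 ratio_threshold: float = 0.05) -> Dict[str, Union[float, bool]]:
    """Apply the two thresholds under an assumed Weibull mixture cure model."""
    if not 0.0 <= cure_fraction <= 1.0:
        raise ValueError("cure_fraction must lie in [0, 1]")
    s_uncured = weibull_survival(t_max, shape, scale)
    # Assumed mixture form: S(t) = theta + (1 - theta) * S_u(t)
    s_population = cure_fraction + (1.0 - cure_fraction) * s_uncured
    # With no cured subjects the two curves coincide, so the ratio is 1.
    ratio = s_uncured / s_population if s_population > 0 else 1.0
    return {
        "ratio": ratio,
        "appropriate": cure_fraction > cure_threshold and ratio < ratio_threshold,
    }


# Hypothetical parameter values, for illustration only (time in years).
long_follow_up = receus_check(cure_fraction=0.4, shape=1.0, scale=1.0, t_max=7.3)
short_follow_up = receus_check(cure_fraction=0.4, shape=1.0, scale=1.0, t_max=1.0)
```

With the same fitted parameters, the long follow-up case passes because almost no uncured subjects remain at risk, while the short follow-up case fails because the ratio is still close to 0.6. This shows why an apparently large cure fraction is not sufficient on its own.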

Experiment

The evaluation setup integrates visual inspection of Kaplan-Meier survival curves with quantitative hypothesis testing to assess follow-up adequacy and cure model appropriateness. Visual analysis demonstrates that extended observation periods produce distinct survival plateaus indicative of long-term remission, while restricted timelines obscure this pattern and compromise model validity. Quantitative frameworks, including two-step testing and the RECeUS method, consistently corroborate these visual findings by confirming that prolonged monitoring reliably supports cure modeling. Ultimately, the experiments conclude that retrospective datasets with sufficient follow-up duration can confidently justify cure model application, even when formal protocol-specified monitoring is absent.

The authors evaluate the appropriateness of cure models using a combination of visual and quantitative methods, including model selection via AIC and the RECeUS framework. A cure model is supported for some cohorts on the basis of sufficient follow-up and model fit, and judged inappropriate for others because of insufficient data or the outcome of model selection. RECeUS supports a cure model when the estimated cure fraction exceeds its threshold and the ratio of susceptible survival to population survival falls below its threshold. For the IA arm, AIC identifies the Weibull cure model as the best fit, and the Kaplan-Meier curve shows a long plateau, which indicates sufficient follow-up and supports the plausibility of a cure model.

The authors compare parametric cure and non-cure models using AIC to assess the appropriateness of a cure model for the IA arm. The Weibull cure model has the lowest AIC among all cure and non-cure models considered, making it the best-fitting candidate. Based on this selection, follow-up adequacy and the cure fraction are then evaluated with the RECeUS method, which confirms that a cure model is appropriate.
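The AIC comparison itself is simple arithmetic: AIC equals twice the number of parameters minus twice the maximized log-likelihood, and the candidate with the smallest value is preferred. The sketch below ranks a set of candidate fits. The log-likelihoods, parameter counts, and the lognormal candidates are placeholders invented for illustration, not results from the paper.

```python
from typing import Dict, List, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_likelihood


def rank_by_aic(fits: Dict[str, Tuple[float, int]]) -> List[Tuple[str, float]]:
    """Rank models (name -> (log-likelihood, n_params)) by ascending AIC."""
    scored = [(name, aic(ll, k)) for name, (ll, k) in fits.items()]
    return sorted(scored, key=lambda item: item[1])


# Placeholder values only; a cure model carries one extra parameter
# (the cure fraction) compared with its non-cure counterpart.
fits = {
    "weibull_noncure": (-512.0, 2),
    "weibull_cure": (-498.5, 3),
    "lognormal_noncure": (-505.0, 2),
    "lognormal_cure": (-499.0, 3),
}

ranking = rank_by_aic(fits)
best_name, best_aic = ranking[0]
```

Because the cure variant has an extra parameter, it is preferred only when its gain in log-likelihood outweighs the penalty of 2 that AIC adds for that parameter.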

The authors evaluate the appropriateness of cure models across datasets using both visual and quantitative methods. Datasets with longer follow-up and evidence of a survival plateau are more likely to support a cure model, as indicated by both visual inspection and the RECeUS method. RECeUS, which combines the cure fraction and follow-up sufficiency, gives a consistent assessment of appropriateness across datasets. Visual evidence of a plateau is a key indicator of suitability, particularly when combined with quantitative results.

The evaluation combines visual inspection of survival curves with quantitative model selection and follow-up adequacy assessments to determine the suitability of cure models across different datasets. The experiments indicate that cure models are appropriate primarily when datasets show sufficient follow-up, a clear survival plateau, a non-negligible estimated cure fraction, and a low ratio of susceptible to population survival. AIC comparisons identify the Weibull cure model as the best fit for the IA arm, and the RECeUS framework confirms appropriateness by combining the cure fraction threshold with follow-up sufficiency. Overall, the findings indicate that adequate longitudinal data and a distinct survival plateau are critical prerequisites for applying cure models in survival analysis.

