sksurv.metrics.cumulative_dynamic_auc

sksurv.metrics.cumulative_dynamic_auc(survival_train, survival_test, estimate, times, tied_tol=1e-08)

Computes the cumulative/dynamic area under the ROC curve (AUC) for right-censored data.

This metric evaluates a model’s performance at specific time points. The cumulative/dynamic AUC at time \(t\) quantifies how well a model can distinguish subjects who experience an event by time \(t\) (cases) from those who do not (controls). A higher AUC indicates better model performance.

This function can also evaluate models with time-dependent predictions, such as sksurv.ensemble.RandomSurvivalForest (see User Guide). In this case, estimate must be a 2D array where estimate[i, j] is the predicted risk score for the \(i\)-th instance at time point times[j].

The receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) are metrics to evaluate a binary classifier. Each point on the ROC curve denotes the performance of a binary classifier at a specific threshold, with the sensitivity (true positive rate) on the y-axis and one minus the specificity (false positive rate) on the x-axis.

ROC and AUC can be extended to survival analysis by defining cases and controls based on a time point \(t\). Cumulative cases are all individuals that experienced an event prior to or at time \(t\) (\(t_i \leq t\)), whereas dynamic controls are those with \(t_i > t\). Given an estimator of the \(i\)-th individual’s risk score \(\hat{f}(\mathbf{x}_i)\), the cumulative/dynamic AUC at time \(t\) is defined as

\[\widehat{\mathrm{AUC}}(t) = \frac{\sum_{i=1}^n \sum_{j=1}^n I(y_j > t) I(y_i \leq t) \omega_i I(\hat{f}(\mathbf{x}_j) \leq \hat{f}(\mathbf{x}_i))} {(\sum_{i=1}^n I(y_i > t)) (\sum_{i=1}^n I(y_i \leq t) \omega_i)}\]

where \(\omega_i\) are inverse probability of censoring weights (IPCW).
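The definition above can be checked numerically with a small NumPy sketch. It evaluates the displayed formula literally for a single time point and takes the weights as a precomputed input. The helper name cd_auc_at and the toy data are hypothetical, and the library's own implementation (including its handling of ties via tied_tol) may differ.

```python
import numpy as np


def cd_auc_at(t, y, risk, weights):
    """Evaluate the displayed AUC(t) formula for one time point.

    y: observed times, risk: predicted risk scores,
    weights: IPCW weights (assumed to be precomputed by the caller).
    """
    y = np.asarray(y, dtype=float)
    risk = np.asarray(risk, dtype=float)
    weights = np.asarray(weights, dtype=float)

    is_case = y <= t      # I(y_i <= t)
    is_control = y > t    # I(y_j > t)

    # concordant[i, j] = I(f(x_j) <= f(x_i))
    concordant = risk[np.newaxis, :] <= risk[:, np.newaxis]

    case_w = is_case * weights  # I(y_i <= t) * omega_i
    numerator = (
        case_w[:, np.newaxis] * is_control[np.newaxis, :] * concordant
    ).sum()
    denominator = is_control.sum() * case_w.sum()
    return numerator / denominator


# Toy data: earlier observed times have higher risk scores, and all
# weights are 1 (a hypothetical setting without censoring).
y = np.array([1.0, 2.0, 3.0, 4.0])
risk = np.array([0.9, 0.7, 0.4, 0.1])
weights = np.ones(4)

auc_t = cd_auc_at(2.5, y, risk, weights)  # 1.0 for this perfectly ranked data
```

Reversing the risk scores in this toy example makes every case rank below every control, so the same helper returns 0.0.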

The IPCW account for censoring. Computing them requires survival times from the training data, from which the censoring distribution is estimated. Note that survival times in survival_test must lie within the range of survival times in survival_train. This can be achieved by specifying times accordingly, e.g. by setting times[-1] slightly below the maximum expected follow-up time.

For time points in survival_test that lie outside of the range specified by values in survival_train, the probability of censoring is unknown and an exception will be raised:

ValueError: time must be smaller than largest observed time point

The censoring distribution is estimated using the Kaplan-Meier estimator, which assumes that censoring is random and independent of the features.
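One way to satisfy the range requirements described above is to derive times from percentiles of the observed test times. The following is a minimal sketch with hypothetical data; the percentile bounds (10 and 90) and the number of grid points are arbitrary choices, not values prescribed by the library.

```python
import numpy as np

# Hypothetical observed times; in practice these come from the time
# field of survival_train and survival_test.
train_time = np.array([2.0, 5.0, 8.0, 11.0, 15.0, 20.0, 24.0, 30.0])
test_time = np.array([3.0, 6.0, 9.0, 12.0, 18.0, 22.0])

# Test times should not extend past the largest training time.
assert test_time.max() < train_time.max()

# Evaluate between the 10th and 90th percentile of the test times, so
# that every evaluation time lies strictly inside the test follow-up range.
lower, upper = np.percentile(test_time, [10, 90])
times = np.linspace(lower, upper, num=5)
```

The resulting times array can then be passed as the times argument.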

The function also returns a summary measure, which is the mean of the \(\mathrm{AUC}(t)\) over the specified time range, weighted by the estimated survival function:

\[\overline{\mathrm{AUC}}(\tau_1, \tau_2) = \frac{1}{\hat{S}(\tau_1) - \hat{S}(\tau_2)} \int_{\tau_1}^{\tau_2} \widehat{\mathrm{AUC}}(t)\,d \hat{S}(t)\]

where \(\hat{S}(t)\) is the Kaplan–Meier estimator of the survival function.
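To make the weighting concrete, the integral can be approximated on a grid of time points. The sketch below is one possible discretisation, assuming that the survival mass is counted from \(\hat{S} = 1\) (that is, \(\tau_1\) is taken at the start of follow-up). The AUC and survival values are hypothetical, and the library's exact computation may differ.

```python
import numpy as np

# Hypothetical inputs: AUC(t) at three time points and the Kaplan-Meier
# survival estimate S(t) at the same time points.
auc = np.array([0.80, 0.75, 0.70])
surv = np.array([0.90, 0.70, 0.50])

# Survival mass lost on each interval, starting from S = 1.
mass = -np.diff(np.concatenate(([1.0], surv)))

# Weighted average of AUC(t), normalized by the total mass lost.
mean_auc = (auc * mass).sum() / (1.0 - surv[-1])
```

Time points at which the survival function drops more sharply contribute more to the summary measure.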

See the User Guide and [1], [2], [3] for further details.

Parameters:
  • survival_train (structured array, shape = (n_train_samples,)) – Survival times for the training data, used to estimate the censoring distribution. A structured array with two fields. The first field is a boolean where True indicates an event and False indicates right-censoring. The second field is a float with the time of event or time of censoring.

  • survival_test (structured array, shape = (n_samples,)) – Survival times for the test data. A structured array with two fields. The first field is a boolean where True indicates an event and False indicates right-censoring. The second field is a float with the time of event or time of censoring.

  • estimate (array-like, shape = (n_samples,) or (n_samples, n_times)) – Predicted risk scores for the test data (e.g., from estimator.predict(X)). A higher value indicates a higher risk of experiencing an event. If a 1D array is provided, the same risk score is used for all time points. If a 2D array is provided, estimate[:, j] is used for the \(j\)-th time point.

  • times (array-like, shape = (n_times,)) – The time points at which to compute the AUC. Values must be within the range of follow-up times in survival_test.

  • tied_tol (float, optional, default: 1e-8) – The tolerance value for considering ties in risk scores. If the absolute difference between two risk scores is smaller than or equal to tied_tol, they are considered tied.
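The structured arrays expected for survival_train and survival_test can be built directly with NumPy. This is a minimal sketch with hypothetical data. The field names "event" and "time" are illustrative, since the description above only fixes the order and types of the two fields. scikit-survival also provides sksurv.util.Surv.from_arrays for the same purpose.

```python
import numpy as np

# Hypothetical outcomes for four subjects.
event = np.array([True, False, True, True])  # False means right-censored
time = np.array([5.0, 8.0, 12.0, 3.0])       # time of event or censoring

# First field: boolean event indicator. Second field: float time.
y = np.empty(len(event), dtype=[("event", bool), ("time", float)])
y["event"] = event
y["time"] = time
```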

Returns:

  • auc (ndarray, shape = (n_times,)) – The cumulative/dynamic AUC estimates at each time point in times.

  • mean_auc (float) – The mean cumulative/dynamic AUC over the specified time range (times[0], times[-1]).

Notes

This metric expects risk scores, which are typically returned by estimator.predict(X) (for time-independent risks), or estimator.predict_cumulative_hazard_function(X) (for time-dependent risks). It does not accept survival probabilities.
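If only predicted survival probabilities are available, they need to be turned into scores where higher means higher risk before they can be used as estimate. One common conversion, sketched below with hypothetical numbers, uses the relation \(H(t) = -\log S(t)\) between the cumulative hazard and the survival function. This is an illustration under that assumption, not a step the library performs for you.

```python
import numpy as np

# Hypothetical predicted survival probabilities S(t | x) for three test
# subjects (rows) at two time points (columns).
surv_prob = np.array([
    [0.95, 0.80],
    [0.70, 0.40],
    [0.90, 0.85],
])

# Cumulative hazard H(t | x) = -log S(t | x): lower survival probability
# gives a larger value, i.e. a higher risk score.
risk_scores = -np.log(surv_prob)
```

The resulting array has shape (n_samples, n_times), so column j holds the risk scores for the j-th time point.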

See also

as_cumulative_dynamic_auc_scorer

A wrapper class that uses cumulative_dynamic_auc() in its score method instead of the default concordance_index_censored().

References