sksurv.metrics.cumulative_dynamic_auc

sksurv.metrics.cumulative_dynamic_auc(survival_train, survival_test, estimate, times, tied_tol=1e-08)

Estimator of cumulative/dynamic AUC for right-censored time-to-event data.

The receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) can be extended to survival data by defining sensitivity (true positive rate) and specificity (true negative rate) as time-dependent measures. Cumulative cases are all individuals that experienced an event prior to or at time \(t\) (\(t_i \leq t\)), whereas dynamic controls are those with \(t_i > t\). The associated cumulative/dynamic AUC quantifies how well a model can distinguish subjects who fail by a given time (\(t_i \leq t\)) from subjects who fail after this time (\(t_i > t\)).

Given an estimator of the \(i\)-th individual’s risk score \(\hat{f}(\mathbf{x}_i)\), the cumulative/dynamic AUC at time \(t\) is defined as

\[\widehat{\mathrm{AUC}}(t) = \frac{\sum_{i=1}^n \sum_{j=1}^n I(y_j > t) I(y_i \leq t) \omega_i I(\hat{f}(\mathbf{x}_j) \leq \hat{f}(\mathbf{x}_i))} {(\sum_{i=1}^n I(y_i > t)) (\sum_{i=1}^n I(y_i \leq t) \omega_i)}\]

where \(\omega_i\) are inverse probability of censoring weights (IPCW).
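The estimator above can be evaluated directly for a single time point once the weights are known. The following is a minimal NumPy sketch, not the library's implementation: the function name `cd_auc_at` is hypothetical, the weights \(\omega_i\) are assumed to be supplied by the caller (with weight zero for censored subjects, as is usual for IPCW), and ties are handled only through the plain \(\leq\) comparison from the formula rather than through `tied_tol`.

```python
import numpy as np


def cd_auc_at(t, y, risk, weights):
    """Evaluate the displayed AUC estimator at one time point t.

    y       : observed times y_i
    risk    : risk scores f(x_i)
    weights : omega_i, assumed to be precomputed by the caller
    """
    y = np.asarray(y, dtype=float)
    risk = np.asarray(risk, dtype=float)
    weights = np.asarray(weights, dtype=float)

    is_case = y <= t      # I(y_i <= t), cumulative cases
    is_control = y > t    # I(y_j > t), dynamic controls

    # ordered[i, j] is True when f(x_j) <= f(x_i)
    ordered = risk[np.newaxis, :] <= risk[:, np.newaxis]
    # pair[i, j] is True when i is a case and j is a control
    pair = is_case[:, np.newaxis] & is_control[np.newaxis, :]

    numerator = np.sum(weights[:, np.newaxis] * (pair & ordered))
    denominator = is_control.sum() * np.sum(weights[is_case])
    return numerator / denominator
```

With unit weights, a model that ranks every case above every control yields 1.0, and a model that ranks them in the opposite order yields 0.0.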

Estimating IPCW requires the survival times of the training data, from which the censoring distribution is estimated. Consequently, the survival times in survival_test must lie within the range of the survival times in survival_train. This can be achieved by specifying times accordingly, e.g. by setting times[-1] slightly below the maximum expected follow-up time. IPCW are computed using the Kaplan–Meier estimator, which is restricted to situations where the random censoring assumption holds and censoring is independent of the features.
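One way to choose evaluation times that stay inside the observed follow-up range is to take them from inner percentiles of the observed times. The sketch below is only an illustration under stated assumptions: the helper `pick_times` and its default percentiles (10 and 90) are hypothetical and not part of the library.

```python
import numpy as np


def pick_times(train_time, test_time, n_times=5, q=(10, 90)):
    """Return n_times evenly spaced time points inside both follow-up ranges.

    The bounds are the inner percentiles q of the training and test times,
    intersected so that the grid stays within the range of both samples.
    """
    train_lo, train_hi = np.percentile(train_time, q)
    test_lo, test_hi = np.percentile(test_time, q)
    lower = max(train_lo, test_lo)
    upper = min(train_hi, test_hi)
    return np.linspace(lower, upper, n_times)


train_time = np.array([1.0, 3.0, 4.0, 6.0, 9.0, 12.0, 15.0, 20.0, 26.0, 33.0])
test_time = np.array([2.0, 5.0, 7.0, 11.0, 13.0, 17.0, 19.0, 23.0, 29.0, 31.0])
times = pick_times(train_time, test_time)
```

The resulting `times` array could then be passed as the `times` argument; whether the chosen percentiles are appropriate depends on how much censoring occurs near the end of follow-up.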

The function also provides a single summary measure, the mean of \(\widehat{\mathrm{AUC}}(t)\) over the time range \((\tau_1, \tau_2)\):

\[\overline{\mathrm{AUC}}(\tau_1, \tau_2) = \frac{1}{\hat{S}(\tau_1) - \hat{S}(\tau_2)} \int_{\tau_1}^{\tau_2} \widehat{\mathrm{AUC}}(t)\,d \hat{S}(t)\]

where \(\hat{S}(t)\) is the Kaplan–Meier estimator of the survival function.
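On a finite grid of time points, the integral can be approximated by a weighted average in which each AUC value is weighted by the drop in the survival function leading up to that time point. The sketch below illustrates this idea only: the function name `mean_auc_from_grid`, the argument `surv_start` (the survival probability just before the first time point, assumed to be 1.0 by default), and the discretisation itself are assumptions, not a description of the library's internals.

```python
import numpy as np


def mean_auc_from_grid(auc, surv_at_times, surv_start=1.0):
    """Average AUC values, weighting each by the preceding drop in S(t).

    auc           : AUC estimates at the grid points
    surv_at_times : survival function estimates S(t) at the same grid points
    surv_start    : assumed survival probability before the first grid point
    """
    auc = np.asarray(auc, dtype=float)
    surv = np.asarray(surv_at_times, dtype=float)

    # Drop in S(t) between consecutive grid points (non-negative values).
    drops = -np.diff(np.concatenate(([surv_start], surv)))
    # Normalise by the total drop so that the weights sum to one.
    return np.sum(auc * drops) / (surv_start - surv[-1])
```

Because the weights sum to one, a constant AUC curve returns that constant, and time intervals in which many events occur contribute more to the summary.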

Parameters:
survival_train : structured array, shape = (n_train_samples,)

Survival times of the training data, from which the censoring distribution is estimated. A structured array containing the binary event indicator as the first field, and the time of event or time of censoring as the second field.

survival_test : structured array, shape = (n_samples,)

Survival times of the test data. A structured array containing the binary event indicator as the first field, and the time of event or time of censoring as the second field.

estimate : array-like, shape = (n_samples,)

Estimated risk of experiencing an event for each sample of the test data.

times : array-like, shape = (n_times,)

The time points for which the area under the time-dependent ROC curve is computed. Values must be within the range of follow-up times of the test data survival_test.

tied_tol : float, optional, default: 1e-8

The tolerance value for considering ties. If the absolute difference between two risk scores is less than or equal to tied_tol, the risk scores are considered tied.
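The structured arrays expected for survival_train and survival_test can be built with plain NumPy. In this minimal sketch the field names `event` and `time` and the sample values are illustrative assumptions; the description above only fixes the order of the fields (event indicator first, time second).

```python
import numpy as np

# Illustrative data: True marks an observed event, False a censored record.
event = np.array([True, False, True, True])
time = np.array([5.0, 8.0, 2.5, 11.0])

# Event indicator as the first field, observed time as the second field.
survival_test = np.empty(
    event.shape[0], dtype=[("event", bool), ("time", float)]
)
survival_test["event"] = event
survival_test["time"] = time
```

The same construction applies to survival_train, using the training records instead.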

Returns:
auc : array, shape = (n_times,)

The cumulative/dynamic AUC estimates (evaluated at times).

mean_auc : float

Summary measure referring to the mean cumulative/dynamic AUC over the specified time range (times[0], times[-1]).

References

[1] H. Uno, T. Cai, L. Tian, and L. J. Wei, “Evaluating prediction rules for t-year survivors with censored regression models,” Journal of the American Statistical Association, vol. 102, pp. 527–537, 2007.
[2] H. Hung and C. T. Chiang, “Estimation methods for time-dependent AUC models with survival data,” Canadian Journal of Statistics, vol. 38, no. 1, pp. 8–26, 2010.
[3] J. Lambert and S. Chevret, “Summary measure of discrimination in survival models based on cumulative/dynamic time-dependent ROC curves,” Statistical Methods in Medical Research, 2014.