sksurv.metrics.concordance_index_ipcw

sksurv.metrics.concordance_index_ipcw(survival_train, survival_test, estimate, tau=None, tied_tol=1e-08)

Concordance index for right-censored data based on inverse probability of censoring weights.

This is an alternative to the estimator in concordance_index_censored() that does not depend on the distribution of censoring times in the test data. By using inverse probability of censoring weights (IPCW), it provides an unbiased and consistent estimate of the population concordance measure.

This estimator requires access to survival times from the training data to estimate the censoring distribution. Survival times in survival_test must lie within the range of survival times in survival_train, which can be achieved by specifying the truncation time tau. The resulting cindex then indicates how well the prediction model predicts events that occur in the time range from 0 to tau.

For time points in survival_test that lie outside of the range specified by values in survival_train, the probability of censoring is unknown and an exception will be raised:

ValueError: time must be smaller than largest observed time point
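The range requirement can be checked before calling the function. The following is a minimal sketch using only NumPy; the field names, the toy data, and the 80th-percentile heuristic for picking a truncation time are assumptions made for illustration, not library defaults.

```python
import numpy as np

# Field names "event" and "time" are illustrative; the function only
# requires a boolean field followed by a float field.
dt = np.dtype([("event", bool), ("time", float)])
survival_train = np.array(
    [(True, 2.0), (False, 5.0), (True, 7.5), (False, 9.0)], dtype=dt
)
survival_test = np.array([(True, 1.5), (True, 6.0), (False, 11.0)], dtype=dt)

max_train_time = survival_train["time"].max()

# Conservative check based on the wording of the error message:
# flag test times that are not smaller than the largest training time.
outside = survival_test["time"] >= max_train_time
needs_tau = bool(outside.any())

# Heuristic candidate for tau (an assumption of this sketch):
# the 80th percentile of the training times, which lies below the maximum here.
tau_candidate = float(np.quantile(survival_train["time"], 0.8))
print(needs_tau, tau_candidate < max_train_time)
```

If needs_tau is true, passing a truncation time such as tau_candidate is one way to keep the evaluation inside the range covered by the training data.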

The censoring distribution is estimated using the Kaplan-Meier estimator, which assumes that censoring is random and independent of the features.

See the User Guide and [1] for further description.

Parameters:

survival_train (structured array, shape = (n_train_samples,)) – Survival times for the training data, used to estimate the censoring distribution. A structured array with two fields. The first field is a boolean where True indicates an event and False indicates right-censoring. The second field is a float with the time of event or time of censoring.
survival_test (structured array, shape = (n_samples,)) – Survival times for the test data. A structured array with two fields. The first field is a boolean where True indicates an event and False indicates right-censoring. The second field is a float with the time of event or time of censoring.
estimate (array-like, shape = (n_samples,)) – Predicted risk scores for the test data (e.g., from estimator.predict(X)). A higher value indicates a higher risk of experiencing an event.
tau (float, optional) – Truncation time. The survival function for the underlying censoring time distribution \(D\) needs to be positive at tau, i.e., tau should be chosen such that the probability of being censored after time tau is non-zero: \(P(D > \tau) > 0\). If None, no truncation is performed.
tied_tol (float, optional, default: 1e-8) – The tolerance value for considering ties in risk scores. If the absolute difference between two risk scores is smaller than or equal to tied_tol, they are considered tied.
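
The tie rule described for tied_tol can be illustrated with a few lines of NumPy. This sketch only mirrors the rule as stated above (absolute difference smaller than or equal to tied_tol); it is not the library's internal code, and the risk scores are made up.

```python
import numpy as np

tied_tol = 1e-8
# Three hypothetical risk scores: the first two differ by about 5e-9.
risk = np.array([0.30, 0.30 + 5e-9, 0.31])

# Pairwise absolute differences between all risk scores.
diff = np.abs(risk[:, None] - risk[None, :])

# A pair counts as tied when its difference does not exceed the tolerance.
tied = diff <= tied_tol
print(tied[0, 1], tied[0, 2])
```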

Returns:

cindex (float) – The concordance index.
concordant (int) – The number of concordant pairs.
discordant (int) – The number of discordant pairs.
tied_risk (int) – The number of pairs with tied risk scores.
tied_time (int) – The number of comparable pairs with tied survival times.

Notes

This metric expects risk scores, which are typically returned by estimator.predict(X). It does not accept survival probabilities.