sksurv.metrics.brier_score
- sksurv.metrics.brier_score(survival_train, survival_test, estimate, times)
The time-dependent Brier score for right-censored data.
The time-dependent Brier score measures the inaccuracy of predicted survival probabilities at a given time point. It is the mean squared error between the true survival status and the predicted survival probability at time point \(t\). A lower Brier score indicates better model performance.
To account for censoring, this metric uses inverse probability of censoring weights (IPCW), which requires access to survival times from the training data to estimate the censoring distribution. Note that survival times in survival_test must lie within the range of survival times in survival_train. This can be achieved by specifying times accordingly, e.g. by setting times[-1] slightly below the maximum expected follow-up time.
For time points in survival_test that lie outside of the range specified by values in survival_train, the probability of censoring is unknown and an exception will be raised:
ValueError: time must be smaller than largest observed time point
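One way to keep times inside the admissible range is to derive the evaluation grid from the observed follow-up times. The following is a minimal sketch in plain NumPy, not part of the library: the helper name safe_time_grid, the field names event and time, and the 10th to 90th percentile bounds are all assumptions chosen for illustration.

```python
import numpy as np


def safe_time_grid(survival_train, survival_test, n_times=10, lower_pct=10, upper_pct=90):
    """Hypothetical helper: evaluation times that stay inside both follow-up ranges."""
    train_time = survival_train["time"]
    test_time = survival_test["time"]
    # Candidate grid taken from percentiles of the test follow-up times.
    grid = np.percentile(test_time, np.linspace(lower_pct, upper_pct, n_times))
    # Keep only points strictly below the largest observed time in both sets,
    # and not below the smallest test follow-up time.
    upper = min(train_time.max(), test_time.max())
    lower = test_time.min()
    grid = grid[(grid >= lower) & (grid < upper)]
    return np.unique(grid)


# Toy data; the field names "event" and "time" are an arbitrary choice.
dtype = [("event", bool), ("time", float)]
train = np.array([(True, 1.0), (False, 4.0), (True, 9.0), (False, 12.0)], dtype=dtype)
test = np.array([(True, 2.0), (True, 3.0), (False, 6.0), (True, 8.0)], dtype=dtype)

grid = safe_time_grid(train, test)
```

The resulting grid can then be passed as times, with estimate evaluated at the same points.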
The censoring distribution is estimated using the Kaplan-Meier estimator, which assumes that censoring is random and independent of the features.
The time-dependent Brier score at time \(t\) is defined as
\[\mathrm{BS}^c(t) = \frac{1}{n} \sum_{i=1}^n I(y_i \leq t \land \delta_i = 1) \frac{(0 - \hat{\pi}(t | \mathbf{x}_i))^2}{\hat{G}(y_i)} + I(y_i > t) \frac{(1 - \hat{\pi}(t | \mathbf{x}_i))^2}{\hat{G}(t)} ,\]
where \(\hat{\pi}(t | \mathbf{x})\) is the predicted survival probability up to the time point \(t\) for a feature vector \(\mathbf{x}\), and \(1/\hat{G}(t)\) is an inverse probability of censoring weight.
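To make the two terms of the definition concrete, the following is a minimal sketch in plain Python that evaluates the formula at a single time point. It is not the library's implementation: the function names are hypothetical, the Kaplan-Meier estimate of the censoring distribution assumes no ties between event and censoring times, and the error raised for a zero weight is an illustrative stand-in for the library's own range check.

```python
from bisect import bisect_right


def km_censoring_survival(events, times):
    """Kaplan-Meier estimate of G(t) = P(C > t), treating censoring as the 'event'.

    Simplified sketch: assumes event and censoring times are not tied.
    """
    uniq = sorted(set(times))
    steps = []
    surv = 1.0
    for u in uniq:
        at_risk = sum(1 for y in times if y >= u)
        n_censored = sum(1 for d, y in zip(events, times) if y == u and not d)
        surv *= 1.0 - n_censored / at_risk
        steps.append(surv)

    def G(t):
        # Right-continuous step function; equals 1 before the first observed time.
        k = bisect_right(uniq, t) - 1
        return 1.0 if k < 0 else steps[k]

    return G


def ipcw_brier_at(train_events, train_times, test_events, test_times, surv_probs, t):
    """Evaluate BS^c(t) for predicted survival probabilities surv_probs at time t."""
    G = km_censoring_survival(train_events, train_times)
    g_t = G(t)
    if g_t <= 0.0:
        # Illustrative guard: the censoring weight is undefined here.
        raise ValueError("censoring survival estimate is zero at t; choose a smaller t")
    total = 0.0
    for d, y, p in zip(test_events, test_times, surv_probs):
        if y <= t and d:
            # Event observed by time t: true status is 0, weight 1 / G(y_i).
            total += (0.0 - p) ** 2 / G(y)
        elif y > t:
            # Still event-free at t: true status is 1, weight 1 / G(t).
            total += (1.0 - p) ** 2 / g_t
        # Samples censored before t contribute nothing.
    return total / len(test_times)


events = [True, False, True, True, False]
times = [1.0, 2.0, 3.0, 4.0, 5.0]
surv_probs = [0.1, 0.5, 0.3, 0.8, 0.9]  # predicted P(T > 3.5) per sample
score = ipcw_brier_at(events, times, events, times, surv_probs, t=3.5)
```

In this toy example the sample censored at time 2 drops out of the sum, and its weight is redistributed to later samples through the factor 1/0.75.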
See the User Guide and [1] for details.
- Parameters:
survival_train (structured array, shape = (n_train_samples,)) – Survival times for the training data, used to estimate the censoring distribution. A structured array with two fields. The first field is a boolean where True indicates an event and False indicates right-censoring. The second field is a float with the time of event or time of censoring.
survival_test (structured array, shape = (n_samples,)) – Survival times for the test data. A structured array with two fields. The first field is a boolean where True indicates an event and False indicates right-censoring. The second field is a float with the time of event or time of censoring.
estimate (array-like, shape = (n_samples, n_times)) – Predicted survival probabilities for the test data at the time points specified by times, typically obtained from estimator.predict_survival_function(X). The value of estimate[:, i] must correspond to the estimated survival probability up to the time point times[i].
times (array-like, shape = (n_times,)) – The time points at which to compute the Brier score. Values must be within the range of follow-up times in survival_test.
- Returns:
times (ndarray, shape = (n_times,)) – The unique time points at which the Brier score was estimated.
brier_scores (ndarray, shape = (n_times,)) – The Brier score at each time point in times.
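The input layout described under Parameters can be illustrated with NumPy alone. This is a sketch under assumptions: the field names event and time are an arbitrary choice (the description only fixes the order and types of the two fields), and the numbers are made up.

```python
import numpy as np

# Event indicator first, time second; the field names are illustrative.
dtype = [("event", bool), ("time", float)]

event = np.array([True, False, True, True])
time = np.array([12.0, 30.5, 7.0, 21.0])

survival = np.empty(len(event), dtype=dtype)
survival["event"] = event
survival["time"] = time

# Predicted survival probabilities: one row per sample, one column per time point.
times = np.array([10.0, 20.0])
estimate = np.array([
    [0.95, 0.60],
    [0.98, 0.90],
    [0.40, 0.15],
    [0.90, 0.55],
])

# estimate[:, i] holds the survival probabilities at times[i].
assert estimate.shape == (len(survival), len(times))
```

Each row of estimate should be non-increasing from left to right, since it samples a survival function at increasing time points.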
Notes
This metric expects survival probabilities, which are typically returned by estimator.predict_survival_function(X). It does not accept risk scores.
Examples
>>> from sksurv.datasets import load_gbsg2
>>> from sksurv.linear_model import CoxPHSurvivalAnalysis
>>> from sksurv.metrics import brier_score
>>> from sksurv.preprocessing import OneHotEncoder
Load and prepare data.
>>> X, y = load_gbsg2()
>>> X["tgrade"] = X.loc[:, "tgrade"].map(len).astype(int)
>>> Xt = OneHotEncoder().fit_transform(X)
Fit a Cox model.
>>> est = CoxPHSurvivalAnalysis(ties="efron").fit(Xt, y)
Retrieve the individual survival functions and evaluate each one at 5 years (= 1825 days) to get the probability of remaining event-free up to that time.
>>> survs = est.predict_survival_function(Xt)
>>> preds = [fn(1825) for fn in survs]
Compute the Brier score at 5 years.
>>> times, score = brier_score(y, y, preds, 1825)
>>> print(score)
[0.20881843]
See also
integrated_brier_score – Computes the average Brier score over all time points.
References