sksurv.datasets.load_gbsg2

sksurv.datasets.load_gbsg2(*, output_type='pandas')

Load and return the German Breast Cancer Study Group 2 dataset.

The dataset has 686 samples and 8 features. The endpoint is recurrence-free survival, which occurred for 299 patients (43.6%).

See [1], [2] for further description.

Parameters:

output_type ({"pandas", "polars"}, default="pandas") – Dataframe library used for the returned features.

Returns:

  • x (pandas.DataFrame or polars.DataFrame) – The measurements for each patient.

  • y (structured array with 2 fields) –

    cens: boolean event indicator; True if the endpoint has been reached, False if the event time is right-censored.

    time: total length of follow-up.
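The two-field layout of y can be illustrated with a small NumPy sketch. The records below are invented stand-ins, not values from the GBSG2 data, and the float64 dtype for time is an assumption. The sketch also assumes that True in cens marks an observed event, which is the usual scikit-survival convention.

```python
import numpy as np

# Toy stand-in for `y`: the values are invented for illustration only
# and are NOT taken from the GBSG2 dataset. The "<f8" dtype is assumed.
y = np.array(
    [(True, 1814.0), (False, 2018.0), (True, 712.0), (False, 1807.0)],
    dtype=[("cens", "?"), ("time", "<f8")],
)

# Fields are accessed by name, like columns.
event = y["cens"]   # assumed: True where the endpoint was reached
time = y["time"]    # total length of follow-up

# Number and share of samples with an observed event.
n_events = int(event.sum())
event_rate = n_events / len(y)

# Median follow-up among right-censored samples only.
median_censored_followup = float(np.median(time[~event]))
```

Applied to the array returned by load_gbsg2, the same indexing should give the event count and rate quoted in the description above.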

References