sksurv.datasets.load_breast_cancer

sksurv.datasets.load_breast_cancer()

Load and return the breast cancer dataset.

The dataset has 198 samples and 80 features. The endpoint is the presence of distant metastases, which occurred for 51 patients (25.8%).

See [1], [2] for further description.

Returns:

  • x (pandas.DataFrame) – The measurements for each patient.

  • y (structured array with 2 fields) – e.tdm: boolean indicating whether the endpoint has been reached (True) or the event time is right censored (False).

    t.tdm: time to distant metastasis, or to censoring for patients without an event (days)

References