sksurv.datasets.load_breast_cancer

sksurv.datasets.load_breast_cancer()

Load and return the breast cancer dataset.
The dataset has 198 samples and 80 features. The endpoint is the presence of distant metastases, which occurred for 51 patients (25.8%).
See [1], [2] for further description.
Returns:
- x (pandas.DataFrame) – The measurements for each patient.
- y (structured array with 2 fields) –
  e.tdm: boolean indicating whether the endpoint has been reached (True) or the event time is right censored (False).
  t.tdm: time to distant metastasis (days).
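The return values described above can be inspected with a short sketch like the following. The helper names `event_summary` and `summarize_breast_cancer` are hypothetical and not part of scikit-survival; only `load_breast_cancer` and the field names `e.tdm` and `t.tdm` come from the description above. The helper iterates over records rather than using vectorized NumPy operations, so it also accepts a plain list of dictionaries.

```python
def event_summary(y, event_field="e.tdm", time_field="t.tdm"):
    """Count events and censored samples in a survival outcome.

    Hypothetical helper: accepts any iterable of records that can be
    indexed by field name, which includes the rows of a NumPy
    structured array such as the `y` returned by load_breast_cancer().
    """
    n_samples = 0
    n_events = 0
    max_time = 0.0
    for record in y:
        n_samples += 1
        # True means the endpoint was reached; False means right censored.
        if bool(record[event_field]):
            n_events += 1
        max_time = max(max_time, float(record[time_field]))
    return {
        "n_samples": n_samples,
        "n_events": n_events,
        "n_censored": n_samples - n_events,
        "event_rate": n_events / n_samples if n_samples else 0.0,
        "max_time_days": max_time,
    }


def summarize_breast_cancer():
    """Load the dataset and summarize its outcome (needs scikit-survival)."""
    # Imported lazily so that event_summary() is usable without sksurv.
    from sksurv.datasets import load_breast_cancer

    x, y = load_breast_cancer()
    print("feature matrix shape:", x.shape)
    print("outcome fields:", y.dtype.names)
    return event_summary(y)
```

With scikit-survival installed, calling `summarize_breast_cancer()` should report counts that agree with the sample and event figures quoted in the description above.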
References
[1] https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE7390
[2] Desmedt, C., Piette, F., Loi et al.: “Strong Time Dependence of the 76-Gene Prognostic Signature for Node-Negative Breast Cancer Patients in the TRANSBIG Multicenter Independent Validation Series.” Clin. Cancer Res. 13(11), 3207–14 (2007)