sksurv.datasets.load_whas500

sksurv.datasets.load_whas500()

Load and return the Worcester Heart Attack Study dataset.

The dataset has 500 samples and 14 features. The endpoint is death, which occurred for 215 patients (43.0%).

See [1], [2] for further description.

Returns:
  • x (pandas.DataFrame) – The measurements for each patient.
  • y (structured array with 2 fields) – The outcome for each patient:

    fstat: boolean that is True if the endpoint (death) was reached and False if the event time is right censored.

    lenfol: total length of follow-up (days from hospital admission date to date of last follow-up).

References

[1] https://web.archive.org/web/20170114043458/http://www.umass.edu/statdata/statdata/data/
[2] Hosmer, D., Lemeshow, S., May, S.: “Applied Survival Analysis: Regression Modeling of Time to Event Data.” John Wiley & Sons, Inc. (2008)