sksurv.datasets.load_cgvhd
- sksurv.datasets.load_cgvhd()
Load and return data from a multicentre randomized clinical trial initiated for patients with a myeloid malignancy who were to undergo an allogeneic bone marrow transplant.
The dataset is a 100-patient subsample of the full data set. See [2] for further details.
| Index | Name | Description | Encoding |
|-------|------|-------------|----------|
| 1 | dx | Diagnosis | AML=acute myeloid leukaemia, CML=chronic myeloid leukaemia |
| 2 | tx | Randomized treatment | BM=cells harvested from the bone marrow, PB=cells harvested from peripheral blood |
| 3 | extent | Extent of disease | L=limited, E=extensive |
| 4 | agvhdgd | Grade of acute GVHD | |
| 5 | age | Age | Years |
| 6 | survtime | Time from date of transplant to death or last follow-up | Years |
| 7 | reltime | Time from date of transplant to relapse or last follow-up | Years |
| 8 | agvhtime | Time from date of transplant to acute GVHD or last follow-up | Years |
| 9 | cgvhtime | Time from date of transplant to chronic GVHD or last follow-up | Years |
| 10 | stat | Status | 1=Dead, 0=Alive |
| 11 | rcens | Relapse | 1=Yes, 0=No |
| 12 | agvh | Acute GVHD | 1=Yes, 0=No |
| 13 | cgvh | Chronic GVHD | 1=Yes, 0=No |
| 14 | stnum | Patient ID | |
Columns 6, 7 and 9 contain the time to death, relapse and CGVHD in years (survtime, reltime, cgvhtime); the corresponding event indicators are in columns 10, 11 and 13 (stat, rcens, cgvh). The time of the earliest of these events is obtained by taking the minimum of the three observed times. The censoring variable cens is coded as 0 when no event was observed, 1 if CGVHD was observed as the first event, 2 if relapse was observed as the first event, and 3 if death occurred before either of the other two events. The endpoint (status) is therefore defined as:
| Value | Description | Count (%) |
|-------|-------------|-----------|
| 0 | Survival (Right-censored data) | 4 patients (4%) |
| 1 | Chronic graft versus host disease (CGVHD) | 86 events (86%) |
| 2 | Relapse (TRM) | 5 events (5%) |
| 3 | Death | 5 events (5%) |
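The endpoint derivation described above can be illustrated with a minimal NumPy sketch. It is not the library's own implementation: the toy values for four patients are hypothetical, and resolving ties in the order CGVHD, relapse, death is an assumption, since the text does not say how simultaneous events are handled.

```python
import numpy as np

# Hypothetical toy values for four patients (not taken from the real data).
# Times are in years; indicators are 1 if the event was observed, else 0.
survtime = np.array([5.0, 2.0, 3.0, 1.5])  # death or last follow-up
reltime = np.array([5.0, 2.0, 1.0, 1.5])   # relapse or last follow-up
cgvhtime = np.array([5.0, 0.5, 3.0, 1.5])  # chronic GVHD or last follow-up
stat = np.array([0, 1, 1, 1])
rcens = np.array([0, 0, 1, 0])
cgvh = np.array([0, 1, 0, 0])

# Earliest observed time per patient.
ftime = np.minimum.reduce([survtime, reltime, cgvhtime])

# An event counts as "first" if it was observed and its time equals ftime.
# np.select returns the code of the first matching condition, so ties are
# resolved in the order CGVHD, relapse, death (an assumption of this sketch).
status = np.select(
    [
        (cgvh == 1) & (cgvhtime == ftime),
        (rcens == 1) & (reltime == ftime),
        (stat == 1) & (survtime == ftime),
    ],
    [1, 2, 3],
    default=0,
)

print(ftime)   # earliest time per patient
print(status)  # endpoint code per patient
```

In this toy data the first patient has no observed event (code 0), the second develops CGVHD before dying (code 1), the third relapses first (code 2), and the fourth dies without a prior relapse or CGVHD (code 3).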
The dataset has been obtained from [1].
- Returns:
  - x (pandas.DataFrame) – The measurements for each patient.
  - y (structured array with 2 fields) –
    - status: Integer indicating the endpoint: 0: right censored data; 1: CGVHD; 2: relapse; 3: death.
    - ftime: total length of follow-up or time of event.
References