sksurv.linear_model.CoxPHSurvivalAnalysis
Cox proportional hazards model.
There are two possible choices for handling tied event times. The default is Breslow’s method, which treats each of the events at a given time as distinct. Efron’s method is more accurate if there is a large number of ties. When the number of ties is small, the coefficients estimated by Breslow’s and Efron’s methods are quite close. The model is fitted using Newton-Raphson optimization.
See [1], [2], [3] for further description.
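To make the difference between the two tie-handling methods concrete, the following is a minimal sketch of the Cox partial log-likelihood under both approximations. It is not the library's implementation; the function name `partial_log_likelihood` and its arguments are hypothetical, and `eta` stands for the linear predictors (feature vector times coefficients) of each sample.

```python
import math


def partial_log_likelihood(time, event, eta, ties="breslow"):
    """Cox partial log-likelihood for given linear predictors (illustrative only)."""
    if ties not in ("breslow", "efron"):
        raise ValueError("ties must be 'breslow' or 'efron'")
    loglik = 0.0
    # Visit each distinct time at which at least one event occurred.
    for t in sorted({ti for ti, ei in zip(time, event) if ei}):
        # Sum of exp(eta) over everyone still at risk at time t.
        risk_sum = sum(math.exp(h) for ti, h in zip(time, eta) if ti >= t)
        # Linear predictors of the samples with an event at exactly t.
        tied = [h for ti, ei, h in zip(time, event, eta) if ei and ti == t]
        d = len(tied)
        tied_sum = sum(math.exp(h) for h in tied)
        loglik += sum(tied)
        for k in range(d):
            if ties == "breslow":
                # Breslow: every tied event sees the full risk set.
                loglik -= math.log(risk_sum)
            else:
                # Efron: tied events are progressively down-weighted.
                loglik -= math.log(risk_sum - (k / d) * tied_sum)
    return loglik
```

Without tied event times, each event time has exactly one event, so both branches subtract the same term and the two methods coincide.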
alpha (float, ndarray of shape (n_features,), optional, default: 0) – Regularization parameter for ridge regression penalty. If a single float, the same penalty is used for all features. If an array, there must be one penalty for each feature. If you want to include a subset of features without penalization, set the corresponding entries to 0.
ties ("breslow" | "efron", optional, default: "breslow") – The method to handle tied event times. If there are no tied event times, both methods are equivalent.
n_iter (int, optional, default: 100) – Maximum number of iterations.
tol (float, optional, default: 1e-9) – Convergence criterion. Convergence is based on the relative change of the negative log-likelihood between successive iterations:
|1 - (new neg. log-likelihood / old neg. log-likelihood)| < tol
verbose (int, optional, default: 0) – Specifies the amount of additional debug information during optimization.
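The convergence criterion for tol can be written as a one-line check. This is only a sketch of the stated formula; the helper name `has_converged` is hypothetical and not part of the library.

```python
def has_converged(old_nll, new_nll, tol=1e-9):
    """Relative change of the negative log-likelihood is below tol."""
    # Mirrors the documented rule: |1 - new / old| < tol.
    return abs(1.0 - new_nll / old_nll) < tol
```

In an iterative optimizer, a check like this would typically be evaluated once per iteration, with n_iter acting as an upper bound on the number of iterations.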
coef_
Coefficients of the model
ndarray, shape = (n_features,)
cum_baseline_hazard_
Estimated baseline cumulative hazard function.
sksurv.functions.StepFunction
baseline_survival_
Estimated baseline survival function.
See also
sksurv.linear_model.CoxnetSurvivalAnalysis
Cox proportional hazards model with l1 (LASSO) and l2 (ridge) penalty.
References
[1] Cox, D. R. Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B 34 (1972): 187–220.
[2] Breslow, N. E. Covariance Analysis of Censored Survival Data. Biometrics 30 (1974): 89–99.
[3] Efron, B. The Efficiency of Cox’s Likelihood Function for Censored Data. Journal of the American Statistical Association 72 (1977): 557–565.
__init__
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__([alpha, ties, n_iter, tol, verbose]) – Initialize self.
fit(X, y) – Minimize negative partial log-likelihood for provided data.
predict(X) – Predict risk scores.
predict_cumulative_hazard_function(X) – Predict cumulative hazard function.
predict_survival_function(X) – Predict survival function.
score(X, y) – Returns the concordance index of the prediction.
Attributes
X (array-like, shape = (n_samples, n_features)) – Data matrix
y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.
self
X (array-like, shape = (n_samples, n_features)) – Data matrix.
risk_score – Predicted risk scores.
array, shape = (n_samples,)
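As a rough illustration of what a risk score is in a Cox model, the sketch below computes the linear predictor, the dot product of each feature vector with the coefficient vector. This is an assumption about how the scores relate to coef_, not a statement about the library's internals, and the helper name `risk_scores` is hypothetical.

```python
def risk_scores(X, coef):
    """Linear predictor x . beta for each row of X (illustrative only)."""
    return [sum(x_j * b_j for x_j, b_j in zip(row, coef)) for row in X]
```

Under this reading, a higher score means a higher hazard, and the exponential of the difference between two scores is the hazard ratio between the two samples.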
The cumulative hazard function for an individual with feature vector \(x\) is defined as
\[H(t \mid x) = \exp(x^\top \beta) H_0(t),\]
where \(H_0(t)\) is the cumulative baseline hazard function, estimated by Breslow’s estimator.
cum_hazard – Predicted cumulative hazard functions.
ndarray of sksurv.functions.StepFunction, shape = (n_samples,)
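The following sketch shows how a Breslow-type estimate of the cumulative baseline hazard can be scaled to an individual. It assumes the textbook form of Breslow's estimator and the proportional hazards relation (individual cumulative hazard equals exp of the linear predictor times the baseline); the function names are hypothetical and the code is not taken from the library.

```python
import math


def breslow_cumulative_baseline_hazard(time, event, eta):
    """Return (event_times, h0), where h0[k] is the estimate at event_times[k]."""
    event_times = sorted({t for t, e in zip(time, event) if e})
    h0, total = [], 0.0
    for t in event_times:
        # Number of events observed at exactly time t.
        d = sum(1 for ti, ei in zip(time, event) if ei and ti == t)
        # Sum of exp(eta) over the samples still at risk at time t.
        risk_sum = sum(math.exp(h) for ti, h in zip(time, eta) if ti >= t)
        total += d / risk_sum
        h0.append(total)
    return event_times, h0


def cumulative_hazard(h0, eta_new):
    """Scale the baseline by exp(eta_new) for one individual."""
    return [math.exp(eta_new) * h for h in h0]
```

With all linear predictors equal to zero, this estimate reduces to the Nelson-Aalen estimator, which is a convenient sanity check.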
Examples
>>> import matplotlib.pyplot as plt
>>> from sksurv.datasets import load_whas500
>>> from sksurv.linear_model import CoxPHSurvivalAnalysis
Load the data.
>>> X, y = load_whas500()
>>> X = X.astype(float)
Fit the model.
>>> estimator = CoxPHSurvivalAnalysis().fit(X, y)
Estimate the cumulative hazard function for the first 10 samples.
>>> chf_funcs = estimator.predict_cumulative_hazard_function(X.iloc[:10])
Plot the estimated cumulative hazard functions.
>>> for fn in chf_funcs:
...     plt.step(fn.x, fn(fn.x), where="post")
...
>>> plt.ylim(0, 1)
>>> plt.show()
The survival function for an individual with feature vector \(x\) is defined as
\[S(t \mid x) = S_0(t)^{\exp(x^\top \beta)},\]
where \(S_0(t)\) is the baseline survival function, estimated by Breslow’s estimator.
survival – Predicted survival functions.
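A short sketch of how an individual survival curve follows from the baseline under the proportional hazards assumption: the baseline survival is raised to the power exp of the linear predictor. The helper names are hypothetical, and the relation between baseline survival and cumulative baseline hazard used here (survival equals exp of minus the cumulative hazard) is the standard textbook identity rather than something stated on this page.

```python
import math


def baseline_survival_from_hazard(h0):
    """S_0(t) = exp(-H_0(t)) at each time point."""
    return [math.exp(-h) for h in h0]


def survival_function(s0, eta):
    """S(t | x) = S_0(t) ** exp(eta), with eta the linear predictor of x."""
    return [s ** math.exp(eta) for s in s0]
```

A linear predictor of zero returns the baseline curve unchanged, while a positive value pushes the curve downward.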
Estimate the survival function for the first 10 samples.
>>> surv_funcs = estimator.predict_survival_function(X.iloc[:10])
Plot the estimated survival functions.
>>> for fn in surv_funcs:
...     plt.step(fn.x, fn(fn.x), where="post")
...
>>> plt.ylim(0, 1)
>>> plt.show()
X (array-like, shape = (n_samples, n_features)) – Test samples.
cindex – Estimated concordance index.
float
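For intuition about the concordance index, here is a simplified sketch in the style of Harrell's C. It counts pairs in which the sample with the shorter observed time had an event, and checks whether that sample also received the higher risk score. The function name is hypothetical, and the handling of tied times is deliberately simplified, so results may differ from the library's own computation.

```python
def concordance_index(time, event, risk):
    """Fraction of comparable pairs ordered correctly by risk (illustrative only)."""
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored sample cannot be the earlier member of a pair
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0  # earlier event has higher risk
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied risk scores count as half
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ordering, and 0.0 means every pair is ranked in reverse.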