# sksurv.linear_model.CoxPHSurvivalAnalysis

class sksurv.linear_model.CoxPHSurvivalAnalysis(alpha=0, ties='breslow', n_iter=100, tol=1e-09, verbose=0)

Cox proportional hazards model.

There are two choices for handling tied event times. The default, Breslow’s method, treats each of the events at a given time as distinct. Efron’s method is more accurate when there are many ties. When there are few ties, the coefficients estimated by the two methods are very close. The model is fitted by Newton-Raphson optimization.
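To make the difference between the two tie-handling methods concrete, the following is a minimal NumPy sketch of the Cox partial log-likelihood under both corrections, using the textbook formulas. The function name `partial_log_likelihood` and the toy data are assumptions for illustration; this is not the library's optimized implementation.

```python
import numpy as np


def partial_log_likelihood(eta, time, event, ties="breslow"):
    """Cox partial log-likelihood for linear predictors `eta`.

    Illustrative only: a direct loop over distinct event times.
    """
    eta = np.asarray(eta, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    exp_eta = np.exp(eta)

    loglik = 0.0
    for t in np.unique(time[event]):
        at_risk = time >= t            # risk set at time t
        died = (time == t) & event     # events tied at time t
        d = int(died.sum())
        risk_sum = exp_eta[at_risk].sum()
        tied_sum = exp_eta[died].sum()

        loglik += eta[died].sum()
        if ties == "breslow":
            # Every tied event sees the full risk set.
            loglik -= d * np.log(risk_sum)
        elif ties == "efron":
            # The k-th tied event sees the risk set with a fraction k/d
            # of the tied events removed.
            k = np.arange(d)
            loglik -= np.log(risk_sum - (k / d) * tied_sum).sum()
        else:
            raise ValueError(f"unknown ties method: {ties!r}")
    return loglik


eta = [0.5, -0.2, 0.1, 0.0]
event = [True, True, True, False]
# Two events tied at time 1: the two methods give different values.
ll_breslow = partial_log_likelihood(eta, [1, 1, 2, 3], event, ties="breslow")
ll_efron = partial_log_likelihood(eta, [1, 1, 2, 3], event, ties="efron")
```

Without tied event times, the inner sums reduce to the same expression, so both methods return the same value.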

See [1], [2], [3] for further description.

Parameters
• alpha (float, ndarray of shape (n_features,), optional, default: 0) – Regularization parameter for ridge regression penalty. If a single float, the same penalty is used for all features. If an array, there must be one penalty for each feature. If you want to include a subset of features without penalization, set the corresponding entries to 0.

• ties ("breslow" | "efron", optional, default: "breslow") – The method to handle tied event times. If there are no tied event times, both methods are equivalent.

• n_iter (int, optional, default: 100) – Maximum number of iterations.

• tol (float, optional, default: 1e-9) – Convergence criterion. Convergence is based on the relative change of the negative log-likelihood:

|1 - (new neg. log-likelihood / old neg. log-likelihood)| < tol


• verbose (int, optional, default: 0) – Specifies the amount of additional debug information printed during optimization.
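The stopping rule given for tol can be written as a one-line check. The sketch below is only an illustration of the formula; the helper name `has_converged` is hypothetical and not part of the library.

```python
def has_converged(old_nll, new_nll, tol=1e-9):
    """Relative-change stopping rule: |1 - new / old| < tol.

    `old_nll` and `new_nll` are the negative log-likelihood values of two
    consecutive iterations; `old_nll` is assumed to be non-zero.
    """
    return abs(1.0 - new_nll / old_nll) < tol


# A 1% decrease is far above the default tolerance, so iteration continues.
keep_going = not has_converged(100.0, 99.0)
# An unchanged value satisfies the criterion.
done = has_converged(100.0, 100.0)
```

Because the criterion is relative, the same tol applies regardless of the scale of the likelihood.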

Attributes
• coef_ (ndarray, shape = (n_features,)) – Coefficients of the model.

• cum_baseline_hazard_ (sksurv.functions.StepFunction) – Estimated baseline cumulative hazard function.

• baseline_survival_ (sksurv.functions.StepFunction) – Estimated baseline survival function.

See also

sksurv.linear_model.CoxnetSurvivalAnalysis – Cox proportional hazards model with l1 (LASSO) and l2 (ridge) penalty.

References

[1] Cox, D. R. Regression Models and Life-Tables (with discussion). Journal of the Royal Statistical Society, Series B 34 (1972): 187–220.

[2] Breslow, N. E. Covariance Analysis of Censored Survival Data. Biometrics 30 (1974): 89–99.

[3] Efron, B. The Efficiency of Cox’s Likelihood Function for Censored Data. Journal of the American Statistical Association 72 (1977): 557–565.

__init__(alpha=0, ties='breslow', n_iter=100, tol=1e-09, verbose=0)

Initialize self. See help(type(self)) for accurate signature.
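As described for the alpha parameter, a per-feature penalty array lets some features stay unpenalized. The sketch below only builds such an array with NumPy; the feature names and the penalty value 0.1 are made up for illustration.

```python
import numpy as np

feature_names = ["age", "bmi", "treatment"]  # hypothetical column names
unpenalized = {"treatment"}                  # features to leave unpenalized
penalty = 0.1                                # arbitrary ridge penalty

# One entry per feature; an entry of 0 disables the penalty for that feature.
alpha = np.array(
    [0.0 if name in unpenalized else penalty for name in feature_names]
)
```

An array built this way has shape (n_features,) and could be passed as `CoxPHSurvivalAnalysis(alpha=alpha)`, provided the order of its entries matches the column order of X.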

Methods

• __init__([alpha, ties, n_iter, tol, verbose]) – Initialize self.

• fit(X, y) – Minimize negative partial log-likelihood for provided data.

• predict(X) – Predict risk scores.

• predict_cumulative_hazard_function(X) – Predict cumulative hazard function.

• predict_survival_function(X) – Predict survival function.

• score(X, y) – Returns the concordance index of the prediction.

fit(X, y)

Minimize negative partial log-likelihood for provided data.

Parameters
• X (array-like, shape = (n_samples, n_features)) – Data matrix.

• y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.

Returns

self – The fitted estimator.
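The structured array expected for y can be built directly with NumPy. The following is a minimal sketch with made-up data; the field names `event` and `time` are assumptions, since the description above only fixes the order of the two fields.

```python
import numpy as np

event = [True, False, True, True]   # True = event observed, False = censored
time = [5.0, 12.0, 3.5, 8.0]        # time of event or time of censoring

# First field: binary event indicator; second field: observed time.
y = np.array(
    list(zip(event, time)),
    dtype=[("event", bool), ("time", float)],
)
```

scikit-survival also ships a helper, `sksurv.util.Surv.from_arrays(event, time)`, that produces an equivalent array.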

predict(X)

Predict risk scores.

Parameters

X (array-like, shape = (n_samples, n_features)) – Data matrix.

Returns

risk_score – Predicted risk scores.

Return type

array, shape = (n_samples,)
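In a Cox model the risk score is commonly taken to be the linear predictor $$x^\top \beta$$, the quantity that appears inside the exponential in the hazard formulas of this class. The sketch below assumes that this is what predict returns. The coefficients and data are made up.

```python
import numpy as np

coef = np.array([0.8, -0.5])   # hypothetical fitted coefficients (coef_)
X = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
])

# Linear predictor: one risk score per sample.
risk_score = X @ coef

# exp(risk score) is the hazard relative to an individual with x = 0.
hazard_ratio = np.exp(risk_score)

# Samples ordered from highest to lowest predicted risk.
ranking = np.argsort(-risk_score)
```

A higher score means a higher predicted risk of the event; only the ordering matters for the concordance index.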

predict_cumulative_hazard_function(X)

Predict cumulative hazard function.

The cumulative hazard function for an individual with feature vector $$x$$ is defined as

$H(t \mid x) = \exp(x^\top \beta) H_0(t) ,$

where $$H_0(t)$$ is the baseline cumulative hazard function, estimated by Breslow’s estimator.
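The definition above can be traced numerically. The following sketch implements the textbook Breslow estimator of $$H_0(t)$$ with NumPy and scales it by $$\exp(x^\top \beta)$$. The function name, the toy data, and the linear predictor value 0.3 are assumptions for illustration; details of the library's own estimator may differ.

```python
import numpy as np


def breslow_cumulative_baseline_hazard(eta, time, event):
    """Return distinct event times and the Breslow estimate of H_0 at them."""
    eta = np.asarray(eta, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    exp_eta = np.exp(eta)

    event_times = np.unique(time[event])
    # Increment at each event time:
    # (number of events) / (sum of exp(eta) over the risk set).
    increments = np.array(
        [((time == t) & event).sum() / exp_eta[time >= t].sum()
         for t in event_times]
    )
    return event_times, np.cumsum(increments)


# Toy training data (made up for illustration).
eta_train = np.array([0.2, -0.1, 0.4, 0.0])
time_train = np.array([2.0, 3.0, 3.0, 5.0])
event_train = np.array([True, True, False, True])

times, H0 = breslow_cumulative_baseline_hazard(eta_train, time_train, event_train)

# Cumulative hazard of a new individual with linear predictor x^T beta = 0.3.
H_x = np.exp(0.3) * H0
```

When all linear predictors are zero, the increments reduce to events divided by the number at risk, which is the Nelson-Aalen estimator.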

Parameters

X (array-like, shape = (n_samples, n_features)) – Data matrix.

Returns

cum_hazard – Predicted cumulative hazard functions.

Return type

ndarray of sksurv.functions.StepFunction, shape = (n_samples,)

Examples

>>> import matplotlib.pyplot as plt
>>> from sksurv.datasets import load_whas500
>>> from sksurv.linear_model import CoxPHSurvivalAnalysis

Load the data.

>>> X, y = load_whas500()
>>> X = X.astype(float)


Fit the model.

>>> estimator = CoxPHSurvivalAnalysis().fit(X, y)


Estimate the cumulative hazard function for the first 10 samples.

>>> chf_funcs = estimator.predict_cumulative_hazard_function(X.iloc[:10])


Plot the estimated cumulative hazard functions.

>>> for fn in chf_funcs:
...     plt.step(fn.x, fn(fn.x), where="post")
...
>>> plt.ylim(0, 1)
>>> plt.show()

predict_survival_function(X)

Predict survival function.

The survival function for an individual with feature vector $$x$$ is defined as

$S(t \mid x) = S_0(t)^{\exp(x^\top \beta)} ,$

where $$S_0(t)$$ is the baseline survival function, estimated by Breslow’s estimator.
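The survival formula is consistent with the cumulative hazard formula $$H(t \mid x) = \exp(x^\top \beta) H_0(t)$$ under the standard relation $$S(t) = \exp(-H(t))$$. The sketch below checks this numerically; the baseline values and the linear predictor are made up, and the relation $$S_0(t) = \exp(-H_0(t))$$ is an assumption about how the baseline survival function is obtained.

```python
import numpy as np

# Hypothetical baseline cumulative hazard at three time points.
H0 = np.array([0.1, 0.25, 0.6])
risk_score = 0.7  # hypothetical linear predictor x^T beta

# Baseline survival, assuming S_0(t) = exp(-H_0(t)).
S0 = np.exp(-H0)

# Survival function of the individual: S_0(t) ** exp(x^T beta).
S = S0 ** np.exp(risk_score)

# The same values via the cumulative hazard: exp(-exp(x^T beta) * H_0(t)).
S_via_hazard = np.exp(-np.exp(risk_score) * H0)
```

Both routes give the same curve, which starts near 1 and decreases as the cumulative hazard grows.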

Parameters

X (array-like, shape = (n_samples, n_features)) – Data matrix.

Returns

survival – Predicted survival functions.

Return type

ndarray of sksurv.functions.StepFunction, shape = (n_samples,)

Examples

>>> import matplotlib.pyplot as plt
>>> from sksurv.datasets import load_whas500
>>> from sksurv.linear_model import CoxPHSurvivalAnalysis

Load the data.

>>> X, y = load_whas500()
>>> X = X.astype(float)


Fit the model.

>>> estimator = CoxPHSurvivalAnalysis().fit(X, y)


Estimate the survival function for the first 10 samples.

>>> surv_funcs = estimator.predict_survival_function(X.iloc[:10])


Plot the estimated survival functions.

>>> for fn in surv_funcs:
...     plt.step(fn.x, fn(fn.x), where="post")
...
>>> plt.ylim(0, 1)
>>> plt.show()

score(X, y)

Returns the concordance index of the prediction.

Parameters
• X (array-like, shape = (n_samples, n_features)) – Test samples.

• y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.

Returns

cindex – Estimated concordance index.

Return type

float
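For intuition about what the concordance index measures, the following is a simplified pure-NumPy sketch of Harrell's c-index. It ignores pairs with tied times and is not the library's implementation; the function name and the data are assumptions for illustration.

```python
import numpy as np


def concordance_index(time, event, risk_score):
    """Fraction of comparable pairs ordered correctly by the risk score.

    A pair (i, j) is comparable if sample i had an event and
    time[i] < time[j]. It is concordant if risk_score[i] > risk_score[j];
    pairs with equal risk scores count as one half.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk_score = np.asarray(risk_score, dtype=float)

    concordant = 0.0
    comparable = 0
    for i in np.flatnonzero(event):
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk_score[i] > risk_score[j]:
                    concordant += 1.0
                elif risk_score[i] == risk_score[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


time = [1.0, 2.0, 3.0, 4.0]
event = [True, True, True, True]
perfect = concordance_index(time, event, [4.0, 3.0, 2.0, 1.0])
reversed_order = concordance_index(time, event, [1.0, 2.0, 3.0, 4.0])
uninformative = concordance_index(time, event, [1.0, 1.0, 1.0, 1.0])
```

A value of 1 means the highest risk scores always belong to the earliest events, 0.5 corresponds to an uninformative model, and 0 to a perfectly reversed ordering. scikit-survival provides `sksurv.metrics.concordance_index_censored` for computing the concordance index on censored data.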