sksurv.svm.FastSurvivalSVM

class sksurv.svm.FastSurvivalSVM(alpha=1, *, rank_ratio=1.0, fit_intercept=False, max_iter=20, verbose=False, tol=None, optimizer=None, random_state=None, timeit=False)

Efficient training of a linear survival support vector machine.

Training data consists of n triplets $$(\mathbf{x}_i, y_i, \delta_i)$$, where $$\mathbf{x}_i$$ is a d-dimensional feature vector, $$y_i > 0$$ the survival time or time of censoring, and $$\delta_i \in \{0,1\}$$ the binary event indicator. Using the training data, the objective is to minimize the following function:

\begin{align}\begin{aligned} \arg \min_{\mathbf{w}, b} \frac{1}{2} \mathbf{w}^\top \mathbf{w} + \frac{\alpha}{2} \left[ r \sum_{i,j \in \mathcal{P}} \max(0, 1 - (\mathbf{w}^\top \mathbf{x}_i - \mathbf{w}^\top \mathbf{x}_j))^2 + (1 - r) \sum_{i=1}^n \left( \zeta_{\mathbf{w}, b} (y_i, \mathbf{x}_i, \delta_i) \right)^2 \right]\\\begin{split}\zeta_{\mathbf{w},b} (y_i, \mathbf{x}_i, \delta_i) = \begin{cases} \max(0, y_i - \mathbf{w}^\top \mathbf{x}_i - b) \quad \text{if $\delta_i = 0$,} \\ y_i - \mathbf{w}^\top \mathbf{x}_i - b \quad \text{if $\delta_i = 1$,} \end{cases}\end{split}\\\mathcal{P} = \{ (i, j) \mid y_i > y_j \land \delta_j = 1 \}_{i,j=1,\dots,n}\end{aligned}\end{align}

The hyper-parameter $$\alpha > 0$$ determines the amount of regularization to apply: a smaller value increases the amount of regularization and a larger value reduces it. The hyper-parameter $$r \in [0, 1]$$ determines the trade-off between the ranking objective and the regression objective. If $$r = 1$$, the objective reduces to the ranking objective, and if $$r = 0$$, to the regression objective. If the regression objective is used, survival/censoring times are log-transformed and thus cannot be zero or negative.
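To make the roles of $$\alpha$$ and $$r$$ concrete, the objective can be evaluated by brute force for a fixed weight vector. The following is a minimal sketch in plain Python, not the library's implementation: the function name `survival_svm_objective` is hypothetical, the pair enumeration is a naive O(n²) loop, and it assumes the regression target is the natural logarithm of the time, in line with the log-transform note above.

```python
import math


def survival_svm_objective(w, b, X, time, event, alpha=1.0, rank_ratio=1.0):
    """Evaluate the objective for fixed w and b (illustrative, O(n^2))."""
    n = len(X)
    # Linear predictions w^T x_i, without the intercept.
    pred = [sum(wk * xk for wk, xk in zip(w, x)) for x in X]

    # Ranking term: squared hinge loss over the comparable pairs
    # P = {(i, j) | y_i > y_j and delta_j = 1}.
    rank_loss = 0.0
    for i in range(n):
        for j in range(n):
            if time[i] > time[j] and event[j]:
                rank_loss += max(0.0, 1.0 - (pred[i] - pred[j])) ** 2

    # Regression term: zeta is one-sided for censored samples (delta_i = 0)
    # and two-sided for observed events (delta_i = 1).
    # Assumption: the target is log(time), so times must be positive.
    reg_loss = 0.0
    if rank_ratio < 1.0:
        for i in range(n):
            residual = math.log(time[i]) - pred[i] - b
            zeta = residual if event[i] else max(0.0, residual)
            reg_loss += zeta ** 2

    penalty = 0.5 * sum(wk * wk for wk in w)
    mixed = rank_ratio * rank_loss + (1.0 - rank_ratio) * reg_loss
    return penalty + 0.5 * alpha * mixed


# Toy data: three samples, one feature, the last sample is censored.
X = [[1.0], [2.0], [3.0]]
time = [1.0, 2.0, 3.0]
event = [True, True, False]

ranking_only = survival_svm_objective([0.0], 0.0, X, time, event, rank_ratio=1.0)
regression_only = survival_svm_objective([0.0], 0.0, X, time, event, rank_ratio=0.0)
```

With `w = [0.0]`, each of the three comparable pairs contributes a hinge loss of 1, so the ranking-only objective depends solely on `alpha` and the number of pairs. Increasing `alpha` scales the loss relative to the fixed penalty term, which is why a larger `alpha` means weaker regularization.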

See the User Guide and [1] for further description.

Parameters
• alpha (float, positive, default: 1) – Weight of the squared hinge loss term in the objective function.

• rank_ratio (float, optional, default: 1.0) – Mixing parameter between regression and ranking objective with 0 <= rank_ratio <= 1. If rank_ratio = 1, only ranking is performed; if rank_ratio = 0, only regression is performed. A value less than 1, which involves the regression objective, is only allowed if optimizer is one of ‘avltree’, ‘rbtree’, or ‘direct-count’.

• fit_intercept (boolean, optional, default: False) – Whether to calculate an intercept for the regression model. If set to False, no intercept will be calculated. Has no effect if rank_ratio = 1, i.e., only ranking is performed.

• max_iter (int, optional, default: 20) – Maximum number of iterations to perform in Newton optimization

• verbose (bool, optional, default: False) – Whether to print messages during optimization

• tol (float or None, optional, default: None) – Tolerance for termination. For detailed control, use solver-specific options.

• optimizer ({'avltree', 'direct-count', 'PRSVM', 'rbtree', 'simple'} or None, optional, default: None) – Which optimizer to use. If None, 'avltree' is used.

• random_state (int or numpy.random.RandomState instance, optional) – Random number generator (used to resolve ties in survival times).

• timeit (False, int or None, default: False) – If a non-zero value is provided, the time taken by the optimization is measured over the given number of repetitions. Results can be accessed from the optimizer_result_ attribute.

coef_

Coefficients of the features in the decision function.

Type

ndarray, shape = (n_features,)

optimizer_result_

Statistics returned by the optimizer. See scipy.optimize.OptimizeResult.

Type

scipy.optimize.OptimizeResult

n_features_in_

Number of features seen during fit.

Type

int

feature_names_in_

Names of features seen during fit. Defined only when X has feature names that are all strings.

Type

ndarray of shape (n_features_in_,)

n_iter_

Number of iterations run by the optimization routine to fit the model.

Type

int

See also

FastKernelSurvivalSVM

Fast implementation for arbitrary kernel functions.

References

[1]

Pölsterl, S., Navab, N., and Katouzian, A., “Fast Training of Support Vector Machines for Survival Analysis”, Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2015, Porto, Portugal, Lecture Notes in Computer Science, vol. 9285, pp. 243-259 (2015)

__init__(alpha=1, *, rank_ratio=1.0, fit_intercept=False, max_iter=20, verbose=False, tol=None, optimizer=None, random_state=None, timeit=False)

Methods

• __init__([alpha, rank_ratio, fit_intercept, ...])

• fit(X, y) – Build a survival support vector machine model from training data.

• get_params([deep]) – Get parameters for this estimator.

• predict(X) – Rank samples according to survival times.

• score(X, y) – Returns the concordance index of the prediction.

• set_params(**params) – Set the parameters of this estimator.

fit(X, y)

Build a survival support vector machine model from training data.

Parameters
• X (array-like, shape = (n_samples, n_features)) – Data matrix.

• y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as the first field, and the time of event or time of censoring as the second field.

Return type

self
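The structured array expected for y can be built directly with NumPy. This is a minimal sketch with made-up toy values; the field names "event" and "time" are illustrative choices, since the description above only fixes the order of the fields.

```python
import numpy as np

# Hypothetical toy data: True marks an observed event, False a censored sample.
event = [True, False, True, True]
time = [5.0, 12.0, 3.5, 8.0]

# First field: binary event indicator; second field: time of event or censoring.
y = np.array(
    list(zip(event, time)),
    dtype=[("event", bool), ("time", float)],
)
```

scikit-survival also provides the helper `sksurv.util.Surv.from_arrays(event, time)`, which builds an array of this form.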

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

dict

predict(X)

Rank samples according to survival times.

Lower ranks indicate shorter survival, higher ranks longer survival.

Parameters

X (array-like, shape = (n_samples, n_features)) – The input samples.

Returns

y – Predicted ranks.

Return type

ndarray, shape = (n_samples,)

score(X, y)

Returns the concordance index of the prediction.

Parameters
• X (array-like, shape = (n_samples, n_features)) – Test samples.

• y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as the first field, and the time of event or time of censoring as the second field.

Returns

cindex – Estimated concordance index.

Return type

float
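For intuition about what the concordance index measures, here is a simplified pure-Python sketch of Harrell's concordance index. It is not the library's implementation: the function name is hypothetical, tied survival times are ignored, and it assumes the scores are risk scores, where a higher score should correspond to a shorter survival time. How the estimator maps its own predictions onto this convention is not covered here.

```python
def concordance_index(time, event, risk):
    """Fraction of comparable pairs that are ordered correctly by `risk`."""
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a pair is anchored on a sample with an observed event
        for j in range(n):
            if time[j] > time[i]:  # j outlived i, so i should carry higher risk
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: the third sample is censored.
time = [1.0, 2.0, 3.0, 4.0]
event = [True, True, False, True]

perfect = concordance_index(time, event, [4.0, 3.0, 2.0, 1.0])
reversed_order = concordance_index(time, event, [1.0, 2.0, 3.0, 4.0])
uninformative = concordance_index(time, event, [0.0, 0.0, 0.0, 0.0])
```

A value of 1 means every comparable pair is ordered correctly, 0.5 corresponds to uninformative scores, and 0 to a completely reversed ordering.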

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

estimator instance
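The `<component>__<parameter>` naming convention can be illustrated without any estimator at all. The following sketch uses a hypothetical helper, `split_nested_params`, to show how such keys separate into parameters of the object itself and parameters routed to a named component. It only mimics the naming scheme and is not the library's code.

```python
def split_nested_params(params):
    """Separate own parameters from '<component>__<parameter>' entries."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first double underscore only, so deeper nesting
        # ('a__b__c') is passed on to component 'a' as 'b__c'.
        name, delim, sub_key = key.partition("__")
        if delim:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


# 'svm' is a hypothetical step name in a nested object such as a Pipeline.
own, nested = split_nested_params(
    {"svm__alpha": 0.1, "svm__rank_ratio": 0.5, "verbose": True}
)
```

In a Pipeline whose step is named `svm`, a call such as `set_params(svm__alpha=0.1)` would follow the same pattern and update `alpha` on that step.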