sksurv.svm.FastSurvivalSVM#
- class sksurv.svm.FastSurvivalSVM(alpha=1, *, rank_ratio=1.0, fit_intercept=False, max_iter=20, verbose=False, tol=None, optimizer=None, random_state=None, timeit=False)[source]#
Efficient Training of linear Survival Support Vector Machine
Training data consists of n triplets \((\mathbf{x}_i, y_i, \delta_i)\), where \(\mathbf{x}_i\) is a d-dimensional feature vector, \(y_i > 0\) the survival time or time of censoring, and \(\delta_i \in \{0,1\}\) the binary event indicator. Using the training data, the objective is to minimize the following function:
\[ \begin{align}\begin{aligned} \arg \min_{\mathbf{w}, b} \frac{1}{2} \mathbf{w}^\top \mathbf{w} + \frac{\alpha}{2} \left[ r \sum_{i,j \in \mathcal{P}} \max(0, 1 - (\mathbf{w}^\top \mathbf{x}_i - \mathbf{w}^\top \mathbf{x}_j))^2 + (1 - r) \sum_{i=1}^n \left( \zeta_{\mathbf{w}, b} (y_i, \mathbf{x}_i, \delta_i) \right)^2 \right]\\\begin{split}\zeta_{\mathbf{w},b} (y_i, \mathbf{x}_i, \delta_i) = \begin{cases} \max(0, y_i - \mathbf{w}^\top \mathbf{x}_i - b) \quad \text{if $\delta_i = 0$,} \\ y_i - \mathbf{w}^\top \mathbf{x}_i - b \quad \text{if $\delta_i = 1$,} \\ \end{cases}\end{split}\\\mathcal{P} = \{ (i, j) \mid y_i > y_j \land \delta_j = 1 \}_{i,j=1,\dots,n}\end{aligned}\end{align} \]
The hyper-parameter \(\alpha > 0\) determines the amount of regularization to apply: a smaller value increases the amount of regularization and a larger value reduces it. The hyper-parameter \(r \in [0, 1]\) determines the trade-off between the ranking objective and the regression objective. If \(r = 1\), it reduces to the ranking objective, and if \(r = 0\), to the regression objective. If the regression objective is used, survival/censoring times are log-transformed and thus cannot be zero or negative.
See the User Guide and [1] for further description.
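To make the terms of the objective concrete, the following is a minimal NumPy sketch that evaluates it for a given \(\mathbf{w}\) and \(b\). It is not the library's implementation (which avoids enumerating all pairs); the function name `survival_svm_objective` and its argument names are hypothetical, and `y` is assumed to already hold the target used in the formula (for example, log-transformed times when the regression term is active).

```python
import numpy as np


def survival_svm_objective(w, b, X, y, event, alpha=1.0, rank_ratio=1.0):
    """Evaluate the objective above by brute force (O(n^2) memory and time).

    Hypothetical helper for illustration only.
    y     : target values y_i (already transformed if needed)
    event : boolean array, True where delta_i = 1
    """
    w = np.asarray(w, dtype=float)
    y = np.asarray(y, dtype=float)
    event = np.asarray(event, dtype=bool)
    pred = X @ w

    # Ranking term: pairs (i, j) in P, i.e. y_i > y_j and delta_j = 1.
    in_P = (y[:, None] > y[None, :]) & event[None, :]
    margins = 1.0 - (pred[:, None] - pred[None, :])
    rank_loss = np.sum(np.maximum(0.0, margins[in_P]) ** 2)

    # Regression term: one-sided residual for censored samples,
    # plain residual for samples with an observed event.
    resid = y - pred - b
    zeta = np.where(event, resid, np.maximum(0.0, resid))
    reg_loss = np.sum(zeta ** 2)

    penalty = rank_ratio * rank_loss + (1.0 - rank_ratio) * reg_loss
    return 0.5 * (w @ w) + 0.5 * alpha * penalty


# Tiny example: three samples, the second one is censored.
X_demo = np.array([[0.0], [1.0], [2.0]])
y_demo = np.array([1.0, 2.0, 3.0])
event_demo = np.array([True, False, True])
w_zero = np.zeros(1)

ranking_only = survival_svm_objective(w_zero, 0.0, X_demo, y_demo, event_demo, rank_ratio=1.0)
regression_only = survival_svm_objective(w_zero, 0.0, X_demo, y_demo, event_demo, rank_ratio=0.0)
```

With \(\mathbf{w} = \mathbf{0}\), every pair in \(\mathcal{P}\) contributes a squared hinge loss of 1, so the ranking-only value is simply half the number of comparable pairs times \(\alpha\).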
- Parameters:
alpha (float, positive, default: 1) – Weight of penalizing the squared hinge loss in the objective function
rank_ratio (float, optional, default: 1.0) – Mixing parameter between regression and ranking objective with 0 <= rank_ratio <= 1. If rank_ratio = 1, only ranking is performed, if rank_ratio = 0, only regression is performed. A non-zero value is only allowed if optimizer is one of ‘avltree’, ‘rbtree’, or ‘direct-count’.
fit_intercept (boolean, optional, default: False) – Whether to calculate an intercept for the regression model. If set to False, no intercept will be calculated. Has no effect if rank_ratio = 1, i.e., only ranking is performed.
max_iter (int, optional, default: 20) – Maximum number of iterations to perform in Newton optimization
verbose (bool, optional, default: False) – Whether to print messages during optimization
tol (float or None, optional, default: None) – Tolerance for termination. For detailed control, use solver-specific options.
optimizer ({'avltree', 'direct-count', 'PRSVM', 'rbtree', 'simple'}, optional, default: 'avltree') – Which optimizer to use.
random_state (int or numpy.random.RandomState instance, optional) – Random number generator (used to resolve ties in survival times).
timeit (False, int or None, default: False) – If a non-zero value is provided, the time it takes for optimization is measured. The given number of repetitions are performed. Results can be accessed from the optimizer_result_ attribute.
- coef_#
Coefficients of the features in the decision function.
- Type:
ndarray, shape = (n_features,)
- optimizer_result_#
Stats returned by the optimizer. See scipy.optimize.OptimizeResult.
- Type:
scipy.optimize.OptimizeResult
- n_features_in_#
Number of features seen during fit.
- Type:
int
- feature_names_in_#
Names of features seen during fit. Defined only when X has feature names that are all strings.
- Type:
ndarray of shape (n_features_in_,)
- n_iter_#
Number of iterations run by the optimization routine to fit the model.
- Type:
int
See also
FastKernelSurvivalSVM
Fast implementation for arbitrary kernel functions.
References
- __init__(alpha=1, *, rank_ratio=1.0, fit_intercept=False, max_iter=20, verbose=False, tol=None, optimizer=None, random_state=None, timeit=False)[source]#
Methods
__init__([alpha, rank_ratio, fit_intercept, ...])
fit(X, y) – Build a survival support vector machine model from training data.
get_metadata_routing() – Get metadata routing of this object.
get_params([deep]) – Get parameters for this estimator.
predict(X) – Rank samples according to survival times
score(X, y) – Returns the concordance index of the prediction.
set_params(**params) – Set the parameters of this estimator.
- fit(X, y)[source]#
Build a survival support vector machine model from training data.
- Parameters:
X (array-like, shape = (n_samples, n_features)) – Data matrix.
y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.
- Return type:
self
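The structured array expected for y can be assembled with plain NumPy. The sketch below is one way to do it; the field names `event` and `time` are assumptions for illustration, since the description above only fixes the order of the fields (event indicator first, time second).

```python
import numpy as np

# Hypothetical observations: event indicator and time of event/censoring.
event = np.array([True, False, True, True])
time = np.array([5.0, 12.0, 3.5, 8.0])

# First field: binary event indicator, second field: observed time.
y = np.empty(len(time), dtype=[("event", bool), ("time", float)])
y["event"] = event
y["time"] = time

print(y.dtype.names)
print(y.shape)
```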
- get_metadata_routing()#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest
- get_params(deep=True)#
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- predict(X)[source]#
Rank samples according to survival times
Lower ranks indicate shorter survival, higher ranks longer survival.
- Parameters:
X (array-like, shape = (n_samples, n_features)) – The input samples.
- Returns:
y – Predicted ranks.
- Return type:
ndarray, shape = (n_samples,)
- score(X, y)[source]#
Returns the concordance index of the prediction.
- Parameters:
X (array-like, shape = (n_samples, n_features)) – Test samples.
y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.
- Returns:
cindex – Estimated concordance index.
- Return type:
float
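For intuition about what a concordance index measures, here is a brute-force sketch in pure Python and NumPy. It is not the routine used by `score`; the function name `naive_concordance_index` is hypothetical, and it assumes that a larger predicted value means longer survival. Pairs with tied observed times are skipped, and tied predictions count as half concordant.

```python
import numpy as np


def naive_concordance_index(time, event, predicted):
    """Fraction of comparable pairs that are ordered correctly.

    A pair (i, j) is comparable if sample i had an event and time[i] < time[j].
    Assumes larger `predicted` values correspond to longer survival.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    predicted = np.asarray(predicted, dtype=float)

    concordant = 0.0
    comparable = 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a censored sample cannot be the earlier member of a pair
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if predicted[i] < predicted[j]:
                    concordant += 1.0
                elif predicted[i] == predicted[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


times = np.array([1.0, 2.0, 3.0, 4.0])
events = np.array([True, True, False, True])

perfect = naive_concordance_index(times, events, times)
reversed_order = naive_concordance_index(times, events, -times)
uninformative = naive_concordance_index(times, events, np.zeros(4))
```

A value of 1 means every comparable pair is ordered correctly, 0.5 corresponds to uninformative predictions, and 0 means every pair is ordered the wrong way round.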
- set_params(**params)#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance