sksurv.svm.FastSurvivalSVM
Efficient training of a linear survival support vector machine.
Training data consists of n triplets \((\mathbf{x}_i, y_i, \delta_i)\), where \(\mathbf{x}_i\) is a d-dimensional feature vector, \(y_i > 0\) the survival time or time of censoring, and \(\delta_i \in \{0,1\}\) the binary event indicator. Using the training data, the objective is to minimize the following function:
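Following the formulation in the referenced paper by Pölsterl et al., the objective can be written as shown here. This is a reconstruction under that assumption, with \(\mathcal{P}\) denoting the set of comparable pairs and \(\zeta_{\mathbf{w},b}\) the regression residual:

\[\arg \min_{\mathbf{w}, b} \frac{1}{2} \mathbf{w}^\top \mathbf{w} + \frac{\alpha}{2} \left[ r \sum_{(i,j) \in \mathcal{P}} \max(0, 1 - (\mathbf{w}^\top \mathbf{x}_i - \mathbf{w}^\top \mathbf{x}_j))^2 + (1 - r) \sum_{i=1}^n \left( \zeta_{\mathbf{w},b}(y_i, \mathbf{x}_i, \delta_i) \right)^2 \right]\]

\[\zeta_{\mathbf{w},b}(y_i, \mathbf{x}_i, \delta_i) = \begin{cases} \max(0, y_i - \mathbf{w}^\top \mathbf{x}_i - b) & \text{if } \delta_i = 0, \\ y_i - \mathbf{w}^\top \mathbf{x}_i - b & \text{if } \delta_i = 1, \end{cases}\]

\[\mathcal{P} = \{ (i, j) \mid y_i > y_j \land \delta_j = 1 \}_{i,j=1,\dots,n}\]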
The hyper-parameter \(\alpha > 0\) determines the amount of regularization to apply: a smaller value increases the amount of regularization and a higher value reduces it. The hyper-parameter \(r \in [0, 1]\) determines the trade-off between the ranking objective and the regression objective. If \(r = 1\), it reduces to the ranking objective, and if \(r = 0\), to the regression objective. If the regression objective is used, survival/censoring times are log-transformed and thus cannot be zero or negative.
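To make the roles of \(\alpha\) and \(r\) concrete, here is a minimal pure-Python sketch that evaluates such an objective for a fixed weight vector. It assumes the squared hinge formulation from the referenced paper; the function name `survival_svm_objective` and its arguments are hypothetical and not part of the library, and the explicit loop over all pairs is for illustration only.

```python
import math


def survival_svm_objective(w, b, X, times, events, alpha=1.0, r=1.0):
    """Evaluate a mixed ranking/regression objective for a linear model.

    Illustrative O(n^2) version (assumption: squared hinge formulation);
    it is not how the library's optimizers compute the loss.
    """
    preds = [sum(wk * xk for wk, xk in zip(w, x)) for x in X]
    n = len(X)

    # Ranking part: squared hinge loss over comparable pairs (i, j),
    # i.e. times[i] > times[j] and sample j experienced an event.
    rank_loss = 0.0
    if r > 0:
        for i in range(n):
            for j in range(n):
                if times[i] > times[j] and events[j]:
                    rank_loss += max(0.0, 1.0 - (preds[i] - preds[j])) ** 2

    # Regression part on log-transformed times; a censored sample is only
    # penalized when the prediction falls below its censoring time.
    reg_loss = 0.0
    if r < 1:
        for p, t, e in zip(preds, times, events):
            residual = math.log(t) - p - b
            if not e:
                residual = max(0.0, residual)
            reg_loss += residual ** 2

    penalty = 0.5 * sum(wk * wk for wk in w)
    return penalty + 0.5 * alpha * (r * rank_loss + (1.0 - r) * reg_loss)
```

Because \(\alpha\) multiplies the loss terms rather than the penalty on \(\mathbf{w}\), increasing it gives the data more influence relative to the regularizer, which matches the description above.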
See the User Guide and [1] for further description.
alpha (float, positive, default: 1) – Weight of the squared hinge loss penalty in the objective function.
rank_ratio (float, optional, default: 1.0) – Mixing parameter between regression and ranking objective with 0 <= rank_ratio <= 1. If rank_ratio = 1, only ranking is performed; if rank_ratio = 0, only regression is performed. A value smaller than 1 is only allowed if optimizer is one of ‘avltree’, ‘rbtree’, or ‘direct-count’.
fit_intercept (boolean, optional, default: False) – Whether to calculate an intercept for the regression model. If set to False, no intercept will be calculated. Has no effect if rank_ratio = 1, i.e., only ranking is performed.
max_iter (int, optional, default: 20) – Maximum number of iterations to perform in Newton optimization
verbose (bool, optional, default: False) – Whether to print messages during optimization
tol (float, optional) – Tolerance for termination. For detailed control, use solver-specific options.
optimizer ("avltree" | "direct-count" | "PRSVM" | "rbtree" | "simple", optional, default: avltree) – Which optimizer to use.
random_state (int or numpy.random.RandomState instance, optional) – Random number generator (used to resolve ties in survival times).
timeit (False or int) – If a non-zero value is provided, the time the optimization takes is measured over the given number of repetitions. Results can be accessed from the optimizer_result_ attribute.
Attributes
coef_ (ndarray, shape = (n_features,)) – Coefficients of the features in the decision function.
optimizer_result_ (scipy.optimize.optimize.OptimizeResult) – Stats returned by the optimizer.
See also
FastKernelSurvivalSVM
Fast implementation for arbitrary kernel functions.
References
[1] Pölsterl, S., Navab, N., and Katouzian, A., “Fast Training of Support Vector Machines for Survival Analysis”, Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2015, Porto, Portugal, Lecture Notes in Computer Science, vol. 9285, pp. 243-259 (2015)
__init__
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__([alpha, rank_ratio, fit_intercept, …]) – Initialize self.
fit(X, y) – Build a survival support vector machine model from training data.
predict(X) – Rank samples according to survival times.
score(X, y) – Returns the concordance index of the prediction.
fit(X, y)
X (array-like, shape = (n_samples, n_features)) – Data matrix.
y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.
Returns self.
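As a small illustration of the expected layout of y, the following sketch builds such a structured array with NumPy. The field names "event" and "time" and the values are assumptions chosen for this example; the description above only fixes the order of the two fields.

```python
import numpy as np

# Hypothetical observations: True = event observed, False = censored.
event = np.array([True, False, True, True])
time = np.array([5.0, 12.0, 3.5, 8.0])

# First field: binary event indicator; second field: time of event or censoring.
y = np.empty(len(time), dtype=[("event", bool), ("time", float)])
y["event"] = event
y["time"] = time
```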
predict(X)
Lower ranks indicate shorter survival, higher ranks longer survival.
X (array-like, shape = (n_samples, n_features)) – The input samples.
y (ndarray, shape = (n_samples,)) – Predicted ranks.
score(X, y)
X (array-like, shape = (n_samples, n_features)) – Test samples.
cindex (float) – Estimated concordance index.
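For intuition about what a concordance index measures, here is a simplified pure-Python sketch. It assumes that a higher predicted value means longer survival, as stated for predict, and it ignores pairs with tied survival times. The function `concordance_index` is a hypothetical helper, not the library's implementation, which may handle ties and edge cases differently.

```python
def concordance_index(times, events, predictions):
    """Fraction of comparable pairs that the predictions order correctly.

    A pair (i, j) is comparable when sample i has an observed event and
    times[i] < times[j]. Tied predictions count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored sample cannot be the earlier one in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if predictions[i] < predictions[j]:
                    concordant += 1.0
                elif predictions[i] == predictions[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1 means every comparable pair is ordered correctly, 0.5 corresponds to random ordering, and 0 to a fully reversed ordering.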