sksurv.svm.NaiveSurvivalSVM

class sksurv.svm.NaiveSurvivalSVM(penalty='l2', loss='squared_hinge', dual=False, tol=0.0001, alpha=1.0, verbose=0, random_state=None, max_iter=1000)

Naive version of linear Survival Support Vector Machine.

Uses a regular linear support vector classifier (liblinear). A new set of samples is created by taking the difference between any two feature vectors in the original data, so this version requires O(n_samples^2) space.
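The pairwise construction can be sketched in NumPy. This is an illustration only, not the library's internal code: the helper name `comparable_pair_differences` is hypothetical, and it assumes the pair set P from the formulation on this page (sample i has the longer time, and the event of sample j was observed).

```python
import numpy as np

def comparable_pair_differences(X, event, time):
    """Return x_i - x_j for every pair (i, j) with time[i] > time[j] and event[j] true.

    Hypothetical helper for illustration; not part of sksurv.
    """
    X = np.asarray(X, dtype=float)
    event = np.asarray(event, dtype=bool)
    time = np.asarray(time, dtype=float)
    # comparable[i, j] is True when sample i outlived sample j
    # and the event of sample j was actually observed (not censored).
    comparable = (time[:, None] > time[None, :]) & event[None, :]
    i_idx, j_idx = np.nonzero(comparable)
    return X[i_idx] - X[j_idx]

# Toy data: three samples, two features.
X = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
event = np.array([True, False, True])
time = np.array([5.0, 8.0, 2.0])

diffs = comparable_pair_differences(X, event, time)
print(diffs.shape)  # one row per comparable pair
```

With n samples the result can have up to n(n-1)/2 rows, which is where the quadratic memory cost comes from.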

See sksurv.svm.HingeLossSurvivalSVM for the kernel version of the naive survival SVM.

\[ \begin{aligned}
\min_{\mathbf{w}}\quad & \frac{1}{2} \lVert \mathbf{w} \rVert_2^2 + \gamma \sum_{(i, j) \in \mathcal{P}} \xi_{ij} \\
\text{subject to}\quad & \mathbf{w}^\top \mathbf{x}_i - \mathbf{w}^\top \mathbf{x}_j \geq 1 - \xi_{ij},\quad \forall (i, j) \in \mathcal{P}, \\
& \xi_{ij} \geq 0,\quad \forall (i, j) \in \mathcal{P}, \\
& \mathcal{P} = \{ (i, j) \mid y_i > y_j \land \delta_j = 1 \}_{i,j=1,\dots,n}.
\end{aligned} \]
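To make the formulation concrete, the sketch below evaluates the objective for a fixed weight vector, using one slack per comparable pair as the constraints indicate and setting each slack to its smallest feasible value, max(0, 1 - w^T(x_i - x_j)). The function name `naive_ssvm_objective` and its `squared` flag are hypothetical and exist only for this illustration; the estimator itself delegates the optimization to liblinear.

```python
import numpy as np

def naive_ssvm_objective(w, X, event, time, gamma=1.0, squared=False):
    """Evaluate 0.5 * ||w||^2 + gamma * sum of pairwise (squared) hinge losses.

    Hypothetical helper for illustration; not part of sksurv.
    """
    w = np.asarray(w, dtype=float)
    X = np.asarray(X, dtype=float)
    event = np.asarray(event, dtype=bool)
    time = np.asarray(time, dtype=float)
    # Pairs (i, j) in P: y_i > y_j and delta_j = 1.
    comparable = (time[:, None] > time[None, :]) & event[None, :]
    i_idx, j_idx = np.nonzero(comparable)
    # w^T x_i - w^T x_j for every comparable pair.
    margins = (X[i_idx] - X[j_idx]) @ w
    # Smallest slack that satisfies both constraints for each pair.
    slack = np.maximum(0.0, 1.0 - margins)
    if squared:
        slack = slack ** 2
    return 0.5 * (w @ w) + gamma * slack.sum()

# Toy data: one feature, the third sample is censored.
X = np.array([[3.0], [1.0], [2.0]])
time = np.array([3.0, 1.0, 2.0])
event = np.array([True, True, False])
w = np.array([0.5])

hinge_value = naive_ssvm_objective(w, X, event, time, gamma=1.0)
squared_value = naive_ssvm_objective(w, X, event, time, gamma=1.0, squared=True)
print(hinge_value, squared_value)
```

Here the comparable pairs are (0, 1) and (2, 1); only the second violates the margin, so it alone contributes to the loss term.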
Parameters:
alpha : float, positive, default: 1.0

Weight of the hinge or squared hinge loss term (selected by loss) in the objective function.

loss : string, ‘hinge’ or ‘squared_hinge’, default: ‘squared_hinge’

Specifies the loss function. ‘hinge’ is the standard SVM loss (used e.g. by the SVC class) while ‘squared_hinge’ is the square of the hinge loss.

penalty : ‘l1’ | ‘l2’, default: ‘l2’

Specifies the norm used in the penalization. The ‘l2’ penalty is the standard used in SVC. The ‘l1’ penalty leads to sparse coef_ vectors.

dual : bool, default: False

Select the algorithm to either solve the dual or primal optimization problem. Prefer dual=False when n_samples > n_features.

tol : float, optional, default: 1e-4

Tolerance for stopping criteria.

verbose : int, default: 0

Enable verbose output. Note that this setting takes advantage of a per-process runtime setting in liblinear that, if enabled, may not work properly in a multithreaded context.

random_state : int seed, RandomState instance, or None, default: None

The seed of the pseudo random number generator to use when shuffling the data.

max_iter : int, default: 1000

The maximum number of iterations to be run.

References

[1] Van Belle, V., Pelckmans, K., Suykens, J. A., & Van Huffel, S., “Support Vector Machines for Survival Analysis”, In Proc. of the 3rd Int. Conf. on Computational Intelligence in Medicine and Healthcare (CIMED), 1-8, 2007.
[2] Evers, L., Messow, C.M., “Sparse kernel methods for high-dimensional survival data”, Bioinformatics 24(14), 1632-8, 2008.
__init__(penalty='l2', loss='squared_hinge', dual=False, tol=0.0001, alpha=1.0, verbose=0, random_state=None, max_iter=1000)

Methods

__init__([penalty, loss, dual, tol, alpha, …])
fit(X, y[, sample_weight]) Build a survival support vector machine model from training data.
predict(X) Rank samples according to survival times.
score(X, y)
fit(X, y, sample_weight=None)

Build a survival support vector machine model from training data.

Parameters:
X : array-like, shape = (n_samples, n_features)

Data matrix.

y : structured array, shape = (n_samples,)

A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.

sample_weight : array-like, shape = (n_samples,), optional

Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.

Returns:
self
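As a rough sketch of the expected target format, a structured array with the event indicator as first field and the time as second field can be built with plain NumPy. The field names "event" and "time" are assumptions chosen for this example; the description above only fixes the order of the fields.

```python
import numpy as np

# Toy data: True means the event was observed, False means censored.
event = np.array([True, False, True])
time = np.array([5.0, 8.0, 2.0])

# First field: binary event indicator; second field: time of event or censoring.
# The field names are illustrative, only their order follows the documentation.
y = np.empty(len(event), dtype=[("event", bool), ("time", float)])
y["event"] = event
y["time"] = time

print(y.dtype.names, y.shape)
```

Recent scikit-survival releases also provide a helper, sksurv.util.Surv.from_arrays, that builds this kind of array; whether it is available depends on the installed version.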
predict(X)

Rank samples according to survival times.

Lower ranks indicate shorter survival, higher ranks longer survival.

Parameters:
X : array-like, shape = (n_samples, n_features)

The input samples.

Returns:
y : ndarray, shape = (n_samples,)

Predicted ranks.
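One way to use the returned ranks is to order samples from shortest to longest predicted survival. The values below are made up for illustration and are not actual output of predict.

```python
import numpy as np

# Hypothetical predicted ranks for four samples (not real model output).
predicted_ranks = np.array([2.5, 0.3, 1.7, 4.1])

# Lower rank means shorter predicted survival, so an ascending sort
# lists sample indices from shortest to longest predicted survival.
order = np.argsort(predicted_ranks)
print(order)
```

Only the relative ordering of the ranks is meaningful here; the description above does not state that they are survival times.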