sksurv.svm.HingeLossSurvivalSVM
- class sksurv.svm.HingeLossSurvivalSVM(alpha=1.0, *, solver='ecos', kernel='linear', gamma=None, degree=3, coef0=1, kernel_params=None, pairs='all', verbose=False, timeit=None, max_iter=None)
Naive implementation of kernel survival support vector machine.
This implementation creates a new set of samples by building the difference between any two feature vectors in the original data. This approach requires \(O(\text{n_samples}^4)\) space and \(O(\text{n_samples}^6 \cdot \text{n_features})\) time, making it computationally intensive for large datasets.
The optimization problem is formulated as:
\[
\begin{aligned}
\min_{\mathbf{w}}\quad & \frac{1}{2} \lVert \mathbf{w} \rVert_2^2 + \gamma \sum_{(i, j) \in \mathcal{P}} \xi_{ij} \\
\text{subject to}\quad & \mathbf{w}^\top \phi(\mathbf{x}_i) - \mathbf{w}^\top \phi(\mathbf{x}_j) \geq 1 - \xi_{ij},\quad \forall (i, j) \in \mathcal{P}, \\
& \xi_{ij} \geq 0,\quad \forall (i, j) \in \mathcal{P},
\end{aligned}
\]
\[ \mathcal{P} = \{ (i, j) \mid y_i > y_j \land \delta_j = 1 \}_{i,j=1,\dots,n}. \]
Here, the weight \(\gamma\) of the slack terms corresponds to the alpha parameter; it is unrelated to the gamma kernel parameter.
See [1], [2], [3] for further description.
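The constraint set \(\mathcal{P}\) can be enumerated directly from its definition. The following is a minimal pure-Python sketch, not the library's implementation; the helper name `comparable_pairs` and the toy data are assumptions made for illustration.

```python
def comparable_pairs(time, event):
    """Return all (i, j) with time[i] > time[j] and event[j] true.

    Mirrors the definition of P: sample j had an observed event
    before sample i's observed time, so i should outlive j.
    """
    n = len(time)
    return [
        (i, j)
        for i in range(n)
        for j in range(n)
        if time[i] > time[j] and event[j]
    ]


# Toy data (hypothetical): observed times and event indicators.
time = [5.0, 3.0, 8.0, 1.0]
event = [True, False, True, True]

pairs = comparable_pairs(time, event)
print(pairs)
```

Each pair in the result corresponds to one constraint of the optimization problem, which is why the number of constraints can grow quadratically with the number of samples when all comparable pairs are used.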
- Parameters:
alpha (float, optional, default: 1) – Weight of penalizing the hinge loss in the objective function. Must be greater than 0.
solver ({'ecos', 'osqp'}, optional, default: 'ecos') – Which quadratic program solver to use.
kernel (str or callable, optional, default: 'linear') – Kernel mapping used internally. This parameter is passed directly to sklearn.metrics.pairwise.pairwise_kernels(). If kernel is a string, it must be one of the metrics in sklearn.metrics.pairwise.PAIRWISE_KERNEL_FUNCTIONS or “precomputed”. If kernel is “precomputed”, X is assumed to be a kernel matrix. Alternatively, if kernel is a callable function, it is called on each pair of instances (rows) and the resulting value recorded. The callable should take two rows from X as input and return the corresponding kernel value as a single number. This means that callables from sklearn.metrics.pairwise are not allowed, as they operate on matrices, not single samples. Use the string identifying the kernel instead.
gamma (float or None, optional, default: None) – Gamma parameter for the RBF, laplacian, polynomial, exponential chi2 and sigmoid kernels. Interpretation of the default value is left to the kernel; see the documentation for sklearn.metrics.pairwise. Ignored by other kernels.
degree (int, optional, default: 3) – Degree of the polynomial kernel. Ignored by other kernels.
coef0 (float, optional, default: 1) – Zero coefficient for polynomial and sigmoid kernels. Ignored by other kernels.
kernel_params (dict or None, optional, default: None) – Additional parameters (keyword arguments) for kernel function passed as callable object.
pairs ({'all', 'nearest', 'next'}, optional, default: 'all') –
Which constraints to use in the optimization problem.
all: Use all comparable pairs. Scales quadratically in number of samples.
nearest: Only considers comparable pairs \((i, j)\) where \(j\) is the uncensored sample with highest survival time smaller than \(y_i\). Scales linearly in number of samples (cf. sksurv.svm.MinlipSurvivalAnalysis).
next: Only compare against direct nearest neighbor according to observed time, disregarding its censoring status. Scales linearly in number of samples.
verbose (bool, optional, default: False) – If True, enable verbose output of the solver.
timeit (bool, int, or None, optional, default: None) – If True or a non-zero integer, the time taken for optimization is measured. If an integer is provided, the optimization is repeated that many times. Results can be accessed from the timings_ attribute.
max_iter (int or None, optional, default: None) – The maximum number of iterations taken for the solvers to converge. If None, use solver’s default value.
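As a rough illustration of the callable contract described for kernel, the sketch below defines a function that takes two single rows and returns one number, then evaluates it on every pair of rows to form a kernel matrix. The names `rbf_pair` and `gram_matrix` and the bandwidth value are hypothetical; the estimator itself delegates this step to sklearn.metrics.pairwise.pairwise_kernels().

```python
import math


def rbf_pair(x, z, gamma=0.5):
    """Kernel on two single samples: exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def gram_matrix(X, kernel):
    """Call `kernel` on each pair of rows and record the value."""
    return [[kernel(x, z) for z in X] for x in X]


# Three toy samples with two features each (made-up values).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gram_matrix(X, rbf_pair)
```

A function with this two-row signature is the kind of object the kernel parameter accepts as a callable; extra keyword arguments such as `gamma` here would then be supplied through kernel_params.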
- X_fit_
Training data.
- Type:
ndarray, shape = (n_samples, n_features_in_)
- coef_
Coefficients of the features in the decision function.
- Type:
ndarray, shape = (n_samples,), dtype = float
- n_features_in_
Number of features seen during fit.
- Type:
int
- feature_names_in_
Names of features seen during fit. Defined only when X has feature names that are all strings.
- Type:
ndarray, shape = (n_features_in_,), dtype = object
- n_iter_
Number of iterations run by the optimization routine to fit the model.
- Type:
int
See also
sksurv.svm.NaiveSurvivalSVM
The linear naive survival SVM based on liblinear.
References
- __init__(alpha=1.0, *, solver='ecos', kernel='linear', gamma=None, degree=3, coef0=1, kernel_params=None, pairs='all', verbose=False, timeit=None, max_iter=None)
Methods
__init__([alpha, solver, kernel, gamma, ...])
fit(X, y) – Build a MINLIP survival model from training data.
get_metadata_routing() – Get metadata routing of this object.
get_params([deep]) – Get parameters for this estimator.
predict(X) – Predict risk score of experiencing an event.
score(X, y) – Returns the concordance index of the prediction.
set_params(**params) – Set the parameters of this estimator.
- fit(X, y)
Build a MINLIP survival model from training data.
- Parameters:
X (array-like, shape = (n_samples, n_features)) – Data matrix.
y (structured array, shape = (n_samples,)) – A structured array with two fields. The first field is a boolean where True indicates an event and False indicates right-censoring. The second field is a float with the time of event or time of censoring.
- Return type:
self
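A minimal sketch of the expected layout of y, built with plain NumPy. The field names `event` and `time` and the values are assumptions for illustration; the description above only fixes the order of the fields (boolean event indicator first, time second).

```python
import numpy as np

# First field: True = event observed, False = right-censored.
# Second field: time of event or time of censoring.
y = np.array(
    [(True, 5.0), (False, 3.0), (True, 8.0), (True, 1.0)],
    dtype=[("event", bool), ("time", float)],
)

print(y.dtype.names)      # field order: event indicator, then time
print(y["event"].sum())   # number of observed events
```

An array of this form, together with a data matrix X that has one row per entry of y, matches the argument shapes listed for fit.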
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- predict(X)
Predict risk score of experiencing an event.
Higher values indicate an increased risk of experiencing an event, lower values a decreased risk of experiencing an event. The scores have no unit and are only meaningful to rank samples by their risk of experiencing an event.
- Parameters:
X (array-like, shape = (n_samples, n_features)) – The input samples.
- Returns:
y – Predicted risk.
- Return type:
ndarray, shape = (n_samples,)
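Because the scores are only meaningful for ranking, a typical use is to order samples by predicted risk. The sketch below uses made-up scores in place of real predict output and shows that a strictly increasing transformation leaves the ranking unchanged.

```python
# Hypothetical risk scores, standing in for the output of predict(X).
risk = [0.3, -1.2, 2.5, 0.9]

# Indices ordered from highest to lowest predicted risk.
ranking = sorted(range(len(risk)), key=lambda i: risk[i], reverse=True)

# Rescaling by a strictly increasing map does not change the order.
rescaled = [10.0 * r + 7.0 for r in risk]
ranking_rescaled = sorted(
    range(len(rescaled)), key=lambda i: rescaled[i], reverse=True
)

print(ranking)
```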
- score(X, y)
Returns the concordance index of the prediction.
- Parameters:
X (array-like, shape = (n_samples, n_features)) – Test samples.
y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.
- Returns:
cindex – Estimated concordance index.
- Return type:
float
See also
sksurv.metrics.concordance_index_censored
Computes the concordance index.
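For intuition, a simplified concordance index can be computed by hand: among comparable pairs, count how often the sample with the earlier observed event received the higher risk score. This sketch is a simplification under stated assumptions (ties in risk score count as half, tied times are not treated specially) and is not the implementation in sksurv.metrics.concordance_index_censored; the function name `simple_cindex` and the data are hypothetical.

```python
def simple_cindex(time, event, risk):
    """Fraction of comparable pairs that are ranked concordantly.

    Assumes at least one comparable pair exists.
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            # Pair is comparable if j had an event before time[i].
            if time[i] > time[j] and event[j]:
                comparable += 1
                if risk[j] > risk[i]:
                    concordant += 1.0   # earlier event, higher risk
                elif risk[j] == risk[i]:
                    concordant += 0.5   # tied scores count half
    return concordant / comparable


time = [5.0, 3.0, 8.0, 1.0]
event = [True, False, True, True]
risk = [0.9, 3.0, -1.2, 2.5]

cindex = simple_cindex(time, event, risk)
print(cindex)
```

In this toy data, three of the four comparable pairs are ordered correctly; the censored sample with the highest score is ranked above a sample whose event occurred earlier, which accounts for the one discordant pair.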
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance