sksurv.meta.EnsembleSelection#
- class sksurv.meta.EnsembleSelection(base_estimators, scorer=None, n_estimators=0.2, min_score=0.2, correlation='pearson', min_correlation=0.6, cv=None, n_jobs=1, verbose=0)[source]#
Ensemble selection for survival analysis that accounts for a score and correlations between predictions.
During training, the ensemble is pruned according to the specified score only (accuracy); for prediction, it is additionally pruned according to the correlation between predictions (diversity).
Hill climbing is based on cross-validation, which avoids the need for a separate validation set.
See [1], [2], [3] for further description.
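The two pruning stages described above can be illustrated with a small standard-library sketch. It is not the library's implementation: the helper names, the greedy correlation filter, the model names, and all numbers are assumptions made up for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def prune_by_score(scores, n_keep, min_score):
    """Stage 1 (accuracy): keep the n_keep best models scoring above min_score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [name for name in ranked[:n_keep] if scores[name] > min_score]


def prune_by_correlation(names, predictions, min_correlation):
    """Stage 2 (diversity): greedily skip models whose predictions correlate
    at or above min_correlation with an already kept model (assumed rule)."""
    kept = []
    for name in names:
        if all(pearson(predictions[name], predictions[k]) < min_correlation
               for k in kept):
            kept.append(name)
    return kept


# Hypothetical cross-validation scores and predictions of four models.
scores = {"cox": 0.72, "forest": 0.70, "svm": 0.55, "tree": 0.15}
predictions = {
    "cox": [1.0, 2.0, 3.0, 4.0],
    "forest": [1.1, 2.1, 2.9, 4.2],  # almost the same as "cox"
    "svm": [4.0, 1.0, 3.0, 2.0],     # clearly different from "cox"
}

by_score = prune_by_score(scores, n_keep=3, min_score=0.2)
diverse = prune_by_correlation(by_score, predictions, min_correlation=0.6)
print(by_score, diverse)
```

In this toy setup the score stage removes the weakest model, and the correlation stage then removes the model that adds little diversity.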
- Parameters
base_estimators (list) – List of (name, estimator) tuples (implementing fit/predict) that are part of the ensemble.
scorer (callable) – Function with signature
func(estimator, X_test, y_test, **test_predict_params)
that evaluates the quality of the prediction on the test data. The function should return a scalar value. Larger values of the score are assumed to be better.
n_estimators (float or int, optional, default: 0.2) – If a float, the fraction of estimators in the ensemble to retain; if an int, the absolute number of estimators to retain.
min_score (float, optional, default: 0.2) – Threshold for pruning estimators based on the scoring metric. After fit, only estimators with a score above min_score are retained.
min_correlation (float, optional, default: 0.6) – Threshold for Pearson’s correlation coefficient that determines when predictions of two estimators are significantly correlated.
cv (int, a cv generator instance, or None, optional) – Specifies which cv generator to use. If an integer, it is the number of folds in a KFold; if None, 3-fold cross-validation is used; any other object is used as the cv generator directly. The generator has to ensure that each sample is used only once for testing.
n_jobs (int, optional, default: 1) – Number of jobs to run in parallel.
verbose (integer) – Controls the verbosity: the higher, the more messages.
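A callable matching the documented scorer signature might look like the following minimal sketch. It assumes each base estimator has a score method that returns a larger-is-better value; ConstantScoreModel is a hypothetical stand-in used only to make the example runnable.

```python
def estimator_score(estimator, X_test, y_test, **test_predict_params):
    """Scorer that delegates to the estimator's own score method.

    Extra keyword arguments are accepted to match the documented
    signature but are ignored in this sketch.
    """
    return float(estimator.score(X_test, y_test))


class ConstantScoreModel:
    """Hypothetical stand-in for a fitted base estimator."""

    def __init__(self, value):
        self.value = value

    def score(self, X, y):
        # A real estimator would compute a metric from X and y.
        return self.value


result = estimator_score(ConstantScoreModel(0.7), X_test=[[0.0]], y_test=[(True, 1.0)])
print(result)
```

Such a function would be passed as scorer=estimator_score when constructing the ensemble.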
- scores_#
Array of scores (relative to best performing estimator)
- Type
ndarray, shape = (n_base_estimators,)
- fitted_models_#
Selected models during training based on scorer.
- Type
ndarray
- n_features_in_#
Number of features seen during fit.
- Type
int
- feature_names_in_#
Names of features seen during fit. Defined only when X has feature names that are all strings.
- Type
ndarray of shape (n_features_in_,)
References
- 1
Pölsterl, S., Gupta, P., Wang, L., Conjeti, S., Katouzian, A., and Navab, N., “Heterogeneous ensembles for predicting survival of metastatic, castrate-resistant prostate cancer patients”. F1000Research, vol. 5, no. 2676, 2016
- 2
Caruana, R., Munson, A., Niculescu-Mizil, A. “Getting the most out of ensemble selection”. 6th IEEE International Conference on Data Mining, 828-833, 2006
- 3
Rooney, N., Patterson, D., Anand, S., Tsymbal, A. “Dynamic integration of regression models”. International Workshop on Multiple Classifier Systems, Lecture Notes in Computer Science, vol. 3181, 164-173, 2004
- __init__(base_estimators, scorer=None, n_estimators=0.2, min_score=0.2, correlation='pearson', min_correlation=0.6, cv=None, n_jobs=1, verbose=0)[source]#
Methods
__init__(base_estimators[, scorer, ...])
fit(X[, y]) – Fit ensemble of models
get_params([deep]) – Get the parameters of an estimator from the ensemble.
predict(X) – Perform prediction.
predict_log_proba(X) – Perform prediction.
predict_proba(X) – Perform prediction.
score(X, y) – Returns the concordance index of the prediction.
set_params(**params) – Set the parameters of an estimator from the ensemble.
Attributes
steps
- fit(X, y=None, **fit_params)[source]#
Fit ensemble of models
- Parameters
X (array-like, shape = (n_samples, n_features)) – Training data.
y (array-like, optional) – Target data if base estimators are supervised.
- Return type
self
- get_params(deep=True)[source]#
Get the parameters of an estimator from the ensemble.
Returns the parameters given in the constructor as well as the estimators contained within the estimators parameter.
- Parameters
deep (bool, default=True) – Setting it to True gets the various estimators and the parameters of the estimators as well.
- Returns
params – Parameter and estimator names mapped to their values or parameter names mapped to their values.
- Return type
dict
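The shape of the returned dictionary can be sketched as follows, assuming the scikit-learn convention of prefixing nested parameters with the estimator name and a double underscore. collect_params and DummyEstimator are hypothetical names, not part of the library.

```python
class DummyEstimator:
    """Hypothetical estimator exposing a scikit-learn style get_params."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_params(self, deep=True):
        return {"alpha": self.alpha}


def collect_params(own_params, named_estimators, deep=True):
    """Merge constructor parameters with those of the named estimators."""
    params = dict(own_params)
    if deep:
        for name, estimator in named_estimators:
            params[name] = estimator  # estimator name mapped to the object
            for key, value in estimator.get_params(deep=True).items():
                params[f"{name}__{key}"] = value  # nested parameter
    return params


model = DummyEstimator(alpha=0.5)
deep_params = collect_params({"n_estimators": 0.2}, [("a", model)])
shallow_params = collect_params({"n_estimators": 0.2}, [("a", model)], deep=False)
print(sorted(deep_params), sorted(shallow_params))
```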
- predict(X)#
Perform prediction.
Only available if the meta estimator has a predict method.
- Parameters
X (array-like, shape = (n_samples, n_features)) – Data with samples to predict.
- Returns
prediction – Prediction of meta estimator that combines predictions of base estimators. n_dim depends on the return value of meta estimator’s predict method.
- Return type
array, shape = (n_samples, n_dim)
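How a meta estimator combines base predictions is not specified here. As a minimal sketch, assuming it simply averages the per-sample predictions of the base estimators, the combination step could look like this (combine_by_mean is a hypothetical helper):

```python
def combine_by_mean(base_predictions):
    """Average per-sample predictions across base estimators.

    base_predictions holds one list of per-sample predictions
    for each base estimator.
    """
    n_models = len(base_predictions)
    return [sum(column) / n_models for column in zip(*base_predictions)]


# Two hypothetical base estimators predicting for three samples.
combined = combine_by_mean([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
print(combined)
```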
- predict_log_proba(X)#
Perform prediction.
Only available if the meta estimator has a predict_log_proba method.
- Parameters
X (array-like, shape = (n_samples, n_features)) – Data with samples to predict.
- Returns
prediction – Prediction of meta estimator that combines predictions of base estimators. n_dim depends on the return value of meta estimator’s predict_log_proba method.
- Return type
ndarray, shape = (n_samples, n_dim)
- predict_proba(X)#
Perform prediction.
Only available if the meta estimator has a predict_proba method.
- Parameters
X (array-like, shape = (n_samples, n_features)) – Data with samples to predict.
- Returns
prediction – Prediction of meta estimator that combines predictions of base estimators. n_dim depends on the return value of meta estimator’s predict_proba method.
- Return type
ndarray, shape = (n_samples, n_dim)
- score(X, y)[source]#
Returns the concordance index of the prediction.
- Parameters
X (array-like, shape = (n_samples, n_features)) – Test samples.
y (structured array, shape = (n_samples,)) – A structured array containing the binary event indicator as first field, and time of event or time of censoring as second field.
- Returns
cindex – Estimated concordance index.
- Return type
float
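For intuition, the concordance index can be computed by hand as in the simplified sketch below. It uses a plain list of (event, time) pairs as a stand-in for the structured array, skips pairs with tied times, and is not the library's implementation.

```python
def concordance_index(y, risk):
    """Simplified Harrell's concordance index.

    y is a list of (event, time) pairs; risk holds the predicted
    risk scores, where a higher risk should mean an earlier event.
    """
    concordant = 0.0
    comparable = 0
    for i, (event_i, time_i) in enumerate(y):
        if not event_i:
            continue  # a censored sample cannot be the earlier member of a pair
        for j, (_, time_j) in enumerate(y):
            if time_i < time_j:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up data: the second sample is censored at time 3.
y = [(True, 2.0), (False, 3.0), (True, 5.0), (True, 7.0)]
perfect = concordance_index(y, [4.0, 3.0, 2.0, 1.0])
reversed_order = concordance_index(y, [1.0, 2.0, 3.0, 4.0])
print(perfect, reversed_order)
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 corresponds to random ordering, and 0.0 means every pair is ordered incorrectly.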
- set_params(**params)[source]#
Set the parameters of an estimator from the ensemble.
Valid parameter keys can be listed with get_params(). Note that you can directly set the parameters of the estimators contained in estimators.
- Parameters
**params (keyword arguments) – Specific parameters using e.g. set_params(parameter_name=new_value). In addition to setting the parameters of the estimator, the individual estimators contained in estimators can also be replaced, or removed by setting them to ‘drop’.
- Returns
self – Estimator instance.
- Return type
object