This release adds support for scikit-learn 0.24 and Python 3.9. scikit-survival now requires at least pandas 0.25 and scikit-learn 0.24. Moreover, if sksurv.ensemble.GradientBoostingSurvivalAnalysis or sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis is fit with loss='coxph', predict_cumulative_hazard_function and predict_survival_function are now available. sksurv.metrics.cumulative_dynamic_auc() now supports evaluating time-dependent predictions, for instance for a sksurv.ensemble.RandomSurvivalForest, as illustrated in the User Guide.
Allow passing pandas data frames to all fit and predict methods (#148).
Allow sparse matrices to be passed to sksurv.ensemble.GradientBoostingSurvivalAnalysis.predict().
Fix example in user guide using GridSearchCV to determine alphas for CoxnetSurvivalAnalysis (#186).
Add score method to sksurv.meta.Stacking, sksurv.meta.EnsembleSelection, and sksurv.meta.EnsembleSelectionRegressor (#151).
Add support for predict_cumulative_hazard_function and predict_survival_function to sksurv.ensemble.GradientBoostingSurvivalAnalysis and sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis if the model was fit with loss='coxph'.
Add support for time-dependent predictions to sksurv.metrics.cumulative_dynamic_auc(). See the User Guide for an example (#134).
The score method of sksurv.linear_model.IPCRidge, sksurv.svm.FastSurvivalSVM, and sksurv.svm.FastKernelSurvivalSVM (if rank_ratio is smaller than 1) now converts predictions on log(time) scale to risk scores prior to computing the concordance index.
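The effect of that conversion can be illustrated with a small, self-contained sketch. It is not the library's implementation: the `concordance_index` helper below is a simplified Harrell's c-index written for this example, the data are made up, and negation is assumed as the transform from predicted log(time) to a risk score (the note does not state which transform is used).

```python
from itertools import combinations


def concordance_index(event, time, risk):
    """Simplified Harrell's c-index for right-censored data.

    A pair is comparable if the sample with the shorter time had an event.
    It is concordant if that sample also has the higher risk score.
    Pairs with tied times are skipped; tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(time)), 2):
        if time[i] == time[j]:
            continue
        first, second = (i, j) if time[i] < time[j] else (j, i)
        if not event[first]:
            continue  # shorter time is censored, so the order is unknown
        comparable += 1
        if risk[first] > risk[second]:
            concordant += 1.0
        elif risk[first] == risk[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


event = [True, True, False, True, True]
time = [2.0, 5.0, 6.0, 8.0, 11.0]
# Hypothetical predictions on the log(time) scale: larger means longer survival.
pred_log_time = [0.9, 2.0, 2.1, 1.9, 2.5]

# Assumed conversion: negate, so that larger means higher risk.
risk = [-p for p in pred_log_time]

c_converted = concordance_index(event, time, risk)        # 0.875
c_unconverted = concordance_index(event, time, pred_log_time)  # 0.125
```

Without tied predictions, scoring the unconverted values yields one minus the concordance index of the converted ones, which is why the direction of the prediction scale matters for score.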
Support for the cvxpy and cvxopt solvers in sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM has been dropped. The default solver is now ECOS, which cvxpy (the previous default) used internally. Therefore, results should be identical.
Dropped the presort argument from sksurv.tree.SurvivalTree and sksurv.ensemble.GradientBoostingSurvivalAnalysis.
The X_idx_sorted argument in sksurv.tree.SurvivalTree.fit() has been deprecated in scikit-learn 0.24 and has no effect now.
predict_cumulative_hazard_function and predict_survival_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree now return an array of sksurv.functions.StepFunction objects by default. Use return_array=True to get the old behavior.
Support for Python 3.6 has been dropped.
Increase minimum supported versions of dependencies. We now require:
Package       Minimum Version
Pandas        0.25.0
scikit-learn  0.24.0
This release features a complete overhaul of the documentation, with a new visual design and several interactive notebooks in the User Guide.
In addition, it includes important bug fixes. It fixes several bugs in sksurv.linear_model.CoxnetSurvivalAnalysis where predict, predict_survival_function, and predict_cumulative_hazard_function returned wrong values if features of the training data were not centered. Moreover, the score function of sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis and sksurv.ensemble.GradientBoostingSurvivalAnalysis will now correctly compute the concordance index if loss='ipcwls' or loss='squared'.
sksurv.column.standardize() modified data in-place. Data is now always copied.
sksurv.column.standardize() works with integer numpy arrays now.
sksurv.column.standardize() used biased standard deviation for numpy arrays (ddof=0), but unbiased standard deviation for pandas objects (ddof=1). It always uses ddof=1 now. Therefore, the output, if the input is a numpy array, will differ from that of previous versions.
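The size of that difference can be checked with the standard library alone. This sketch does not call sksurv.column.standardize(); it only contrasts the two standard deviation estimators on made-up integer data.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # integer input, as in the note above
mean = statistics.fmean(values)

# ddof=0: divide the sum of squared deviations by n.
std_ddof0 = statistics.pstdev(values)
# ddof=1: divide the sum of squared deviations by n - 1.
std_ddof1 = statistics.stdev(values)

z_ddof0 = [(v - mean) / std_ddof0 for v in values]
z_ddof1 = [(v - mean) / std_ddof1 for v in values]

# Every standardized value shrinks by the same factor sqrt((n - 1) / n).
shrink = std_ddof0 / std_ddof1
```

For n = 8 the factor is about 0.935, so the discrepancy is most visible for small samples and fades as n grows.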
Fixed sksurv.linear_model.CoxnetSurvivalAnalysis.predict_survival_function() and sksurv.linear_model.CoxnetSurvivalAnalysis.predict_cumulative_hazard_function(), which returned wrong values if features of training data were not already centered. This adds an offset_ attribute that accounts for non-centered data and is added to the predicted risk score. Therefore, the outputs of predict, predict_survival_function, and predict_cumulative_hazard_function will be different to previous versions for non-centered data (#139).
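Why an additive offset repairs predictions for non-centered data can be seen from the usual centering identity. The sketch below assumes coefficients that were estimated on mean-centered features; the feature means, coefficients, and the way the offset is derived are illustrative assumptions, not a description of how offset_ is computed internally.

```python
def linear_predictor(x, coef, offset=0.0):
    """Return the dot product of x and coef plus a constant offset."""
    return sum(xi * bi for xi, bi in zip(x, coef)) + offset


train_means = [50.0, 1.2]  # hypothetical feature means of the training data
coef = [0.04, -0.5]        # hypothetical coefficients fit on centered features

# Offset chosen so that x @ coef + offset == (x - mean) @ coef.
offset = -sum(m * b for m, b in zip(train_means, coef))

x_new = [62.0, 0.7]
x_centered = [xi - m for xi, m in zip(x_new, train_means)]

score_centered = linear_predictor(x_centered, coef)
score_with_offset = linear_predictor(x_new, coef, offset)
score_without_offset = linear_predictor(x_new, coef)
```

A constant shift does not change how samples are ranked, but it does change any quantity that depends on the absolute value of the risk score, such as a survival function obtained by combining the score with a baseline hazard.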
Rescale coefficients of sksurv.linear_model.CoxnetSurvivalAnalysis if normalize=True.
Fix score function of sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis and sksurv.ensemble.GradientBoostingSurvivalAnalysis if loss='ipcwls' or loss='squared' is used. Previously, it returned 1.0 - true_cindex.
Add sksurv.show_versions() that prints the version of all dependencies.
Add support for pandas 1.1.
Include interactive notebooks in documentation on readthedocs.
Add user guide on penalized Cox models.
Add user guide on gradient boosted models.
This release fixes warnings that were introduced with 0.13.0.
Explicitly pass return_array=True in sksurv.tree.SurvivalTree.predict() to avoid FutureWarning.
Fix error when fitting sksurv.tree.SurvivalTree with non-float dtype for time (#127).
Fix RuntimeWarning: invalid value encountered in true_divide in sksurv.nonparametric.kaplan_meier_estimator().
Fix PendingDeprecationWarning about use of matrix when fitting sksurv.svm.FastSurvivalSVM if optimizer is PRSVM or simple.
The highlights of this release include the addition of sksurv.metrics.brier_score() and sksurv.metrics.integrated_brier_score() and compatibility with scikit-learn 0.23.
predict_survival_function and predict_cumulative_hazard_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree can now return an array of sksurv.functions.StepFunction, similar to sksurv.linear_model.CoxPHSurvivalAnalysis by specifying return_array=False. This will be the default behavior starting with 0.14.0.
Note that this release fixes a bug in estimating inverse probability of censoring weights (IPCW), which will affect all estimators relying on IPCW.
Make build system compatible with PEP-517/518.
Added sksurv.metrics.brier_score() and sksurv.metrics.integrated_brier_score() (#101).
sksurv.functions.StepFunction can now be evaluated at multiple points in a single call.
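What evaluating a step function at several points in one call looks like can be sketched with a small stand-in class. `SimpleStepFunction` is hypothetical and written only for this illustration; its constructor arguments, right-continuity, and the error raised outside the domain are assumptions, not the behavior of sksurv.functions.StepFunction.

```python
from bisect import bisect_right


class SimpleStepFunction:
    """Hypothetical right-continuous step function defined on [x[0], x[-1]]."""

    def __init__(self, x, y):
        if len(x) != len(y):
            raise ValueError("x and y must have the same length")
        self.x = list(x)  # sorted change points
        self.y = list(y)  # value taken from each change point onwards

    def __call__(self, points):
        """Evaluate at a single number or at an iterable of numbers."""
        try:
            iterator = iter(points)
        except TypeError:
            return self._evaluate(points)
        return [self._evaluate(p) for p in iterator]

    def _evaluate(self, point):
        if point < self.x[0] or point > self.x[-1]:
            raise ValueError(f"{point} is outside [{self.x[0]}, {self.x[-1]}]")
        # Index of the last change point that is <= point.
        return self.y[bisect_right(self.x, point) - 1]


surv = SimpleStepFunction(x=[1.0, 3.0, 7.0], y=[0.9, 0.6, 0.2])
single = surv(3.0)                # 0.6
several = surv([1.0, 2.5, 7.0])   # [0.9, 0.9, 0.2]
```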
Update documentation on usage of predict_survival_function and predict_cumulative_hazard_function (#118).
The default value of alpha_min_ratio of sksurv.linear_model.CoxnetSurvivalAnalysis will now depend on the n_samples/n_features ratio. If n_samples > n_features, the default value is 0.0001. If n_samples <= n_features, the default value is 0.01.
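The rule can be written down as a short function. `default_alpha_min_ratio` is a hypothetical helper that only restates the two cases from the note; it is not part of the library.

```python
def default_alpha_min_ratio(n_samples, n_features):
    """Restate the default described in the release note (hypothetical helper)."""
    if n_samples > n_features:
        return 0.0001
    return 0.01  # n_samples <= n_features


tall = default_alpha_min_ratio(n_samples=500, n_features=20)   # 0.0001
wide = default_alpha_min_ratio(n_samples=50, n_features=2000)  # 0.01
square = default_alpha_min_ratio(n_samples=100, n_features=100)  # 0.01
```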
Add support for scikit-learn 0.23 (#119).
predict_survival_function and predict_cumulative_hazard_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree will return an array of sksurv.functions.StepFunction in the future (as sksurv.linear_model.CoxPHSurvivalAnalysis does). For the old behavior, use return_array=True.
Fix deprecation of importing joblib via sklearn.
Fix estimation of censoring distribution for tied times with events. When estimating the censoring distribution, by specifying reverse=True when calling sksurv.nonparametric.kaplan_meier_estimator(), we now consider events to occur before censoring. At a time point shared by events and censored samples, the samples with an event are no longer considered at risk and are subtracted from the denominator of the Kaplan-Meier estimator. The change affects all functions relying on inverse probability of censoring weights, namely:
sksurv.nonparametric.CensoringDistributionEstimator
sksurv.nonparametric.ipc_weights()
sksurv.metrics.concordance_index_ipcw()
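The tie-handling rule described above can be illustrated with a minimal Kaplan-Meier estimator of the censoring distribution. This is a sketch under stated assumptions, not the library's code: the function name and the `events_first` switch are hypothetical and exist only to contrast the two conventions on made-up data.

```python
from collections import Counter


def censoring_survival(time, event, events_first=True):
    """Kaplan-Meier estimate of the censoring distribution.

    Censoring is treated as the outcome of interest. With events_first=True,
    samples with an event at time t leave the risk set before the censored
    samples at t are counted. Returns (unique_times, probabilities).
    """
    n_events = Counter(t for t, e in zip(time, event) if e)
    n_censored = Counter(t for t, e in zip(time, event) if not e)

    at_risk = len(time)
    prob = 1.0
    unique_times = sorted(set(time))
    probs = []
    for t in unique_times:
        d, c = n_events[t], n_censored[t]
        denominator = at_risk - d if events_first else at_risk
        if denominator > 0:  # denominator 0 implies c == 0, nothing to update
            prob *= 1.0 - c / denominator
        probs.append(prob)
        at_risk -= d + c
    return unique_times, probs


time = [1, 2, 2, 3]
event = [True, True, False, False]  # one event and one censoring tied at t=2

_, with_fix = censoring_survival(time, event, events_first=True)
_, without_fix = censoring_survival(time, event, events_first=False)
```

At t=2 the estimate is 1 - 1/2 with events removed first and 1 - 1/3 otherwise, so inverse probability of censoring weights derived from it change wherever such ties occur.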
Throw an exception when trying to estimate c-index from uncomparable data (#117).
Estimators in sksurv.svm will now throw an exception when trying to fit a model to data with uncomparable pairs.
This release adds support for scikit-learn 0.22, thereby dropping support for older versions. Moreover, the regularization strength of the ridge penalty in sksurv.linear_model.CoxPHSurvivalAnalysis can now be set per feature. If you want one or more features to enter the model unpenalized, set the corresponding penalty weights to zero. Finally, sklearn.pipeline.Pipeline will now be automatically patched to add support for predict_cumulative_hazard_function and predict_survival_function if the underlying estimator supports it.
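The idea of per-feature penalty weights can be sketched independently of the estimator. The `ridge_penalty` helper is hypothetical, and the factor of 0.5 is an assumed convention for this example; the note does not specify how the penalty is scaled.

```python
def ridge_penalty(coef, alpha):
    """Return 0.5 * sum_j alpha_j * coef_j**2 for scalar or per-feature alpha."""
    if isinstance(alpha, (int, float)):
        alpha = [alpha] * len(coef)
    if len(alpha) != len(coef):
        raise ValueError("alpha must have one entry per feature")
    return 0.5 * sum(a * b * b for a, b in zip(alpha, coef))


coef = [0.8, -1.5, 0.3]

# One regularization strength shared by all features.
uniform = ridge_penalty(coef, 2.0)
# Zero weight for the first feature: it enters the model unpenalized.
per_feature = ridge_penalty(coef, [0.0, 2.0, 2.0])
```

With a zero weight, the first coefficient contributes nothing to the penalty, so the optimizer is free to fit it as in an unregularized model.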
Add scikit-learn’s deprecation of presort in sksurv.tree.SurvivalTree and sksurv.ensemble.GradientBoostingSurvivalAnalysis.
Add warning that default alpha_min_ratio in sksurv.linear_model.CoxnetSurvivalAnalysis will depend on the ratio of the number of samples to the number of features in the future (#41).
Add references to API doc of sksurv.ensemble.GradientBoostingSurvivalAnalysis (#91).
Add support for pandas 1.0 (#100).
Add ccp_alpha parameter for Minimal Cost-Complexity Pruning to sksurv.ensemble.GradientBoostingSurvivalAnalysis.
Patch sklearn.pipeline.Pipeline to add support for predict_cumulative_hazard_function and predict_survival_function if the underlying estimator supports it.
Allow per-feature regularization for sksurv.linear_model.CoxPHSurvivalAnalysis (#102).
Clarify API docs of sksurv.metrics.concordance_index_censored() (#96).
This release adds sksurv.tree.SurvivalTree and sksurv.ensemble.RandomSurvivalForest, which are based on the log-rank split criterion. It also adds the OSQP solver as an option to sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM, which will replace the now deprecated cvxpy and cvxopt options in a future release.
This release removes support for sklearn 0.20 and requires sklearn 0.21.
The cvxpy and cvxopt options for solver in sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM are deprecated and will be removed in a future version. Choosing osqp is the preferred option now.
Add support for pandas 0.25.
Add OSQP solver option to sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM which has no additional dependencies.
Fix issue when using cvxpy 1.0.16 or later.
Explicitly specify utf-8 encoding when reading README.rst (#89).
Add sksurv.tree.SurvivalTree and sksurv.ensemble.RandomSurvivalForest (#90).
Exclude Cython-generated files from source distribution because they are not forward compatible.
This release adds the ties argument to sksurv.linear_model.CoxPHSurvivalAnalysis to choose between Breslow’s and Efron’s likelihood in the presence of tied event times. Moreover, sksurv.compare.compare_survival() has been added, which implements the log-rank hypothesis test for comparing the survival function of 2 or more groups.
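For readers unfamiliar with the log-rank test, the two-group case can be sketched in plain Python. This is a textbook formulation restricted to two groups and is not the code behind sksurv.compare.compare_survival(); the function name and the 0/1 group encoding are assumptions made for this example.

```python
import math


def logrank_two_groups(time, event, group):
    """Two-sample log-rank test; `group` holds 0/1 labels.

    Returns (chi-square statistic, p-value) with one degree of freedom.
    """
    observed = expected = variance = 0.0
    for t in sorted({ti for ti, ei in zip(time, event) if ei}):
        # Size of the risk set overall and in group 1 just before t.
        n = sum(1 for ti in time if ti >= t)
        n1 = sum(1 for ti, gi in zip(time, group) if ti >= t and gi == 1)
        # Number of events at t overall and in group 1.
        d = sum(1 for ti, ei in zip(time, event) if ti == t and ei)
        d1 = sum(1 for ti, ei, gi in zip(time, event, group)
                 if ti == t and ei and gi == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("variance is zero; the groups cannot be compared")
    statistic = (observed - expected) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


all_events = [True] * 6
groups = [1, 1, 1, 0, 0, 0]

# Identical survival times in both groups: no evidence of a difference.
stat_same, p_same = logrank_two_groups([1, 2, 3, 1, 2, 3], all_events, groups)
# Group 1 fails much earlier than group 0.
stat_diff, p_diff = logrank_two_groups([1, 2, 3, 10, 11, 12], all_events, groups)
```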
Update API doc of predict function of boosting estimators (#75).
Clarify documentation for GradientBoostingSurvivalAnalysis (#78).
Implement Efron’s likelihood for handling tied event times.
Implement log-rank test for comparing survival curves.
Add support for scipy 1.3.1 (#66).
Re-add baseline_survival_ and cum_baseline_hazard_ attributes to sksurv.linear_model.CoxPHSurvivalAnalysis (#76).
This release adds support for sklearn 0.21 and pandas 0.24.
Add reference to IPCRidge (#65).
Use scipy.special.comb instead of deprecated scipy.misc.comb.
Add support for pandas 0.24 and drop support for 0.20.
Add support for scikit-learn 0.21 and drop support for 0.20 (#71).
Explain use of intercept in ComponentwiseGradientBoostingSurvivalAnalysis (#68).
Bump Eigen to 3.3.7.
Disallow scipy 1.3.0 due to scipy regression (#66).
Add sksurv.linear_model.CoxnetSurvivalAnalysis.predict_survival_function() and sksurv.linear_model.CoxnetSurvivalAnalysis.predict_cumulative_hazard_function() (#46).
Add sksurv.nonparametric.SurvivalFunctionEstimator and sksurv.nonparametric.CensoringDistributionEstimator that wrap sksurv.nonparametric.kaplan_meier_estimator() and provide a predict_proba method for evaluating the estimated function on test data.
Implement censoring-adjusted C-statistic proposed by Uno et al. (2011) in sksurv.metrics.concordance_index_ipcw().
Add estimator of cumulative/dynamic AUC of Uno et al. (2007) in sksurv.metrics.cumulative_dynamic_auc().
Add flchain dataset (see sksurv.datasets.load_flchain()).
The tied_time return value of sksurv.metrics.concordance_index_censored() now correctly reflects the number of comparable pairs that share the same time and that are used in computing the concordance index.
Fix a bug in sksurv.metrics.concordance_index_censored() where a pair with risk estimates within tolerance was counted both as concordant and tied.
This release adds support for Python 3.7 and sklearn 0.20.
Changes:
Add support for sklearn 0.20 (#48).
Migrate to py.test (#50).
Explicitly request ECOS solver for sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM.
Add support for Python 3.7 (#49).
Add support for cvxpy >=1.0.
Add support for numpy 1.15.
This release adds support for numpy 1.14 and pandas up to 0.23. In addition, the new class sksurv.util.Surv makes it easier to construct a structured array from numpy arrays, lists, or a pandas data frame.
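To show what kind of boilerplate such a helper saves, here is how a structured array with an event indicator and a time can be assembled by hand with numpy. The field names "event" and "time" are assumptions chosen for this sketch, and the code does not use sksurv.util.Surv itself.

```python
import numpy as np

event = [True, False, True]   # True: event observed, False: censored
time = [12.0, 30.5, 7.0]      # observed time for each sample

# One record per sample, with a boolean field and a float field.
y = np.empty(len(time), dtype=[("event", bool), ("time", float)])
y["event"] = event
y["time"] = time

n_observed = int(y["event"].sum())       # number of samples with an event
longest_follow_up = float(y["time"].max())
```

Keeping both fields in one array means the outcome can be sliced, shuffled, and split together with the feature matrix without the indicator and the time getting out of sync.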
Support numpy 1.14 and pandas 0.22, 0.23 (#36).
Enable support for cvxopt with Python 3.5+ on Windows (requires cvxopt >=1.1.9).
Add max_iter parameter to sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM.
Fix score function of sksurv.svm.NaiveSurvivalSVM to use concordance index.
sksurv.linear_model.CoxnetSurvivalAnalysis now throws an exception if coefficients get too large (#47).
Add sksurv.util.Surv class to ease constructing a structured array (#26).
This release adds support for scikit-learn 0.19 and pandas 0.21. In turn, support for older versions is dropped, namely Python 3.4, scikit-learn 0.18, and pandas 0.18.
This release adds sksurv.linear_model.CoxnetSurvivalAnalysis, which implements an efficient algorithm to fit Cox’s proportional hazards model with LASSO, ridge, and elastic net penalty. Moreover, it includes support for Windows with Python 3.5 and later by making the cvxopt package optional.
This release adds sksurv.linear_model.CoxPHSurvivalAnalysis.predict_survival_function() and sksurv.linear_model.CoxPHSurvivalAnalysis.predict_cumulative_hazard_function(), which return the survival function and cumulative hazard function using Breslow’s estimator. Moreover, it fixes a build error on Windows (gh #3) and adds the sksurv.preprocessing.OneHotEncoder class, which can be used in a scikit-learn pipeline.
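As background on Breslow's estimator, the following sketch computes a cumulative baseline hazard from risk scores and turns it into a survival probability. It follows the textbook formulas H0(t) = sum over event times t_i <= t of d_i / sum_{j in R_i} exp(risk_j) and S(t | x) = exp(-H0(t) * exp(risk)); the function names and data are hypothetical, and this is not the library's implementation.

```python
import math


def breslow_baseline(time, event, risk_score):
    """Return (event_times, cumulative baseline hazard at those times)."""
    exp_risk = [math.exp(r) for r in risk_score]
    event_times, cum_hazard = [], []
    total = 0.0
    for t in sorted({ti for ti, ei in zip(time, event) if ei}):
        # Number of events at t and summed exp(risk) over samples still at risk.
        d = sum(1 for ti, ei in zip(time, event) if ti == t and ei)
        risk_set = sum(w for ti, w in zip(time, exp_risk) if ti >= t)
        total += d / risk_set
        event_times.append(t)
        cum_hazard.append(total)
    return event_times, cum_hazard


def survival_probability(t, event_times, cum_hazard, risk_score):
    """S(t | x) = exp(-H0(t) * exp(risk_score)) with a step-wise H0."""
    h0 = 0.0
    for et, h in zip(event_times, cum_hazard):
        if et <= t:
            h0 = h
    return math.exp(-h0 * math.exp(risk_score))


time = [1.0, 2.0, 2.5, 3.0]
event = [True, True, False, True]
train_risk = [0.0, 0.0, 0.0, 0.0]  # all-zero scores reduce H0 to Nelson-Aalen

event_times, cum_hazard = breslow_baseline(time, event, train_risk)
s_low = survival_probability(2.2, event_times, cum_hazard, risk_score=0.0)
s_high = survival_probability(2.2, event_times, cum_hazard, risk_score=1.0)
```

A higher risk score scales the cumulative hazard up, so `s_high` is smaller than `s_low` at every time after the first event.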
This release adds support for Python 3.6, and pandas 0.19 and 0.20.
This is the initial release of scikit-survival. It combines the implementation of survival support vector machines with the code used in the Prostate Cancer DREAM challenge.