Release Notes#
scikit-survival 0.20.0 (2023-03-05)#
This release adds support for scikit-learn 1.2 and drops support for previous versions.
Enhancements#
Raise more informative error messages when a parameter does not have a valid type/value (see sklearn#23462).
Add positive and random_state parameters to sksurv.linear_model.IPCRidge.
Documentation#
Update API docs based on scikit-learn 1.2 (where applicable).
Backwards incompatible changes#
To align with the scikit-learn API, many parameters of estimators must be provided with their names, as keyword arguments, instead of positional arguments.
Remove deprecated normalize parameter from sksurv.linear_model.IPCRidge.
Remove deprecated X_idx_sorted argument from sksurv.tree.SurvivalTree.fit().
Setting kernel="polynomial" in sksurv.svm.FastKernelSurvivalSVM, sksurv.svm.HingeLossSurvivalSVM, and sksurv.svm.MinlipSurvivalAnalysis has been replaced with kernel="poly".
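The keyword-only convention described above can be pictured with a small sketch. ToyEstimator is a hypothetical class invented for illustration and is not part of scikit-survival; it only shows how a bare `*` in a signature forces callers to name their arguments.

```python
class ToyEstimator:
    """Hypothetical estimator; NOT part of scikit-survival."""

    def __init__(self, *, alpha=1.0, kernel="poly"):
        # Every parameter after the bare `*` is keyword-only.
        self.alpha = alpha
        self.kernel = kernel


def try_positional():
    """Return the error message of a positional call, or None if it worked."""
    try:
        ToyEstimator(0.5)  # positional argument is rejected
    except TypeError as exc:
        return str(exc)
    return None


est = ToyEstimator(alpha=0.5, kernel="poly")  # passing by name works
positional_error = try_positional()
```

Code that previously passed such parameters positionally would need to switch to the named form.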
scikit-survival 0.19.0 (2022-10-23)#
This release adds sksurv.tree.SurvivalTree.apply() and sksurv.tree.SurvivalTree.decision_path(), and support for sparse matrices to sksurv.tree.SurvivalTree.
Moreover, it fixes build issues with scikit-learn 1.1.2 and on macOS with ARM64 CPU.
Bug fixes#
Fix build issue with scikit-learn 1.1.2, which is binary-incompatible with previous releases from the 1.1 series.
Fix build from source on macOS with ARM64 by specifying numpy 1.21.0 as install requirement for that platform (#313).
Enhancements#
sksurv.tree.SurvivalTree: Add sksurv.tree.SurvivalTree.apply() and sksurv.tree.SurvivalTree.decision_path() (#290).
sksurv.tree.SurvivalTree: Add support for sparse matrices (#290).
scikit-survival 0.18.0 (2022-08-15)#
This release adds support for scikit-learn 1.1, which includes more informative error messages. Support for Python 3.7 has been dropped, and the minimum supported versions of dependencies are updated to:
Package         Minimum Version
numpy           1.17.3
Pandas          1.0.5
scikit-learn    1.1.0
scipy           1.3.2
Enhancements#
Add n_iter_ attribute to all estimators in sksurv.svm (#277).
Add return_array argument to all models providing predict_survival_function and predict_cumulative_hazard_function (#268).
Deprecations#
The loss_ attribute of sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis and sksurv.ensemble.GradientBoostingSurvivalAnalysis has been deprecated.
The default for the max_features argument has been changed from 'auto' to 'sqrt' for sksurv.ensemble.RandomSurvivalForest and sksurv.ensemble.ExtraSurvivalTrees. 'auto' and 'sqrt' have the same effect.
scikit-survival 0.17.2 (2022-04-24)#
This release fixes several issues with packaging scikit-survival.
Bug fixes#
Added backward support for gcc-c++ (#255).
Do not install C/C++ and Cython source files.
Add packaging to build requirements in pyproject.toml.
Exclude generated API docs from source distribution.
Add Python 3.10 to classifiers.
Documentation#
Use permutation_importance from sklearn instead of eli5.
Build documentation with Sphinx 4.4.0.
Fix missing documentation for classes in sksurv.meta.
scikit-survival 0.17.1 (2022-03-05)#
This release adds support for Python 3.10.
scikit-survival 0.17.0 (2022-01-09)#
This release adds support for scikit-learn 1.0, which includes support for feature names. If you pass a pandas dataframe to fit, the estimator will set a feature_names_in_ attribute containing the feature names. When a dataframe is passed to predict, it is checked that the column names are consistent with those passed to fit. See the scikit-learn release highlights for details.
Bug fixes#
Fix a variety of build problems with LLVM (#243).
Enhancements#
Add support for feature_names_in_ and n_features_in_ to all estimators and transforms.
Add sksurv.preprocessing.OneHotEncoder.get_feature_names_out().
Update bundled version of Eigen to 3.3.9.
Backwards incompatible changes#
Drop min_impurity_split parameter from sksurv.ensemble.GradientBoostingSurvivalAnalysis.
base_estimators and meta_estimator attributes of sksurv.meta.Stacking do not contain fitted models anymore, use estimators_ and final_estimator_, respectively.
Deprecations#
The normalize parameter of sksurv.linear_model.IPCRidge is deprecated and will be removed in a future version. Instead, use a scikit-learn pipeline: make_pipeline(StandardScaler(with_mean=False), IPCRidge()).
scikit-survival 0.16.0 (2021-10-30)#
This release adds support for changing the evaluation metric that is used in estimators’ score method. This is particularly useful for hyper-parameter optimization using scikit-learn’s GridSearchCV. You can now use sksurv.metrics.as_concordance_index_ipcw_scorer, sksurv.metrics.as_cumulative_dynamic_auc_scorer, or sksurv.metrics.as_integrated_brier_score_scorer to adjust the score method to your needs. A detailed example is available in the User Guide.
Moreover, this release adds sksurv.ensemble.ExtraSurvivalTrees to fit an ensemble of randomized survival trees, and improves the speed of sksurv.compare.compare_survival() significantly.
The documentation has been extended by a section on the time-dependent Brier score.
Bug fixes#
Columns are dropped in sksurv.column.encode_categorical() despite allow_drop=False (#199).
Ensure sksurv.column.categorical_to_numeric() always returns series with int64 dtype.
Enhancements#
Add sksurv.ensemble.ExtraSurvivalTrees ensemble (#195).
Faster speed for sksurv.compare.compare_survival() (#215).
Add wrapper classes sksurv.metrics.as_concordance_index_ipcw_scorer, sksurv.metrics.as_cumulative_dynamic_auc_scorer, and sksurv.metrics.as_integrated_brier_score_scorer to override the default score method of estimators (#192).
Remove use of deprecated numpy dtypes.
Remove use of inplace in pandas’ set_categories.
Documentation#
Remove comments and code suggesting log-transforming times prior to training Survival SVM (#203).
Add documentation for max_samples parameter to sksurv.ensemble.ExtraSurvivalTrees and sksurv.ensemble.RandomSurvivalForest (#217).
Add section on time-dependent Brier score (#220).
Add section on using alternative metrics for hyper-parameter optimization.
scikit-survival 0.15.0 (2021-03-20)#
This release adds support for scikit-learn 0.24 and Python 3.9.
scikit-survival now requires at least pandas 0.25 and scikit-learn 0.24.
Moreover, if sksurv.ensemble.GradientBoostingSurvivalAnalysis or sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis is fit with loss='coxph', predict_cumulative_hazard_function and predict_survival_function are now available.
sksurv.metrics.cumulative_dynamic_auc() now supports evaluating time-dependent predictions, for instance for a sksurv.ensemble.RandomSurvivalForest as illustrated in the User Guide.
Bug fixes#
Allow passing pandas data frames to all fit and predict methods (#148).
Allow sparse matrices to be passed to sksurv.ensemble.GradientBoostingSurvivalAnalysis.predict().
Fix example in user guide using GridSearchCV to determine alphas for CoxnetSurvivalAnalysis (#186).
Enhancements#
Add score method to sksurv.meta.Stacking, sksurv.meta.EnsembleSelection, and sksurv.meta.EnsembleSelectionRegressor (#151).
Add support for predict_cumulative_hazard_function and predict_survival_function to sksurv.ensemble.GradientBoostingSurvivalAnalysis and sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis if model was fit with loss='coxph'.
Add support for time-dependent predictions to sksurv.metrics.cumulative_dynamic_auc(). See the User Guide for an example (#134).
Backwards incompatible changes#
The score method of sksurv.linear_model.IPCRidge, sksurv.svm.FastSurvivalSVM, and sksurv.svm.FastKernelSurvivalSVM (if rank_ratio is smaller than 1) now converts predictions on log(time) scale to risk scores prior to computing the concordance index.
Support for cvxpy and cvxopt solver in sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM has been dropped. The default solver is now ECOS, which was used by cvxpy (the previous default) internally. Therefore, results should be identical.
Dropped the presort argument from sksurv.tree.SurvivalTree and sksurv.ensemble.GradientBoostingSurvivalAnalysis.
The X_idx_sorted argument in sksurv.tree.SurvivalTree.fit() has been deprecated in scikit-learn 0.24 and has no effect now.
predict_cumulative_hazard_function and predict_survival_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree now return an array of sksurv.functions.StepFunction objects by default. Use return_array=True to get the old behavior.
Support for Python 3.6 has been dropped.
Increase minimum supported versions of dependencies. We now require:
Package         Minimum Version
Pandas          0.25.0
scikit-learn    0.24.0
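To see why the score method needs to convert predictions on the log(time) scale before computing the concordance index, consider the plain-Python sketch below. The concordance_index function is a simplified teaching version, not the library's implementation, and negating the prediction is an assumed conversion chosen for illustration. A model that predicts log(time) assigns larger values to longer survival, whereas the concordance index expects larger values to mean higher risk.

```python
def concordance_index(time, event, risk):
    """Plain O(n^2) concordance index; a teaching sketch, not sksurv's code.

    A pair (i, j) is comparable if the sample with the shorter time
    experienced an event. Ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if time[i] < time[j] and event[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


time = [2.0, 4.0, 6.0, 8.0]
event = [True, True, False, True]
# Hypothetical model output on the log(time) scale: larger = longer survival.
pred_log_time = [0.5, 1.5, 1.2, 2.0]

# Assumption: negating the prediction turns it into a risk score.
risk = [-p for p in pred_log_time]

c_converted = concordance_index(time, event, risk)
c_unconverted = concordance_index(time, event, pred_log_time)
```

With no ties in the predictions, the two values sum to one, so skipping the conversion reports 1 minus the intended concordance index.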
scikit-survival 0.14.0 (2020-10-07)#
This release features a complete overhaul of the documentation. It features a new visual design, and the inclusion of several interactive notebooks in the User Guide.
In addition, it includes important bug fixes.
It fixes several bugs in sksurv.linear_model.CoxnetSurvivalAnalysis where predict, predict_survival_function, and predict_cumulative_hazard_function returned wrong values if features of the training data were not centered.
Moreover, the score function of sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis and sksurv.ensemble.GradientBoostingSurvivalAnalysis will now correctly compute the concordance index if loss='ipcwls' or loss='squared'.
Bug fixes#
sksurv.column.standardize() modified data in-place. Data is now always copied.
sksurv.column.standardize() works with integer numpy arrays now.
sksurv.column.standardize() used biased standard deviation for numpy arrays (ddof=0), but unbiased standard deviation for pandas objects (ddof=1). It always uses ddof=1 now. Therefore, the output, if the input is a numpy array, will differ from that of previous versions.
Fixed sksurv.linear_model.CoxnetSurvivalAnalysis.predict_survival_function() and sksurv.linear_model.CoxnetSurvivalAnalysis.predict_cumulative_hazard_function(), which returned wrong values if features of training data were not already centered. This adds an offset_ attribute that accounts for non-centered data and is added to the predicted risk score. Therefore, the outputs of predict, predict_survival_function, and predict_cumulative_hazard_function will be different to previous versions for non-centered data (#139).
Rescale coefficients of sksurv.linear_model.CoxnetSurvivalAnalysis if normalize=True.
Fix score function of sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis and sksurv.ensemble.GradientBoostingSurvivalAnalysis if loss='ipcwls' or loss='squared' is used. Previously, it returned 1.0 - true_cindex.
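The effect of the ddof change in sksurv.column.standardize() can be illustrated with the standard library alone. This sketch does not call scikit-survival; it only contrasts the two divisors (n for ddof=0, n - 1 for ddof=1) on made-up numbers.

```python
import statistics

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population standard deviation, divisor n (corresponds to ddof=0).
std_ddof0 = statistics.pstdev(values)
# Sample standard deviation, divisor n - 1 (corresponds to ddof=1).
std_ddof1 = statistics.stdev(values)

mean = statistics.mean(values)
standardized_ddof0 = [(v - mean) / std_ddof0 for v in values]
standardized_ddof1 = [(v - mean) / std_ddof1 for v in values]
```

Because the ddof=1 estimate is slightly larger, the standardized values shrink a little toward zero compared with the ddof=0 result, which is why outputs for numpy inputs differ from earlier versions.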
Enhancements#
Add sksurv.show_versions() that prints the version of all dependencies.
Add support for pandas 1.1.
Include interactive notebooks in documentation on readthedocs.
Add user guide on penalized Cox models.
Add user guide on gradient boosted models.
scikit-survival 0.13.1 (2020-07-04)#
This release fixes warnings that were introduced with 0.13.0.
Bug fixes#
Explicitly pass return_array=True in sksurv.tree.SurvivalTree.predict() to avoid FutureWarning.
Fix error when fitting sksurv.tree.SurvivalTree with non-float dtype for time (#127).
Fix RuntimeWarning: invalid value encountered in true_divide in sksurv.nonparametric.kaplan_meier_estimator().
Fix PendingDeprecationWarning about use of matrix when fitting sksurv.svm.FastSurvivalSVM if optimizer is PRSVM or simple.
scikit-survival 0.13.0 (2020-06-28)#
The highlights of this release include the addition of sksurv.metrics.brier_score() and sksurv.metrics.integrated_brier_score() and compatibility with scikit-learn 0.23.
predict_survival_function and predict_cumulative_hazard_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree can now return an array of sksurv.functions.StepFunction, similar to sksurv.linear_model.CoxPHSurvivalAnalysis, by specifying return_array=False. This will be the default behavior starting with 0.14.0.
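For readers unfamiliar with step-function outputs, the following sketch shows the general idea of a right-continuous step function that can be evaluated at one point or at several points in a single call. ToyStepFunction is a hypothetical stand-in written for this note, not sksurv.functions.StepFunction itself, and its handling of values outside the domain is an assumption.

```python
from bisect import bisect_right


class ToyStepFunction:
    """Right-continuous step function; hypothetical, not sksurv's class."""

    def __init__(self, x, y):
        if len(x) != len(y) or not x:
            raise ValueError("x and y must be non-empty and of equal length")
        if any(a >= b for a, b in zip(x, x[1:])):
            raise ValueError("x must be strictly increasing")
        self.x = list(x)
        self.y = list(y)

    def _value_at(self, t):
        # Assumption: evaluating outside [x[0], x[-1]] is an error.
        if t < self.x[0] or t > self.x[-1]:
            raise ValueError(f"{t} is outside the domain")
        # Index of the last breakpoint that is <= t.
        return self.y[bisect_right(self.x, t) - 1]

    def __call__(self, t):
        # Accept a single number or an iterable of numbers.
        if isinstance(t, (int, float)):
            return self._value_at(t)
        return [self._value_at(v) for v in t]


surv = ToyStepFunction([0, 1, 3, 5], [1.0, 0.8, 0.5, 0.2])
single = surv(2)
many = surv([0, 1, 2.9, 3, 5])
```

Returning one such object per sample, rather than a plain array, lets each curve be evaluated at arbitrary time points later.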
Note that this release fixes a bug in estimating inverse probability of censoring weights (IPCW), which will affect all estimators relying on IPCW.
Enhancements#
Make build system compatible with PEP-517/518.
Added sksurv.metrics.brier_score() and sksurv.metrics.integrated_brier_score() (#101).
sksurv.functions.StepFunction can now be evaluated at multiple points in a single call.
Update documentation on usage of predict_survival_function and predict_cumulative_hazard_function (#118).
The default value of alpha_min_ratio of sksurv.linear_model.CoxnetSurvivalAnalysis will now depend on the n_samples/n_features ratio. If n_samples > n_features, the default value is 0.0001. If n_samples <= n_features, the default value is 0.01.
Add support for scikit-learn 0.23 (#119).
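The new alpha_min_ratio default listed above amounts to a simple rule. The helper below is a hypothetical function written for illustration, not a scikit-survival API; it only restates the rule from the release note.

```python
def default_alpha_min_ratio(n_samples, n_features):
    """Hypothetical helper restating the rule from the release note."""
    # More samples than features: a smaller ratio, i.e. a longer path
    # toward weak regularization; otherwise the larger ratio is used.
    return 0.0001 if n_samples > n_features else 0.01


tall = default_alpha_min_ratio(n_samples=500, n_features=20)
wide = default_alpha_min_ratio(n_samples=50, n_features=2000)
square = default_alpha_min_ratio(n_samples=100, n_features=100)
```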
Deprecations#
predict_survival_function and predict_cumulative_hazard_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree will return an array of sksurv.functions.StepFunction in the future (as sksurv.linear_model.CoxPHSurvivalAnalysis does). For the old behavior, use return_array=True.
Bug fixes#
Fix deprecation of importing joblib via sklearn.
Fix estimation of censoring distribution for tied times with events. When estimating the censoring distribution, by specifying reverse=True when calling sksurv.nonparametric.kaplan_meier_estimator(), we now consider events to occur before censoring. For tied time points with an event, those with an event are not considered at risk anymore and subtracted from the denominator of the Kaplan-Meier estimator. The change affects all functions relying on inverse probability of censoring weights.
Throw an exception when trying to estimate c-index from incomparable data (#117).
Estimators in sksurv.svm will now throw an exception when trying to fit a model to data with incomparable pairs.
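The tie-handling fix for the censoring distribution can be sketched in plain Python. This is a simplified illustration of the rule described above (events leave the risk set before censored samples at the same time point), not the code used in sksurv.nonparametric.kaplan_meier_estimator(); the function name and data are made up.

```python
def censoring_survival(time, event):
    """Kaplan-Meier estimate of the censoring distribution (teaching sketch).

    Returns the sorted unique time points and the estimated probability of
    remaining uncensored at each of them.
    """
    unique_times = sorted(set(time))
    prob = 1.0
    probabilities = []
    for t in unique_times:
        at_risk = sum(1 for ti in time if ti >= t)
        n_events = sum(1 for ti, ei in zip(time, event) if ti == t and ei)
        n_censored = sum(1 for ti, ei in zip(time, event) if ti == t and not ei)
        # Events are treated as occurring before censoring at a tied time,
        # so samples with an event are subtracted from the denominator.
        denominator = at_risk - n_events
        if n_censored > 0:
            prob *= 1.0 - n_censored / denominator
        probabilities.append(prob)
    return unique_times, probabilities


time = [1, 2, 2, 3]
event = [True, True, False, False]
times, cens_prob = censoring_survival(time, event)
```

At time 2, one event and one censored sample are tied. With the event removed first, the factor is 1 - 1/2; without the fix the denominator would have been 3.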
scikit-survival 0.12 (2020-04-15)#
This release adds support for scikit-learn 0.22, thereby dropping support for older versions. Moreover, the regularization strength of the ridge penalty in sksurv.linear_model.CoxPHSurvivalAnalysis can now be set per feature. If you want one or more features to enter the model unpenalized, set the corresponding penalty weights to zero.
Finally, sklearn.pipeline.Pipeline will now be automatically patched to add support for predict_cumulative_hazard_function and predict_survival_function if the underlying estimator supports it.
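The per-feature ridge penalty mentioned above can be pictured as a weighted sum of squared coefficients. The sketch below assumes the form 0.5 * sum(alpha_j * coef_j ** 2); the exact scaling used inside sksurv.linear_model.CoxPHSurvivalAnalysis may differ, and ridge_penalty is a hypothetical helper.

```python
def ridge_penalty(coef, alpha):
    """Per-feature ridge penalty, assumed to be 0.5 * sum(alpha_j * coef_j**2).

    Sketch only; the scaling inside the library may differ.
    """
    if len(coef) != len(alpha):
        raise ValueError("coef and alpha must have the same length")
    return 0.5 * sum(a * b * b for a, b in zip(alpha, coef))


coef = [0.5, -2.0, 1.0]
uniform = ridge_penalty(coef, [1.0, 1.0, 1.0])
# A zero weight leaves the second feature unpenalized.
partial = ridge_penalty(coef, [1.0, 0.0, 1.0])
```

Setting a weight to zero removes that feature's contribution entirely, so its coefficient is estimated without shrinkage.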
Deprecations#
Add scikit-learn’s deprecation of presort in sksurv.tree.SurvivalTree and sksurv.ensemble.GradientBoostingSurvivalAnalysis.
Add warning that default alpha_min_ratio in sksurv.linear_model.CoxnetSurvivalAnalysis will depend on the ratio of the number of samples to the number of features in the future (#41).
Enhancements#
Add references to API doc of sksurv.ensemble.GradientBoostingSurvivalAnalysis (#91).
Add support for pandas 1.0 (#100).
Add ccp_alpha parameter for Minimal Cost-Complexity Pruning to sksurv.ensemble.GradientBoostingSurvivalAnalysis.
Patch sklearn.pipeline.Pipeline to add support for predict_cumulative_hazard_function and predict_survival_function if the underlying estimator supports it.
Allow per-feature regularization for sksurv.linear_model.CoxPHSurvivalAnalysis (#102).
Clarify API docs of sksurv.metrics.concordance_index_censored() (#96).
scikit-survival 0.11 (2019-12-21)#
This release adds sksurv.tree.SurvivalTree and sksurv.ensemble.RandomSurvivalForest, which are based on the log-rank split criterion.
It also adds the OSQP solver as option to sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM, which will replace the now deprecated cvxpy and cvxopt options in a future release.
This release removes support for sklearn 0.20 and requires sklearn 0.21.
Deprecations#
The cvxpy and cvxopt options for solver in sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM are deprecated and will be removed in a future version. Choosing osqp is the preferred option now.
Enhancements#
Add support for pandas 0.25.
Add OSQP solver option to sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM, which has no additional dependencies.
Fix issue when using cvxpy 1.0.16 or later.
Explicitly specify utf-8 encoding when reading README.rst (#89).
Add sksurv.tree.SurvivalTree and sksurv.ensemble.RandomSurvivalForest (#90).
Bug fixes#
Exclude Cython-generated files from source distribution because they are not forward compatible.
scikit-survival 0.10 (2019-09-02)#
This release adds the ties argument to sksurv.linear_model.CoxPHSurvivalAnalysis to choose between Breslow’s and Efron’s likelihood in the presence of tied event times.
Moreover, sksurv.compare.compare_survival() has been added, which implements the log-rank hypothesis test for comparing the survival function of 2 or more groups.
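The difference between Breslow's and Efron's handling of tied event times shows up in the denominator of the Cox partial likelihood. The sketch below uses the textbook formulas for a single event time; it is not taken from scikit-survival's source, and log_denominator is a hypothetical helper.

```python
import math


def log_denominator(risk_set, tied_events, ties="breslow"):
    """Log of the partial-likelihood denominator for one event time.

    risk_set: exp(linear predictor) of every sample still at risk.
    tied_events: exp(linear predictor) of the samples with an event at
        this time (a subset of the risk set).
    """
    d = len(tied_events)
    total = sum(risk_set)
    if ties == "breslow":
        # Every tied event is compared against the full risk set.
        return d * math.log(total)
    if ties == "efron":
        # The tied events are removed from the risk set gradually.
        tied_total = sum(tied_events)
        return sum(math.log(total - (k / d) * tied_total) for k in range(d))
    raise ValueError(f"unknown ties method: {ties!r}")


risk_set = [1.0, 2.0, 3.0, 4.0]
tied = [1.0, 2.0]
breslow = log_denominator(risk_set, tied, ties="breslow")
efron = log_denominator(risk_set, tied, ties="efron")
```

With a single event at a time point the two methods coincide; they diverge only when several events share the same time.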
Enhancements#
Update API doc of predict function of boosting estimators (#75).
Clarify documentation for GradientBoostingSurvivalAnalysis (#78).
Implement Efron’s likelihood for handling tied event times.
Implement log-rank test for comparing survival curves.
Add support for scipy 1.3.1 (#66).
Bug fixes#
Re-add baseline_survival_ and cum_baseline_hazard_ attributes to sksurv.linear_model.CoxPHSurvivalAnalysis (#76).
scikit-survival 0.9 (2019-07-26)#
This release adds support for sklearn 0.21 and pandas 0.24.
Enhancements#
Add reference to IPCRidge (#65).
Use scipy.special.comb instead of deprecated scipy.misc.comb.
Add support for pandas 0.24 and drop support for 0.20.
Add support for scikit-learn 0.21 and drop support for 0.20 (#71).
Explain use of intercept in ComponentwiseGradientBoostingSurvivalAnalysis (#68).
Bump Eigen to 3.3.7.
Bug fixes#
Disallow scipy 1.3.0 due to scipy regression (#66).
scikit-survival 0.8 (2019-05-01)#
Enhancements#
Add sksurv.linear_model.CoxnetSurvivalAnalysis.predict_survival_function() and sksurv.linear_model.CoxnetSurvivalAnalysis.predict_cumulative_hazard_function() (#46).
Add sksurv.nonparametric.SurvivalFunctionEstimator and sksurv.nonparametric.CensoringDistributionEstimator that wrap sksurv.nonparametric.kaplan_meier_estimator() and provide a predict_proba method for evaluating the estimated function on test data.
Implement censoring-adjusted C-statistic proposed by Uno et al. (2011) in sksurv.metrics.concordance_index_ipcw().
Add estimator of cumulative/dynamic AUC of Uno et al. (2007) in sksurv.metrics.cumulative_dynamic_auc().
Add flchain dataset (see sksurv.datasets.load_flchain()).
Bug fixes#
The tied_time return value of sksurv.metrics.concordance_index_censored() now correctly reflects the number of comparable pairs that share the same time and that are used in computing the concordance index.
Fix a bug in sksurv.metrics.concordance_index_censored() where a pair with risk estimates within tolerance was counted both as concordant and tied.
scikit-survival 0.7 (2019-02-27)#
This release adds support for Python 3.7 and sklearn 0.20.
Changes:
Add support for sklearn 0.20 (#48).
Migrate to py.test (#50).
Explicitly request ECOS solver for sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM.
Add support for Python 3.7 (#49).
Add support for cvxpy >=1.0.
Add support for numpy 1.15.
scikit-survival 0.6 (2018-10-07)#
This release adds support for numpy 1.14 and pandas up to 0.23.
In addition, the new class sksurv.util.Surv makes it easier to construct a structured array from numpy arrays, lists, or a pandas data frame.
Changes:
Support numpy 1.14 and pandas 0.22, 0.23 (#36).
Enable support for cvxopt with Python 3.5+ on Windows (requires cvxopt >=1.1.9).
Add max_iter parameter to sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM.
Fix score function of sksurv.svm.NaiveSurvivalSVM to use concordance index.
sksurv.linear_model.CoxnetSurvivalAnalysis now throws an exception if coefficients get too large (#47).
Add sksurv.util.Surv class to ease constructing a structured array (#26).
scikit-survival 0.5 (2017-12-09)#
This release adds support for scikit-learn 0.19 and pandas 0.21. In turn, support for older versions is dropped, namely Python 3.4, scikit-learn 0.18, and pandas 0.18.
scikit-survival 0.4 (2017-10-28)#
This release adds sksurv.linear_model.CoxnetSurvivalAnalysis, which implements an efficient algorithm to fit Cox’s proportional hazards model with LASSO, ridge, and elastic net penalty.
Moreover, it includes support for Windows with Python 3.5 and later by making the cvxopt package optional.
scikit-survival 0.3 (2017-08-01)#
This release adds sksurv.linear_model.CoxPHSurvivalAnalysis.predict_survival_function() and sksurv.linear_model.CoxPHSurvivalAnalysis.predict_cumulative_hazard_function(), which return the survival function and cumulative hazard function using Breslow’s estimator.
Moreover, it fixes a build error on Windows (#3) and adds the sksurv.preprocessing.OneHotEncoder class, which can be used in a scikit-learn pipeline.
scikit-survival 0.2 (2017-05-29)#
This release adds support for Python 3.6, and pandas 0.19 and 0.20.
scikit-survival 0.1 (2016-12-29)#
This is the initial release of scikit-survival. It combines the implementation of survival support vector machines with the code used in the Prostate Cancer DREAM challenge.