sksurv.kernels.clinical_kernel

sksurv.kernels.clinical_kernel(x, y=None, *, ordinal_categories=None)

Computes the clinical kernel.

The clinical kernel distinguishes between continuous, ordinal, and nominal variables. Kernel values are normalized to lie within [0, 1].

See [1] for further description.

Parameters:
  • x (pandas.DataFrame or polars.DataFrame, shape = (n_samples_x, n_features)) – Training data. Polars and pandas inputs must not be mixed between x and y.

  • y (pandas.DataFrame or polars.DataFrame, shape = (n_samples_y, n_features), optional) – Testing data. Must use the same dataframe library as x. If omitted, the kernel is computed between the rows of x, as in the example below.

  • ordinal_categories (mapping of str to sequence of labels, optional) – Columns to treat as ordinal, mapped to their category order, e.g. {"stage": ["I", "II", "III", "IV"]}. Backend-independent. pandas Categorical(ordered=True) columns are additionally auto-detected.
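To illustrate why the category order matters for ordinal columns, here is a minimal sketch in plain Python. The helper `ordinal_similarity` is hypothetical and not part of sksurv; it assumes the usual clinical-kernel form in which two labels are compared by the distance between their ranks, scaled by the largest possible rank distance. The library's exact range computation may differ (for example, it may derive the range from the observed data).

```python
def ordinal_similarity(a, b, order):
    """Hypothetical helper: similarity in [0, 1] between two ordinal labels.

    `order` lists the categories from lowest to highest, like one value of
    the `ordinal_categories` mapping.
    """
    rank = {label: i for i, label in enumerate(order)}
    span = len(order) - 1  # largest possible rank distance
    return (span - abs(rank[a] - rank[b])) / span


stage_order = ["I", "II", "III", "IV"]

print(ordinal_similarity("I", "II", stage_order))    # adjacent stages: high similarity
print(ordinal_similarity("I", "IV", stage_order))    # opposite ends: lowest similarity
print(ordinal_similarity("III", "III", stage_order)) # identical labels: highest similarity
```

Under this assumption, reordering the list changes which labels count as neighbours, so the order passed for each column directly affects the kernel values.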

Returns:

kernel – Kernel matrix.

Return type:

array, shape = (n_samples_x, n_samples_y)

References

Examples

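Pandas input. Ordinal columns use the category order from pd.Categorical(ordered=True). Because no categories argument is passed here, pandas sorts the labels, so the order is high < low < medium.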

>>> import pandas as pd
>>> from sksurv.kernels import clinical_kernel
>>>
>>> data = pd.DataFrame({
...     'feature_num': [1.0, 2.0, 3.0],
...     'feature_ord': pd.Categorical(['low', 'medium', 'high'], ordered=True),
...     'feature_nom': pd.Categorical(['A', 'B', 'A'])
... })
>>>
>>> kernel_matrix = clinical_kernel(data)
>>> print(kernel_matrix)
[[1.         0.33333333 0.5       ]
 [0.33333333 1.         0.16666667]
 [0.5        0.16666667 1.        ]]
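The values in this matrix can be checked by hand. The following is a rough sketch in plain Python, not sksurv's implementation. It assumes that continuous and ordinal features contribute `(range - |a - b|) / range`, that nominal features contribute 1 on an exact match and 0 otherwise, and that the kernel value is the average over all features. The names `scaled_similarity` and `kernel_entry` are hypothetical.

```python
# Illustration only; a hand computation under the assumptions stated above.
rows = [
    {"feature_num": 1.0, "feature_ord": "low", "feature_nom": "A"},
    {"feature_num": 2.0, "feature_ord": "medium", "feature_nom": "B"},
    {"feature_num": 3.0, "feature_ord": "high", "feature_nom": "A"},
]

# pandas sorts categories when none are passed explicitly, so the ordered
# categorical in the example ranks high=0, low=1, medium=2.
ord_rank = {label: i for i, label in enumerate(sorted({"low", "medium", "high"}))}


def scaled_similarity(a, b, span):
    """Continuous/ordinal part: 1 for equal values, 0 at the full range."""
    return (span - abs(a - b)) / span


def kernel_entry(r, s, num_span, ord_span):
    """Average the per-feature similarities of two rows."""
    parts = [
        scaled_similarity(r["feature_num"], s["feature_num"], num_span),
        scaled_similarity(ord_rank[r["feature_ord"]], ord_rank[s["feature_ord"]], ord_span),
        1.0 if r["feature_nom"] == s["feature_nom"] else 0.0,  # nominal: exact match
    ]
    return sum(parts) / len(parts)  # averaging keeps the value in [0, 1]


num_values = [r["feature_num"] for r in rows]
num_span = max(num_values) - min(num_values)  # range of the continuous feature
ord_span = len(ord_rank) - 1                  # largest possible rank distance

kernel = [[kernel_entry(r, s, num_span, ord_span) for s in rows] for r in rows]
for line in kernel:
    print([round(v, 8) for v in line])
```

For instance, the first and third rows differ by the full numeric range (contributing 0), are one rank apart on the ordinal feature (contributing 0.5), and share the nominal label (contributing 1), which averages to 0.5.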