sksurv.preprocessing.OneHotEncoder

class sksurv.preprocessing.OneHotEncoder(allow_drop=True)

Encode categorical columns with M categories into M-1 columns according to the one-hot scheme.

The order of non-categorical columns is preserved; encoded columns are inserted in place of the original column.
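
The M-1 scheme and the in-place column order can be illustrated with a minimal, library-free sketch. This is not the sksurv implementation: the helper name `one_hot_m_minus_1`, the choice to sort categories and drop the first one, and the `name=level` column naming are all assumptions made for illustration.

```python
def one_hot_m_minus_1(columns, categorical):
    """Encode each categorical column with M categories as M-1 indicator columns.

    columns: dict mapping column name -> list of values (dict order = column order)
    categorical: set of column names to encode
    """
    encoded = {}
    for name, values in columns.items():
        if name not in categorical:
            # Non-categorical columns pass through at their original position.
            encoded[name] = list(values)
            continue
        levels = sorted(set(values))
        # Assumption: the first level is the reference category; it is
        # represented by all indicator columns being 0.
        for level in levels[1:]:
            encoded[f"{name}={level}"] = [1.0 if v == level else 0.0 for v in values]
    return encoded


# Hypothetical toy data.
data = {
    "age": [61, 45, 70],
    "stage": ["I", "II", "III"],   # M = 3 -> 2 indicator columns
    "site": ["A", "A", "A"],       # M = 1 -> 0 indicator columns
    "weight": [80.5, 62.0, 71.3],
}
result = one_hot_m_minus_1(data, categorical={"stage", "site"})
print(list(result))  # ['age', 'stage=II', 'stage=III', 'weight']
```

In this sketch a column with a single category produces no indicator columns at all, which is the situation the allow_drop parameter refers to.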

Parameters: allow_drop (boolean, optional, default: True) – Whether to allow dropping categorical columns that consist of only a single category.
feature_names_

Names of the categorical columns that were encoded.

Type: pandas.Index
categories_

Categories of encoded columns.

Type: dict
encoded_columns_

Names of columns after encoding. Includes the names of non-categorical columns.

Type: list
__init__(allow_drop=True)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__([allow_drop]) Initialize self.
fit(X[, y]) Retrieve categorical columns.
fit_transform(X[, y]) Convert categorical columns to numeric values.
transform(X) Convert categorical columns to numeric values.
fit(X, y=None)

Retrieve categorical columns.

Parameters:
  • X (pandas.DataFrame) – Data to encode.
  • y – Ignored. For compatibility with Pipeline.
Returns:

self – The fitted encoder.

Return type:

object

fit_transform(X, y=None, **fit_params)

Convert categorical columns to numeric values.

Parameters:
  • X (pandas.DataFrame) – Data to encode.
  • y – Ignored. For compatibility with TransformerMixin.
  • fit_params – Ignored. For compatibility with TransformerMixin.
Returns:

Xt – Encoded data.

Return type:

pandas.DataFrame

transform(X)

Convert categorical columns to numeric values.

Parameters: X (pandas.DataFrame) – Data to encode.
Returns: Xt – Encoded data.
Return type: pandas.DataFrame