sksurv.io.writearff
- sksurv.io.writearff(data, filename, relation_name=None, index=True)
Write ARFF file
- Parameters:
data (pandas.DataFrame or polars.DataFrame) – Polars input is converted to pandas internally; pl.Enum columns keep their declared categories (including unseen labels) in the ARFF header.
filename (str or file-like object) – Path to ARFF file or file-like object. In the latter case, this function closes the handle.
relation_name (str, optional, default: 'pandas') – Name of relation in ARFF file.
index (boolean, optional, default: True) – Write row names (index). Only relevant for pandas input; other dataframe libraries have no row-index concept, so the value is ignored.
See also
loadarff – Function to read ARFF files.
Examples
>>> import tempfile
>>> from pathlib import Path
>>> import numpy as np
>>> import pandas as pd
>>> from sksurv.io import writearff
>>>
>>> # Create a dummy DataFrame
>>> data = pd.DataFrame({
...     'feature1': [1.0, 3.0, 5.0],
...     'feature2': [2.0, np.nan, 6.0],
...     'class': ['A', 'B', 'C']
... }, index=['One', 'Two', 'Three'])
>>>
>>> # Write to a temporary directory so the CWD stays clean.
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     path = Path(tmpdir) / "data.arff"
...     writearff(data, str(path), relation_name='test_data')
...     print(path.read_text())
@relation test_data
@attribute index {One,Three,Two}
@attribute feature1 real
@attribute feature2 real
@attribute class {A,B,C}
@data
One,1.0,2.0,A
Two,3.0,?,B
Three,5.0,6.0,C
Polars input is accepted as well. pl.Enum columns preserve their declared category list (including labels absent from the data) in the resulting ARFF header.

>>> import polars as pl
>>> data_pl = pl.DataFrame({
...     'feature1': [1.0, 3.0, 5.0],
...     'class': pl.Series(['A', 'B', 'C'], dtype=pl.Enum(['A', 'B', 'C'])),
... })
>>> with tempfile.TemporaryDirectory() as tmpdir:
...     path = Path(tmpdir) / "data.arff"
...     writearff(data_pl, str(path), relation_name='test_data')