feature_factory_all_splits_ge#

tangles.convenience.feature_factory_all_splits_ge(single_col_data: Series | ndarray, invalid_values: list | ndarray | None = None) -> Tuple[ndarray, ndarray]#

A feature factory function that splits the range of a variable into two subsets, using each unique value in the variable’s range as a threshold. Recommended for ordinal variables.

The function creates one feature for every unique value the variable can take except the smallest, since a threshold at the smallest value would include every respondent and split nothing.

Each feature describes the respondents whose answer is greater than or equal to the unique value used as its threshold.
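The thresholding described above can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the helper name `all_splits_ge_sketch` is hypothetical, it ignores `invalid_values`, and it returns boolean columns plus the thresholds, which may differ from the feature encoding and metadata format that `feature_factory_all_splits_ge` actually returns.

```python
import numpy as np


def all_splits_ge_sketch(values):
    """Hypothetical illustration of 'greater or equal' splits (not the tangles code)."""
    values = np.asarray(values)
    # Every unique value except the smallest serves as a threshold.
    thresholds = np.unique(values)[1:]
    # Column j is True for the entries that are >= thresholds[j].
    features = values[:, None] >= thresholds[None, :]
    return features, thresholds


# Five respondents answering on an ordinal scale from 1 to 3.
answers = np.array([1, 3, 2, 3, 1])
features, thresholds = all_splits_ge_sketch(answers)

print(thresholds)            # [2 3]
print(features.astype(int))  # rows: [0 0], [1 1], [1 0], [1 1], [0 0]
```

With three unique answers the sketch produces two features: "answer >= 2" and "answer >= 3".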

Parameters#

single_col_data : pd.Series or np.ndarray

The data of a single column (one variable) from which the features are created.

invalid_values : list or np.ndarray, optional

The values in single_col_data that are to be treated as invalid. Defaults to None.

Returns#

tuple[np.ndarray, np.ndarray]

The features in the first entry and the corresponding metadata in the second entry.