create_features
- SurveyFeatureFactoryDecorator.create_features(column_selection: str | int | list | ndarray | Series | range | Callable[[SurveyVariable], bool] | None = None) → Tuple[ndarray, ndarray]
Create features for a set of variables corresponding to the columns specified by column_selection.
Parameters
- column_selection : str, int, list, np.ndarray, pd.Series, range, QuestionSelector or None
If None, all columns are taken into account. Otherwise, a subset will be considered. The parameter is interpreted as described in
Survey.interpret_column_selection().
Returns
- tuple[np.ndarray, np.ndarray]
In the first entry, a matrix of shape (len(col_data), num_features) that contains the features as (oriented) indicator vectors. Here, num_features is the number of features created by this method.
In the second entry, an array of shape (num_features,) that contains metadata for each feature. Every entry of this array is a tuple (column_name, operation, value). For example, the tuple ('feature_name', '>=', 8) describes the feature (or separation) 'feature_name' that splits the column into a group that answered less than 8 and one that answered at least 8.
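To make the shape of the return value concrete, here is a minimal NumPy-only sketch that builds a feature matrix and a metadata array with the layout described above. It does not call the library: `make_threshold_features` is a hypothetical helper, and encoding the indicator vectors as 0/1 is an assumption, since the exact encoding of "oriented" indicator vectors is not stated here.

```python
import numpy as np


def make_threshold_features(column_name, values, thresholds):
    """Hypothetical helper: build one '>=' indicator feature per threshold."""
    values = np.asarray(values)
    # One column per threshold; an entry is 1 if the answer is >= the threshold.
    # (0/1 encoding is an assumption made for this illustration.)
    features = np.column_stack([values >= t for t in thresholds]).astype(int)
    # Object array of (column_name, operation, value) tuples, shape (num_features,).
    metadata = np.empty(len(thresholds), dtype=object)
    for i, t in enumerate(thresholds):
        metadata[i] = (column_name, ">=", t)
    return features, metadata


# Five made-up answers to a single survey column.
answers = [3, 8, 10, 7, 9]
features, metadata = make_threshold_features("feature_name", answers, [5, 8])

print(features.shape)   # (number of answers, num_features)
print(metadata.shape)   # (num_features,)
print(metadata[1])      # metadata tuple for the second feature
print(features[:, 1])   # which respondents answered at least 8
```

Each column `features[:, j]` is described by the tuple `metadata[j]`, so the two return values can be read side by side by iterating over the feature index.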