create_features
- SurveyFeatureFactory.create_features(column_selection: str | int | list | ndarray | Series | range | Callable[[SurveyVariable], bool] | None = None) -> Tuple[ndarray, ndarray]
Create features for a set of variables corresponding to the columns specified by column_selection.
Parameters
- column_selection : str, int, list, np.ndarray, pd.Series, range, QuestionSelector or None
If None, all columns are taken into account. Otherwise, only a subset is considered. The parameter is interpreted as described in Survey.interpret_column_selection().
Returns
- tuple[np.ndarray, np.ndarray]
The first entry is a matrix of shape (len(col_data), num_features) that contains the features as (oriented) indicator vectors. Here, num_features is the number of features created by this method.
The second entry is an array of shape (num_features,) that contains metadata for each feature. Every entry of this array is a tuple (column_name, operation, value). For example, the tuple ('feature_name', '>=', 8) describes the feature (or separation) ‘feature_name’ that splits the column into a group that answered less than 8 and one that answered at least 8.
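To make the shape of the return value concrete, the following is a minimal sketch that builds a comparable pair of arrays with plain NumPy. It does not call SurveyFeatureFactory and is not the library's implementation: the helper name threshold_features, the sample answers, and the 0/1 encoding of the indicator vectors are all assumptions made for illustration (the library's "oriented" indicators may be encoded differently).

```python
import numpy as np


def threshold_features(column_name, col_data, thresholds):
    """Hypothetical helper: one '>=' indicator feature per threshold."""
    col_data = np.asarray(col_data)
    # Feature matrix of shape (len(col_data), num_features).
    # Assumption: indicators are encoded as 0/1.
    features = np.column_stack(
        [(col_data >= t).astype(int) for t in thresholds]
    )
    # Metadata array of shape (num_features,); each entry is a
    # (column_name, operation, value) tuple.
    metadata = np.empty(len(thresholds), dtype=object)
    for i, t in enumerate(thresholds):
        metadata[i] = (column_name, ">=", t)
    return features, metadata


# Made-up answers for a single survey column.
answers = [3, 8, 10, 7, 9]
features, metadata = threshold_features("feature_name", answers, [8])

print(features.shape)   # (5, 1)
print(features[:, 0])   # 1 where the answer is at least 8, else 0
print(metadata[0])      # ('feature_name', '>=', 8)
```

Column j of the feature matrix is described by entry j of the metadata array, so the two return values can be iterated over in parallel.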