create_features_for_single_col

SimpleSurveyFeatureFactoryMissingValuesOwnFeatures.create_features_for_single_col(variable: SurveyVariable, col_data: Series) → Tuple[ndarray, ndarray]

Create a set of binary features given a variable and a column of the data containing all answers to the question corresponding to variable.

A factory method, to be overridden by subclasses.

Parameters

variable : SurveyVariable

A survey variable.

col_data : pandas.Series

The respondents’ answers to the question corresponding to variable.

Returns

tuple[np.ndarray, np.ndarray]

In the first entry, a matrix of shape (len(col_data), num_features), that contains the features as (oriented) indicator vectors. Here, num_features is the number of features created by this method.

In the second entry, a one-dimensional array of shape (num_features,) that contains the metadata for each feature, in the same order as the columns of the first entry.

The metadata of each feature is expected to be a tuple (operation, value). For example, the tuple ('>=', 8) describes a feature (or separation) that splits the column into a group that answered less than 8 and one that answered at least 8.
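The return contract described above can be illustrated with a minimal sketch. The helper `threshold_features` and its `thresholds` argument are hypothetical and not part of this class. The boolean encoding of the indicator vectors is also an assumption, since the actual orientation convention and the handling of missing values are determined by the implementing subclass.

```python
import numpy as np
import pandas as pd


def threshold_features(col_data: pd.Series, thresholds):
    """Hypothetical helper: build one '>=' indicator feature per threshold."""
    values = col_data.to_numpy()
    # Column j is True where the answer is at least thresholds[j].
    # Assumption: booleans stand in for the library's indicator encoding.
    features = np.column_stack([values >= t for t in thresholds])
    # Object array so that each entry holds one (operation, value) tuple.
    metadata = np.empty(len(thresholds), dtype=object)
    for j, t in enumerate(thresholds):
        metadata[j] = (">=", t)
    return features, metadata


answers = pd.Series([3, 8, 10, 7])
features, metadata = threshold_features(answers, [5, 8])
print(features.shape)  # (len(col_data), num_features)
print(metadata.shape)  # (num_features,)
```

Here the second column corresponds to the metadata tuple ('>=', 8) and separates respondents who answered less than 8 from those who answered at least 8.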