create_features_for_single_col
- SimpleSurveyFeatureFactoryMissingValuesBothSides.create_features_for_single_col(variable: SurveyVariable, col_data: Series) → Tuple[ndarray, ndarray]
Create a set of binary features given a variable and a column of the data containing all answers to the question corresponding to variable.
A factory method, to be overridden by subclasses.
Parameters
- variable : SurveyVariable
A survey variable.
- col_data : pandas.Series
The respondents’ answers to the question corresponding to variable.
Returns
- tuple[np.ndarray, np.ndarray]
The first entry is a matrix of shape (len(col_data), num_features) that contains the features as (oriented) indicator vectors. Here, num_features is the number of features created by this method.
The second entry is an array of shape (num_features,) that contains metadata for each feature. The metadata of each feature is expected to be a tuple (operation, value). For example, the tuple ('>=', 8) describes a feature (or separation) that splits the column into a group that answered less than 8 and one that answered at least 8.
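To make the return format concrete, here is a minimal sketch of a function that produces a pair with the shapes described above. It is not the library's implementation: the helper name threshold_features_sketch, the choice of one '>=' feature per distinct answer above the minimum, the 0/1 encoding of the indicators, and the treatment of missing answers (they simply get 0) are all assumptions made for illustration.

```python
from typing import Tuple

import numpy as np
import pandas as pd


def threshold_features_sketch(col_data: pd.Series) -> Tuple[np.ndarray, np.ndarray]:
    """Hypothetical helper: one '>=' feature per distinct answer above the minimum."""
    # Distinct non-missing answers, sorted ascending.
    values = np.sort(col_data.dropna().unique())
    # A split at the minimum would put every respondent on the same side.
    thresholds = values[1:]

    # Feature matrix of shape (len(col_data), num_features), filled with 0/1.
    features = np.zeros((len(col_data), len(thresholds)), dtype=int)
    # Metadata array of shape (num_features,), holding (operation, value) tuples.
    metadata = np.empty(len(thresholds), dtype=object)

    for j, threshold in enumerate(thresholds):
        # Missing answers compare as False here, so they end up as 0 (an assumption).
        features[:, j] = (col_data >= threshold).to_numpy()
        metadata[j] = (">=", threshold)
    return features, metadata


answers = pd.Series([3, 8, 10, 8, 5])
features, metadata = threshold_features_sketch(answers)
print(features.shape)  # (len(col_data), num_features)
print(metadata.shape)  # (num_features,)
```

Column j of the feature matrix and entry j of the metadata array describe the same feature, which is what allows a caller to map an indicator vector back to a readable split such as ('>=', 8).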