Survey

class tangles.convenience.Survey(data: DataFrame)

Objects of this class represent survey data and provide methods to prepare, clean, and subset that data.

This class manages a pandas dataframe and a data structure containing information about the variables. It makes sure that the information in both of these objects stays synchronized.

Parameters

data : pandas.DataFrame

A dataframe containing the survey data.

Properties

num_questions

Number of questions (or variables)

num_respondents

Number of respondents

shape

The shape of the data
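The relationship between these three properties can be sketched with plain pandas. This is only an illustration on a made-up dataframe, assuming that rows correspond to respondents and columns to questions (as the descriptions of select_respondents() and select_questions() suggest); it does not call the Survey class itself.

```python
import pandas as pd

# Hypothetical survey data: one row per respondent, one column per question.
df = pd.DataFrame(
    {
        "q1": [1, 2, 2, 1],
        "q2": [3, None, 1, 2],
        "q3": ["yes", "no", "yes", "no"],
    }
)

# Assumption: shape is (rows, columns), i.e. (respondents, questions).
num_respondents, num_questions = df.shape
print(df.shape, num_respondents, num_questions)
```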

Methods

__getitem__()

Retrieve a data element, slice, or subset

check_variables()

Check whether the information in this survey can conveniently be used for a tangle analysis

complete_rows()

Find out which rows are complete

copy()

Create a copy of this Survey

count_number_of_unique_answers()

Count the number of unique answers for selected columns

count_valid_answers_per_respondent()

Count the number of valid answers for each respondent

guess_variable_types()

Guess missing variable types from data

guess_variable_value_lists()

Guess missing variable value lists from data

interpret_column_selection()

Interpret different ways to select a subset of columns (or variables)

load()

Load a Survey from files

replace_variable_value_labels()

Replace variable value labels according to the dictionary mapping

replace_variable_values()

Replace the values found in the selected columns by different values

save()

Save this survey to a folder

select_questions()

Create a new survey containing a subset of the columns

select_respondents()

Create a new survey containing a subset of the rows

set_pyreadstat_meta_data()

Use metadata returned from the python package pyreadstat to set properties of the variables (or questions)

set_valid_and_invalid_variable_values()

Set the dictionaries of valid and invalid values (and their labels) for selected variables

set_variable_labels()

Set the labels (often the question text) of selected variables

set_variable_names()

Replace the names of selected variables

set_variable_types()

Set the type of the variables specified by column_selection

summarize()

Create a summary of some interesting aspects of this survey

variable_info()

A pandas dataframe containing information about the variables (or questions)
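To make a few of the summaries above more concrete, the following sketch computes comparable quantities with plain pandas on a made-up dataframe. It is not the implementation used by Survey: in particular, it assumes that a "valid" answer simply means a non-missing one, whereas the class can track explicit valid and invalid values per variable.

```python
import pandas as pd

# Hypothetical survey data: one row per respondent, one column per question.
df = pd.DataFrame(
    {
        "q1": [1, 2, 2, 1],
        "q2": [3, None, 1, 2],
        "q3": ["yes", "no", None, "no"],
    }
)

# Rows without any missing answer (compare complete_rows()).
complete = df.notna().all(axis=1)

# Number of distinct non-missing answers per column
# (compare count_number_of_unique_answers()).
unique_counts = df.nunique()

# Number of non-missing answers per respondent
# (compare count_valid_answers_per_respondent(); "valid" is assumed
# to mean "not missing" in this sketch).
valid_per_respondent = df.notna().sum(axis=1)

print(complete.tolist())
print(unique_counts.to_dict())
print(valid_per_respondent.tolist())
```

Boolean masks such as `complete` can be passed to `df[complete]` to keep only the fully answered rows of the dataframe.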