vivainsights.extract_hr

Extract HR or organizational attribute columns from a Viva Insights dataset.

The function can return either a list of the variable names or a DataFrame containing only those variables.

vivainsights.extract_hr.extract_hr(data, max_unique=50, exclude_constants=True, return_type='names')

Extract HR attributes (organizational data) by detecting variable class and number of unique values.

Parameters:
  • data (pandas.DataFrame) – Data from which to extract HR variables.

  • max_unique (int) – Maximum number of unique values a column can have to be included. Defaults to 50.

  • exclude_constants (bool) – Whether to exclude columns with only one unique value. Defaults to True.

  • return_type (str) – Output type. One of "names" (default), which prints the column names to the console; "vars", which returns the filtered DataFrame; or "suggestion", which returns a list of column names.

Returns:

Depends on return_type: "vars" returns a DataFrame of HR columns, "suggestion" returns a list of column names, and "names" prints the names to the console and returns None.

Return type:

pandas.DataFrame, list of str, or None
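
The selection rule described above (variable class plus number of unique values) can be illustrated with a minimal sketch. This is not the library's implementation: the helper name sketch_extract_hr is hypothetical, and the assumption that HR attributes are string-like or categorical columns is ours, since the documentation does not state which classes are detected.

```python
import pandas as pd
from pandas.api.types import is_string_dtype


def sketch_extract_hr(data, max_unique=50, exclude_constants=True):
    """Hypothetical sketch: return names of columns that look like HR attributes."""
    selected = []
    for col in data.columns:
        series = data[col]
        # Assumption: HR attributes are string-like or categorical columns.
        is_candidate = is_string_dtype(series) or isinstance(
            series.dtype, pd.CategoricalDtype
        )
        if not is_candidate:
            continue
        n_unique = series.nunique(dropna=True)
        # Skip columns with more distinct values than the threshold allows.
        if n_unique > max_unique:
            continue
        # Optionally skip columns that hold a single value throughout.
        if exclude_constants and n_unique <= 1:
            continue
        selected.append(col)
    return selected


# Toy data made up for illustration only.
toy = pd.DataFrame(
    {
        "PersonId": [f"p{i}" for i in range(6)],
        "Organization": ["Sales", "Sales", "HR", "HR", "IT", "IT"],
        "Region": ["EMEA"] * 6,
        "Email_hours": [1.5, 2.0, 0.5, 3.0, 2.5, 1.0],
    }
)

names = sketch_extract_hr(toy, max_unique=3)
print(names)
```

In this toy data, PersonId exceeds the threshold of 3 unique values, Region is constant, and Email_hours is numeric, so only Organization passes. Passing exclude_constants=False would also keep Region.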

Examples

Print HR variable names to console (default):

>>> import vivainsights as vi
>>> pq_data = vi.load_pq_data()
>>> vi.extract_hr(data=pq_data)

Return the HR columns as a filtered DataFrame:

>>> vi.extract_hr(data=pq_data, return_type="vars")

Return a list of suggested HR column names:

>>> vi.extract_hr(data=pq_data, return_type="suggestion")

Set the maximum unique values threshold explicitly (50 is the default; pass a different value to adjust it):

>>> vi.extract_hr(data=pq_data, max_unique=50, return_type="names")