vivainsights.identify_holidayweeks

Detect holiday weeks by scanning for anomalous collaboration hours.

Returns the weeks that appear to be holiday weeks and, optionally, a copy of the DataFrame with those outlier weeks removed. By default, missing values are excluded.

vivainsights.identify_holidayweeks.identify_holidayweeks(data, sd=1, return_type='text', figsize=None)

Detect holiday weeks by scanning for anomalous collaboration hours.

Scans a standard person query for weeks where collaboration hours fall more than sd standard deviations below the mean. As a best practice, run this function before other analyses so that atypical weeks can be removed from the dataset.
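
The thresholding idea can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the helper name flag_low_weeks, the (week, hours) row layout, averaging hours per week before comparing, and the use of the sample standard deviation are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_low_weeks(rows, sd=1.0):
    """Hypothetical helper (not part of vivainsights).

    rows: iterable of (week, collaboration_hours) pairs, one per person-week.
    Returns the weeks whose average hours lie more than `sd` sample standard
    deviations below the mean of the weekly averages.
    """
    by_week = defaultdict(list)
    for week, hours in rows:
        by_week[week].append(hours)

    # Assumption: collapse to one average per week before comparing weeks.
    weekly = {week: mean(values) for week, values in by_week.items()}
    averages = list(weekly.values())
    centre = mean(averages)
    spread = stdev(averages)

    # A week is flagged when its z-score is below -sd.
    return [week for week, avg in weekly.items() if (avg - centre) / spread < -sd]


# Toy data: the week of 2024-12-23 has unusually low collaboration hours.
toy_rows = [
    ("2024-12-09", 20.0), ("2024-12-09", 22.0),
    ("2024-12-16", 21.0), ("2024-12-16", 19.0),
    ("2024-12-23", 5.0), ("2024-12-23", 7.0),
    ("2024-12-30", 20.0), ("2024-12-30", 21.0),
]
flagged = flag_low_weeks(toy_rows, sd=1)
```

In this sketch a larger sd makes the check stricter, so fewer weeks are flagged; a smaller sd flags more weeks.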

Parameters:
  • data (pandas.DataFrame) – Person query data. Must contain MetricDate and Collaboration_hours.

  • sd (int or float, default 1) – Threshold, in standard deviations below the mean, beyond which a week is flagged as an outlier. Must be a positive number.

  • return_type (str, default "text") – One of "text", "labelled_data" / "dirty_data" / "data_dirty", "cleaned_data" / "data_cleaned", "holidayweeks_data", or "plot".

  • figsize (tuple or None, default None) – Figure size (width, height) in inches. If None, (8, 6) is used.

Returns:

A diagnostic string, a filtered dataset, or a line chart depending on return_type.

Return type:

str, pandas.DataFrame, or matplotlib.figure.Figure

Examples

Return a text summary of detected holiday weeks:

>>> import vivainsights as vi
>>> pq_data = vi.load_pq_data()
>>> vi.identify_holidayweeks(pq_data, sd=0.75, return_type="text")

Return a line chart highlighting holiday weeks:

>>> vi.identify_holidayweeks(pq_data, sd=0.75, return_type="plot")

Return a cleaned dataset with holiday weeks removed:

>>> vi.identify_holidayweeks(pq_data, sd=0.75, return_type="cleaned_data")

Return the dataset with holiday weeks labelled:

>>> vi.identify_holidayweeks(pq_data, sd=0.75, return_type="labelled_data")