vivainsights.identify_outlier

Identify outlier weeks using z-scores for a selected metric.

This is useful for identifying weeks with abnormally low collaboration activity, such as holiday weeks. The default time-based grouping can be overridden with the group_var argument.

vivainsights.identify_outlier.identify_outlier(data, group_var='MetricDate', metric='Collaboration_hours')

Identify outlier weeks using z-scores for a metric.

Computes the mean of the metric per group (default: MetricDate) and the corresponding z-scores to flag outliers. Useful for detecting weeks with abnormally low collaboration, e.g. holidays.
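The computation described above can be sketched with plain pandas. This is an illustrative sketch only, not the library's implementation: the helper name zscore_by_group and the toy data are hypothetical, and the choice of the sample standard deviation (ddof=1) is an assumption, so z-score magnitudes may differ from what identify_outlier returns.

```python
import pandas as pd

# Toy person-level data: three people observed over four weeks.
# Column names mirror the defaults in the signature above.
toy = pd.DataFrame({
    "PersonId": ["a", "b", "c"] * 4,
    "MetricDate": (["2024-01-01"] * 3 + ["2024-01-08"] * 3
                   + ["2024-01-15"] * 3 + ["2024-01-22"] * 3),
    "Collaboration_hours": [20, 22, 21, 19, 23, 21, 5, 6, 4, 21, 20, 22],
})


def zscore_by_group(data, group_var="MetricDate", metric="Collaboration_hours"):
    """Hypothetical helper: mean of the metric per group, then z-scores of those means."""
    means = data.groupby(group_var)[metric].mean().to_frame()
    # Assumption: sample standard deviation (ddof=1); the library may use another convention.
    centered = means[metric] - means[metric].mean()
    means["zscore"] = centered / means[metric].std(ddof=1)
    return means


result = zscore_by_group(toy)
lowest = result["zscore"].idxmin()  # the week with the most negative z-score
print(result)
```

In this toy data the third week has a much lower mean than the others, so it receives the most negative z-score.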

Parameters:
  • data (pandas.DataFrame) – Person query data.

  • group_var (str, default "MetricDate") – Name of the column to group by.

  • metric (str, default "Collaboration_hours") – Name of the metric column.

Returns:

A DataFrame indexed by group_var with the metric mean and a zscore column.

Return type:

pandas.DataFrame

Examples

Detect outlier groups using the default grouping variable:

>>> import vivainsights as vi
>>> pq_data = vi.load_pq_data()
>>> vi.identify_outlier(pq_data, metric="Collaboration_hours")

Specify a custom grouping variable:

>>> vi.identify_outlier(pq_data, metric="Collaboration_hours", group_var="LevelDesignation")
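Once z-scores are available, a common next step is to keep only the groups below some cutoff. The sketch below uses a hand-built stand-in for the returned DataFrame, with made-up values, so that it runs without the package. The zscore column name follows the Returns description above, while the threshold of -1.0 is a hypothetical choice, not a library default.

```python
import pandas as pd

# Stand-in for the returned DataFrame: indexed by the grouping variable,
# with the metric mean and a z-score column (values invented for illustration).
out = pd.DataFrame(
    {
        "Collaboration_hours": [21.0, 21.0, 5.0, 21.0],
        "zscore": [0.5, 0.5, -1.5, 0.5],
    },
    index=pd.Index(
        ["2024-01-01", "2024-01-08", "2024-01-15", "2024-01-22"],
        name="MetricDate",
    ),
)

threshold = -1.0  # hypothetical cutoff for "abnormally low"
low_weeks = out[out["zscore"] < threshold]
low_dates = low_weeks.index.tolist()
print(low_dates)
```

The appropriate cutoff depends on how many groups there are and how variable the metric is, so it is worth inspecting the full z-score column before settling on one.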