This function takes a selected metric and uses the
z-score (the number of standard deviations from the mean) to
identify outliers across time. This is useful for identifying
weeks with abnormally low collaboration activity, e.g. holiday weeks.
By default the data is grouped by time; a different grouping
variable can be supplied with the group_var argument.
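The z-score logic described above can be sketched in base R. This is a minimal illustration and not necessarily the package's own implementation: it assumes the metric is first averaged within each date and then standardized across dates. The data frame `toy` and its values are hypothetical.

```r
# Hypothetical toy data: two people observed over four weeks.
toy <- data.frame(
  MetricDate = as.Date(rep(
    c("2022-05-01", "2022-05-08", "2022-05-15", "2022-05-22"),
    each = 2
  )),
  Collaboration_hours = c(18, 20, 19, 21, 4, 6, 18, 22)
)

# Step 1 (assumed): average the metric within each date.
by_date <- aggregate(Collaboration_hours ~ MetricDate, data = toy, FUN = mean)

# Step 2: z-score = (value - mean of all values) / standard deviation.
by_date$zscore <-
  (by_date$Collaboration_hours - mean(by_date$Collaboration_hours)) /
  sd(by_date$Collaboration_hours)

by_date
```

In this toy data the third week has a much lower average than the others, so it receives the most negative z-score.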
Value
Returns a data frame with MetricDate (when group_var is not set),
the metric, and the corresponding z-score.
See also
Other Data Validation: check_query(), extract_hr(), flag_ch_ratio(),
flag_em_ratio(), flag_extreme(), flag_outlooktime(), hr_trend(),
hrvar_count(), hrvar_count_all(), hrvar_trend(), identify_churn(),
identify_holidayweeks(), identify_inactiveweeks(), identify_nkw(),
identify_privacythreshold(), identify_shifts(), identify_tenure(),
track_HR_change(), validation_report()
Examples
identify_outlier(pq_data, metric = "Collaboration_hours")
#> # A tibble: 10 × 3
#>    MetricDate Collaboration_hours   zscore
#>    <date>                   <dbl>    <dbl>
#>  1 2022-05-01                19.4  0.601
#>  2 2022-05-08                18.0 -0.621
#>  3 2022-05-15                19.9  1.04
#>  4 2022-05-22                17.6 -0.997
#>  5 2022-05-29                18.0 -0.645
#>  6 2022-06-05                18.7 -0.00241
#>  7 2022-06-12                21.1  2.20
#>  8 2022-06-19                17.9 -0.736
#>  9 2022-06-26                18.1 -0.541
#> 10 2022-07-03                18.4 -0.299
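
Once the z-scores are available, a cutoff can be applied to pick out the unusual weeks. The sketch below rebuilds the example output above as a plain data frame and keeps rows whose absolute z-score exceeds a threshold. The name `res` and the cutoff of 2 are assumptions for illustration; the function itself does not prescribe a threshold.

```r
# Values copied from the example output above.
res <- data.frame(
  MetricDate = as.Date(c(
    "2022-05-01", "2022-05-08", "2022-05-15", "2022-05-22", "2022-05-29",
    "2022-06-05", "2022-06-12", "2022-06-19", "2022-06-26", "2022-07-03"
  )),
  Collaboration_hours = c(19.4, 18.0, 19.9, 17.6, 18.0,
                          18.7, 21.1, 17.9, 18.1, 18.4),
  zscore = c(0.601, -0.621, 1.04, -0.997, -0.645,
             -0.00241, 2.20, -0.736, -0.541, -0.299)
)

# Hypothetical cutoff: flag weeks more than 2 standard deviations
# away from the mean, in either direction.
threshold <- 2
flagged <- res[abs(res$zscore) > threshold, ]

flagged
```

To look only for abnormally low weeks, such as holidays, the condition could be changed to `res$zscore < -threshold`.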