This function is used as part of data validation to check whether a selected metric contains extreme values in the dataset, judged against a user-supplied threshold.
flag_extreme(
  data,
  metric,
  person = TRUE,
  threshold,
  mode = "above",
  return = "message"
)
data: A Standard Person Query dataset in the form of a data frame.
metric: A character string specifying the metric to test.
person: A logical value specifying whether to calculate person-averages. Defaults to TRUE (person-averages calculated).
threshold: Numeric value specifying the threshold for flagging.
mode: String determining the mode used to identify extreme values. Must be one of:
"above": checks whether the value is greater than the threshold (default)
"equal": checks whether the value is equal to the threshold
"below": checks whether the value is below the threshold
return: String specifying what to return. This must be one of the following strings:
"text"
"message" (default)
"table"
See Value for more information.
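The interaction between person, threshold and mode can be illustrated with a small base R sketch. This is not the package's implementation: flag_rows_sketch() and the toy data frame are hypothetical, and the sketch assumes that a person-average is a plain mean of the metric per PersonId.

```r
# Hypothetical helper, not part of the package: illustrates the flagging logic only.
flag_rows_sketch <- function(data, metric, person = TRUE, threshold,
                             mode = c("above", "equal", "below")) {
  mode <- match.arg(mode)

  if (person) {
    # Assumption: collapse to one row per person by taking the mean of the metric.
    data <- aggregate(data[metric], by = data["PersonId"], FUN = mean)
  }

  values <- data[[metric]]
  keep <- switch(mode,
    above = values > threshold,
    equal = values == threshold,
    below = values < threshold
  )

  # Keep only the identifier and the tested metric for the flagged rows.
  data[keep, c("PersonId", metric), drop = FALSE]
}

# Toy data (made-up values): two persons, two weeks each.
toy <- data.frame(
  PersonId = c("A", "A", "B", "B"),
  Email_hours = c(10, 22, 4, 6)
)

# Person level: only A is flagged, because A's average (16) exceeds 15.
flag_rows_sketch(toy, "Email_hours", person = TRUE, threshold = 15)

# Person-week level: only the single 22-hour row is flagged.
flag_rows_sketch(toy, "Email_hours", person = FALSE, threshold = 15)
```

Note that the same threshold can flag different numbers of cases depending on person, since averaging smooths out individual high weeks.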
A different output is returned depending on the value passed to the return
argument:
"text": string. A diagnostic message.
"message": message on console. A diagnostic message.
"table": data frame. A person-level table with PersonId and the
extreme values of the selected metric.
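One common way to support several output types from a single argument is to validate it with match.arg() and branch with switch(). The sketch below shows that pattern under stated assumptions: report_sketch() is a hypothetical helper, its message wording is simplified, and it is not taken from the package source.

```r
# Hypothetical helper, not part of the package: branches on a `return` argument.
report_sketch <- function(flagged, metric, threshold,
                          return = c("message", "text", "table")) {
  return <- match.arg(return)

  # Build a simplified diagnostic string (wording is illustrative only).
  n <- nrow(flagged)
  text <- if (n > 0) {
    paste0("[Warning] ", n, " flagged value(s) for ", metric,
           " at threshold ", threshold, ".")
  } else {
    paste0("[Pass] No flagged values for ", metric,
           " at threshold ", threshold, ".")
  }

  switch(return,
    text = text,              # character string, for reuse in reports
    message = message(text),  # printed to the console as a message
    table = flagged           # data frame of flagged rows
  )
}

# Made-up flagged rows for illustration.
flagged <- data.frame(PersonId = "A", Email_hours = 16)

out <- report_sketch(flagged, "Email_hours", 15, return = "text")
tab <- report_sketch(flagged, "Email_hours", 15, return = "table")
report_sketch(flagged, "Email_hours", 15, return = "message")
```

Returning a string with "text" is convenient when the diagnostic needs to be embedded in a report, whereas "table" suits follow-up filtering of the flagged persons.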
Other Data Validation:
check_query(),
extract_hr(),
flag_ch_ratio(),
flag_em_ratio(),
flag_outlooktime(),
hr_trend(),
hrvar_count(),
hrvar_count_all(),
hrvar_trend(),
identify_churn(),
identify_holidayweeks(),
identify_inactiveweeks(),
identify_nkw(),
identify_outlier(),
identify_privacythreshold(),
identify_query(),
identify_shifts(),
identify_shifts_wp(),
identify_tenure(),
remove_outliers(),
standardise_pq(),
subject_validate(),
subject_validate_report(),
track_HR_change(),
validation_report()
# The threshold values are intentionally set low to trigger messages.
flag_extreme(sq_data, "Email_hours", threshold = 15)
#> [Warning] There are 48 persons where their average Email hours exceeds 15.
# Return a summary table
flag_extreme(sq_data, "Email_hours", threshold = 15, return = "table")
#> # A tibble: 48 × 2
#>    PersonId                                                         Email_hours
#>    <chr>                                                                  <dbl>
#>  1 00368A5686BF3E6F5189540B9D1434DCDAB931A025E1DA3F73388C45A23B8814        15.3
#>  2 07C28A72D1A47C05D6929A0D724FD0A4C81D945306DBAEF494199E96094469CD        18.3
#>  3 0B865E56B2EF3182F39F73337D67B5384E048674C9C86622EAB501BB81201B33        16.4
#>  4 10F20CF8E76802FCCF9C780A130D2A37874BAEAE1B1892365C9DB42E79C1D18A        17.2
#>  5 1449C12A92FD38EBA80D9CC037C464C515EE47EBC535ADD67E52D5561FADED4E        15.1
#>  6 1B069D86D80625D075798F32D1E67ADCD6156C659AF4BF2B62A47D34EB00138E        15.1
#>  7 2087D9D11ECC9B33440A36A702043B7F81913BA9572B3053F38DAA9C5FC52EDB        15.6
#>  8 2F76D6705EB4FA3B0138C5D8041A4A0708187C0A425343EC7FE7FCB46520D60A        16.4
#>  9 38B55F76510F34DEE6ECA960A04501866278A3105AF30A7A67F5F8A608EB3957        15.1
#> 10 3C2F9AF5D2AF19504D5DB24A78C19EDCA7990C8EEE00EEE239553E6E4A604B32        15.6
#> # ℹ 38 more rows
# Person-week level
flag_extreme(sq_data, "Email_hours", person = FALSE, threshold = 15)
#> [Warning] There are 616 rows where their value of Email hours exceeds 15.
# Check for values equal to threshold
flag_extreme(sq_data, "Email_hours", person = TRUE, mode = "equal", threshold = 0)
#> [Pass] There are no persons where their average Email hours are equal to 0.
# Check for values below threshold
flag_extreme(sq_data, "Email_hours", person = TRUE, mode = "below", threshold = 5)
#> [Warning] There are 7 persons where their average Email hours are less than 5.