Flags extreme values of a selected metric in the dataset. This is used as part of data validation.

flag_extreme(
  data,
  metric,
  person = TRUE,
  threshold,
  mode = "above",
  return = "message"
)

Arguments

data

A Standard Person Query dataset in the form of a data frame.

metric

A character string specifying the metric to test.

person

A logical value to specify whether to calculate person-averages. Defaults to TRUE (person-averages calculated).

threshold

Numeric value specifying the threshold for flagging.

mode

String specifying the comparison used to identify extreme values. Valid options are:

  • "above": checks whether value is greater than the threshold (default)

  • "equal": checks whether value is equal to the threshold

  • "below": checks whether value is below the threshold
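
The interaction between person and mode can be illustrated with a minimal base R sketch. This is not the function's actual implementation: the toy data frame and the is_extreme() helper are hypothetical, and the comparison operators are assumed from the descriptions above.

```r
# Toy stand-in for a Standard Person Query: one row per person-week.
# (Hypothetical data, not the package's sq_data.)
toy <- data.frame(
  PersonId    = c("A", "A", "B", "B", "C", "C"),
  Email_hours = c(10, 22, 4, 5, 15, 15)
)

# Hypothetical helper, not part of the package: applies one of the three modes.
is_extreme <- function(x, threshold, mode = "above") {
  switch(
    mode,
    above = x > threshold,
    equal = x == threshold,
    below = x < threshold,
    stop("mode must be one of 'above', 'equal', or 'below'")
  )
}

# person = TRUE analogue: average each person's rows before comparing.
person_avg <- aggregate(Email_hours ~ PersonId, data = toy, FUN = mean)

above_p <- person_avg[is_extreme(person_avg$Email_hours, 15, "above"), ]
equal_p <- person_avg[is_extreme(person_avg$Email_hours, 15, "equal"), ]
below_p <- person_avg[is_extreme(person_avg$Email_hours, 15, "below"), ]

# person = FALSE analogue: compare every person-week row directly.
above_rows <- toy[is_extreme(toy$Email_hours, 15, "above"), ]
```

In this toy data, person A averages 16 hours and is flagged by "above", person C averages exactly 15 and is flagged only by "equal", and person B averages 4.5 and is flagged by "below". At the row level, only A's 22-hour week exceeds the threshold.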

return

String specifying what to return. This must be one of the following strings:

  • "text"

  • "message"

  • "table"

See Value for more information.

Value

A different output is returned depending on the value passed to the return argument:

  • "text": string. A diagnostic message.

  • "message": message on console. A diagnostic message.

  • "table": data frame. A person-level table with PersonId and the extreme values of the selected metric.
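
The practical difference between the three options is whether the result is handed back to the caller or only printed. The following is a rough sketch of that pattern, assuming a hypothetical report_flags() helper and made-up message wording; it does not reproduce the function's internals.

```r
# Hypothetical helper, not the package's code: surfaces one result in three
# ways depending on `return`.
report_flags <- function(flagged, return = "message") {
  text <- paste("Flagged", nrow(flagged), "row(s).")
  if (return == "text") {
    text                 # character string handed back to the caller
  } else if (return == "message") {
    message(text)        # printed to the console as a message
    invisible(NULL)      # assumption: nothing useful is returned in this sketch
  } else if (return == "table") {
    flagged              # data frame handed back to the caller
  } else {
    stop("return must be one of 'text', 'message', or 'table'")
  }
}

# Hypothetical flagged rows.
flagged <- data.frame(PersonId = "A", Email_hours = 16)

msg <- report_flags(flagged, return = "text")    # store the string
report_flags(flagged, return = "message")        # console output only
tbl <- report_flags(flagged, return = "table")   # store the data frame
```

Under this pattern, "text" suits cases where the diagnostic is embedded in a larger report, while "table" suits follow-up analysis of the flagged rows.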

Examples

# The threshold values are intentionally set low to trigger messages.
flag_extreme(sq_data, "Email_hours", threshold = 15)
#> [Warning] There are 58 persons where their average Email hours exceeds 15.

# Return a summary table
flag_extreme(sq_data, "Email_hours", threshold = 15, return = "table")
#> # A tibble: 58 × 2
#>    PersonId                                                         Email_hours
#>    <chr>                                                                  <dbl>
#>  1 00368A5686BF3E6F5189540B9D1434DCDAB931A025E1DA3F73388C45A23B8814        15.1
#>  2 05777AAC2F33DF19EB3CFFDB73E1ADE501D4ED9048B7CBF79E31BA545E972644        15.7
#>  3 07C28A72D1A47C05D6929A0D724FD0A4C81D945306DBAEF494199E96094469CD        18.1
#>  4 0B865E56B2EF3182F39F73337D67B5384E048674C9C86622EAB501BB81201B33        15.5
#>  5 10F20CF8E76802FCCF9C780A130D2A37874BAEAE1B1892365C9DB42E79C1D18A        17.0
#>  6 1EE5750C86AB036EF7A5F3C948E9C7A4D4424BB366E3752D788A56494C8EB474        16.9
#>  7 1EFA17977AB9136E0CD54D28775F058851CD89CFFB01C666E6A1170075DC4356        16.1
#>  8 1F70108AAA964A8C2E88DA666B0BA15EBA0B0BC1970A1BE0575266611C0CE69C        15.6
#>  9 2087D9D11ECC9B33440A36A702043B7F81913BA9572B3053F38DAA9C5FC52EDB        15.4
#> 10 24E9403ACF368A82EAE889DBAFF9785200C4D635DA7E034A3745FD72C6E1B3A9        19.7
#> # … with 48 more rows

# Person-week level
flag_extreme(sq_data, "Email_hours", person = FALSE, threshold = 15)
#> [Warning] There are 1641 rows where their value of Email hours exceeds 15.

# Check for values equal to threshold
flag_extreme(sq_data, "Email_hours", person = TRUE, mode = "equal", threshold = 0)
#> [Pass] There are no persons where their average Email hours are equal to 0.

# Check for values below threshold
flag_extreme(sq_data, "Email_hours", person = TRUE, mode = "below", threshold = 5)
#> [Warning] There are 11 persons where their average Email hours are less than 5.