Flags extreme values of a selected metric in the dataset. This is used as part of data validation.

flag_extreme(
  data,
  metric,
  person = TRUE,
  threshold,
  mode = "above",
  return = "message"
)

Arguments

data

A Standard Person Query dataset in the form of a data frame.

metric

A character string specifying the metric to test.

person

A logical value specifying whether to calculate person-averages before testing. Defaults to TRUE (person-averages calculated). If FALSE, each row (person-week) is tested individually.

threshold

Numeric value specifying the threshold for flagging.

mode

String specifying the comparison used to identify extreme values. This must be one of the following strings:

  • "above": checks whether the value is greater than the threshold (default)

  • "equal": checks whether the value is equal to the threshold

  • "below": checks whether the value is below the threshold
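
The three modes can be illustrated with a minimal base R sketch. This is not the function's actual implementation: the toy data and the `is_extreme()` helper are hypothetical and only mirror the comparisons described above, applied to person-averages as with person = TRUE.

```r
# Hypothetical toy data; column names mirror those used in the examples.
toy <- data.frame(
  PersonId    = c("A", "A", "B", "B", "C", "C"),
  Email_hours = c(16, 18, 4, 2, 15, 15)
)

# Person-averages, analogous to person = TRUE
avg <- aggregate(Email_hours ~ PersonId, data = toy, FUN = mean)

# Hypothetical helper: one strict or exact comparison per mode
is_extreme <- function(x, threshold, mode = c("above", "equal", "below")) {
  mode <- match.arg(mode)
  switch(mode,
    above = x > threshold,
    equal = x == threshold,
    below = x < threshold
  )
}

threshold <- 15
above <- avg[is_extreme(avg$Email_hours, threshold, "above"), ]  # person A (average 17)
equal <- avg[is_extreme(avg$Email_hours, threshold, "equal"), ]  # person C (average 15)
below <- avg[is_extreme(avg$Email_hours, threshold, "below"), ]  # person B (average 3)
```

Note that a value exactly equal to the threshold is flagged only by "equal" in this sketch, since "above" and "below" are treated as strict comparisons.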

return

String specifying what to return. This must be one of the following strings:

  • "text"

  • "message"

  • "table"

See Value for more information.

Value

A different output is returned depending on the value passed to the return argument:

  • "text": string. A diagnostic message.

  • "message": message on console. A diagnostic message.

  • "table": data frame. A person-level table with PersonId and the extreme values of the selected metric.
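
As a brief sketch of how the non-default return values might be used in a validation script: the calls below assume that `flag_extreme()` and `sq_data` come from the wpa package (the package name is an assumption, as it is not stated on this page).

```r
library(wpa)  # assumption: the package providing flag_extreme() and sq_data

# "text" returns the diagnostic as a string, so it can be stored or logged
msg <- flag_extreme(sq_data, "Email_hours", threshold = 15, return = "text")
is.character(msg)

# "table" returns a data frame, so the flagged persons can be processed further
tbl <- flag_extreme(sq_data, "Email_hours", threshold = 15, return = "table")
nrow(tbl)
```

The default, "message", only prints to the console, so "text" or "table" is the better choice when the result needs to be kept.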

Examples

# The threshold values are intentionally set low to trigger messages.
flag_extreme(sq_data, "Email_hours", threshold = 15)
#> [Warning] There are 48 persons where their average Email hours exceeds 15.

# Return a summary table
flag_extreme(sq_data, "Email_hours", threshold = 15, return = "table")
#> # A tibble: 48 × 2
#>    PersonId                                                         Email_hours
#>    <chr>                                                                  <dbl>
#>  1 00368A5686BF3E6F5189540B9D1434DCDAB931A025E1DA3F73388C45A23B8814        15.3
#>  2 07C28A72D1A47C05D6929A0D724FD0A4C81D945306DBAEF494199E96094469CD        18.3
#>  3 0B865E56B2EF3182F39F73337D67B5384E048674C9C86622EAB501BB81201B33        16.4
#>  4 10F20CF8E76802FCCF9C780A130D2A37874BAEAE1B1892365C9DB42E79C1D18A        17.2
#>  5 1449C12A92FD38EBA80D9CC037C464C515EE47EBC535ADD67E52D5561FADED4E        15.1
#>  6 1B069D86D80625D075798F32D1E67ADCD6156C659AF4BF2B62A47D34EB00138E        15.1
#>  7 2087D9D11ECC9B33440A36A702043B7F81913BA9572B3053F38DAA9C5FC52EDB        15.6
#>  8 2F76D6705EB4FA3B0138C5D8041A4A0708187C0A425343EC7FE7FCB46520D60A        16.4
#>  9 38B55F76510F34DEE6ECA960A04501866278A3105AF30A7A67F5F8A608EB3957        15.1
#> 10 3C2F9AF5D2AF19504D5DB24A78C19EDCA7990C8EEE00EEE239553E6E4A604B32        15.6
#> # ℹ 38 more rows

# Person-week level
flag_extreme(sq_data, "Email_hours", person = FALSE, threshold = 15)
#> [Warning] There are 616 rows where their value of Email hours exceeds 15.

# Check for values equal to threshold
flag_extreme(sq_data, "Email_hours", person = TRUE, mode = "equal", threshold = 0)
#> [Pass] There are no persons where their average Email hours are equal to 0.

# Check for values below threshold
flag_extreme(sq_data, "Email_hours", person = TRUE, mode = "below", threshold = 5)
#> [Warning] There are 7 persons where their average Email hours are less than 5.