Used as part of data validation to check whether a selected metric in the dataset contains extreme values relative to a threshold.

Usage

flag_extreme(
  data,
  metric,
  person = TRUE,
  threshold,
  mode = "above",
  return = "message"
)

Arguments

data

A Standard Person Query dataset in the form of a data frame.

metric

A character string specifying the metric to test.

person

A logical value specifying whether to calculate person-averages before checking against the threshold. Defaults to TRUE (person-averages calculated). If FALSE, the check is applied to each person-week row.
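
To illustrate the difference between the two levels, here is a minimal base R sketch. It uses a made-up data frame (`toy`, not part of the package) and assumes a person-average is a plain mean of the metric per PersonId; it is an illustration of the idea, not the function's actual implementation.

```r
# Hypothetical toy data: one row per person per week (not package data)
toy <- data.frame(
  PersonId = c("A", "A", "B", "B"),
  Email_hours = c(10, 20, 2, 4)
)

# Person-level view: mean of the metric per PersonId (assumed behaviour)
person_avg <- aggregate(Email_hours ~ PersonId, data = toy, FUN = mean)

# Count values above a threshold of 15 at each level
n_rows_over <- sum(toy$Email_hours > 15)            # person-week rows
n_persons_over <- sum(person_avg$Email_hours > 15)  # person averages
```

In this toy data one person-week row exceeds 15, but no person-average does, so the two settings of person can lead to different diagnostics for the same threshold.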

threshold

Numeric value specifying the threshold for flagging.

mode

String determining the mode to use for identifying extreme values. Must be one of:

  • "above": checks whether value is greater than the threshold (default)

  • "equal": checks whether value is equal to the threshold

  • "below": checks whether value is below the threshold
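
The three modes can be pictured as three comparisons against the threshold. The base R sketch below uses a hypothetical helper, `is_extreme()`, which is not part of the package, and assumes strict comparisons for "above" and "below" as the descriptions suggest.

```r
# Hypothetical helper, for illustration only (not part of the package)
is_extreme <- function(x, threshold, mode = c("above", "equal", "below")) {
  mode <- match.arg(mode)  # defaults to "above" when mode is not supplied
  switch(mode,
    above = x > threshold,
    equal = x == threshold,
    below = x < threshold
  )
}

values <- c(0, 5, 15, 20)
is_extreme(values, 15, "above")  # only 20 is flagged
is_extreme(values, 15, "equal")  # only 15 is flagged
is_extreme(values, 15, "below")  # 0 and 5 are flagged
```

Under this assumption, a value exactly equal to the threshold is flagged only by "equal".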

return

String specifying what to return. This must be one of the following strings:

  • "text"

  • "message"

  • "table"

See Value for more information.

Value

A different output is returned depending on the value passed to the return argument:

  • "text": string. A diagnostic message.

  • "message": message on console. A diagnostic message.

  • "table": data frame. A person-level table with PersonId and the extreme values of the selected metric.

Examples

# The threshold values are intentionally set low to trigger messages.
flag_extreme(pq_data, "Email_hours", threshold = 15)
#> [Pass] There are no persons where their average Email hours exceeds 15.

# Return a summary table
flag_extreme(pq_data, "Email_hours", threshold = 15, return = "table")
#> # A tibble: 0 × 2
#> # ℹ 2 variables: PersonId <chr>, Email_hours <dbl>

# Person-week level
flag_extreme(pq_data, "Email_hours", person = FALSE, threshold = 15)
#> [Pass] There are no rows where their value of Email hours exceeds 15.

# Check for values equal to threshold
flag_extreme(pq_data, "Email_hours", person = TRUE, mode = "equal", threshold = 0)
#> [Pass] There are no persons where their average Email hours are equal to 0.

# Check for values below threshold
flag_extreme(pq_data, "Email_hours", person = TRUE, mode = "below", threshold = 5)
#> [Warning] There are 100 persons where their average Email hours are less than 5.