Analyzes the distribution of a selected metric. Returns a faceted histogram by default. Use the return argument to get the summary table, the person-level data, or the underlying frequency tables instead.

Usage

create_hist(
  data,
  metric,
  hrvar = "Organization",
  mingroup = 5,
  binwidth = 1,
  ncol = NULL,
  return = "plot"
)

Arguments

data

A Standard Person Query dataset in the form of a data frame.

metric

String containing the name of the metric, e.g. "Collaboration_hours"

hrvar

String containing the name of the HR Variable by which to split metrics. Defaults to "Organization". To run the analysis on the total instead of splitting by an HR attribute, supply NULL (without quotes).

mingroup

Numeric value setting the privacy threshold / minimum group size. Defaults to 5.

binwidth

Numeric value passed as the binwidth argument to ggplot2::geom_histogram(). Defaults to 1.

ncol

Numeric value setting the number of columns on the plot. Defaults to NULL (automatic).

return

String specifying what to return. This must be one of the following strings:

  • "plot"

  • "table"

  • "data"

  • "frequency"

See Value for more information.
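To make the mingroup argument concrete, here is a minimal base R sketch of a minimum-group-size filter. It assumes that groups with fewer distinct people than the threshold are dropped before plotting; the toy data, the PersonId column, and the filtering logic are illustrative assumptions, not the function's actual implementation.

```r
# Toy person-level data (made-up values; column names are assumptions)
toy <- data.frame(
  PersonId = sprintf("P%02d", 1:12),
  Organization = c(rep("Finance", 6), rep("HR", 4), rep("Sales", 2)),
  Collaboration_hours = c(10, 12, 15, 9, 20, 14, 11, 13, 8, 17, 25, 30)
)

mingroup <- 5

# Count distinct people in each group
group_sizes <- tapply(
  toy$PersonId,
  toy$Organization,
  function(x) length(unique(x))
)

# Keep only the groups that meet the privacy threshold
keep <- names(group_sizes)[group_sizes >= mingroup]
filtered <- toy[toy$Organization %in% keep, ]

print(keep)
```

With this toy data only "Finance" has at least 5 people, so the other two groups would not appear in the output.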

Value

A different output is returned depending on the value passed to the return argument:

  • "plot": 'ggplot' object. A faceted histogram for the metric.

  • "table": data frame. A summary table for the metric.

  • "data": data frame. Data with calculated person averages.

  • "frequency": list of data frames. Each data frame contains the frequencies used in each panel of the plotted histogram.
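As a rough illustration of what a list of per-panel frequency tables can look like, the base R sketch below bins a toy metric by a fixed binwidth and counts values per group. The bin edges, the column names (bin, n), and the toy data are assumptions; the edges chosen by ggplot2::geom_histogram() and the columns returned by create_hist() may differ.

```r
# Toy data (made-up values)
toy <- data.frame(
  Organization = c(rep("Finance", 6), rep("HR", 5)),
  Collaboration_hours = c(10, 12, 15, 9, 20, 14, 11, 13, 8, 17, 6)
)

binwidth <- 5

# Shared bin edges aligned to multiples of binwidth (an illustrative choice)
rng <- range(toy$Collaboration_hours)
breaks <- seq(
  floor(rng[1] / binwidth) * binwidth,
  ceiling(rng[2] / binwidth) * binwidth,
  by = binwidth
)

# One frequency table per group, analogous to one table per histogram panel
freq_list <- lapply(
  split(toy$Collaboration_hours, toy$Organization),
  function(x) {
    bins <- cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE)
    as.data.frame(table(bin = bins), responseName = "n")
  }
)

print(freq_list)
```

Using the same breaks for every group keeps the panels comparable, which is the point of a faceted histogram.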

Examples

# Return plot for whole organization
create_hist(pq_data, metric = "Collaboration_hours", hrvar = NULL)


# Return plot split by Organization
create_hist(pq_data, metric = "Collaboration_hours", hrvar = "Organization")


# Return plot with the panels arranged in 3 columns
create_hist(pq_data, metric = "Collaboration_hours", hrvar = "Organization", ncol = 3)


# Return summary table
create_hist(pq_data, metric = "Collaboration_hours", hrvar = "Organization", return = "table")
#> # A tibble: 4 × 6
#>   group                mean median   max   min Employee_Count
#>   <chr>               <dbl>  <dbl> <dbl> <dbl>          <int>
#> 1 Finance              16.7   13.4  54.5  8.83             27
#> 2 HR                   17.8   12.3 119.   8.99             21
#> 3 Product              11.7   10.8  32.5  8.13             21
#> 4 Sales and Marketing  25.8   13.3 119.   7.08             31
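The columns of the summary table above can be approximated in base R. The sketch below assumes a two-step calculation: average the metric per person first (as the "data" return value suggests), then summarize those person averages by group. The toy data and the PersonId column are hypothetical, and this is not the package's own code.

```r
# Toy data: two observations per person (made-up values)
toy <- data.frame(
  PersonId = rep(c("A", "B", "C", "D"), each = 2),
  Organization = rep(c("Finance", "Finance", "HR", "HR"), each = 2),
  Collaboration_hours = c(10, 14, 20, 22, 8, 12, 16, 18)
)

# Step 1: person averages of the metric
person_avg <- aggregate(
  Collaboration_hours ~ PersonId + Organization,
  data = toy,
  FUN = mean
)

# Step 2: summarize the person averages by group
summary_tbl <- do.call(
  rbind,
  lapply(split(person_avg, person_avg$Organization), function(d) {
    data.frame(
      group = as.character(d$Organization[1]),
      mean = mean(d$Collaboration_hours),
      median = median(d$Collaboration_hours),
      max = max(d$Collaboration_hours),
      min = min(d$Collaboration_hours),
      Employee_Count = length(unique(d$PersonId))
    )
  })
)

print(summary_tbl)
```

Averaging per person first means each employee contributes one value to the group statistics, regardless of how many rows they have in the data.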