R/subject_scan.R
subject_scan.Rd
This function generates a matrix of the most frequently occurring words in meeting subject lines, grouped by a specified attribute such as an organisational attribute, day of the week, or hour of the day.
subject_scan(
data,
hrvar,
mode = NULL,
top_n = 10,
token = "words",
return = "plot",
weight = NULL,
stopwords = NULL,
...
)
tm_scan(
data,
hrvar,
mode = NULL,
top_n = 10,
token = "words",
return = "plot",
weight = NULL,
stopwords = NULL,
...
)
A Meeting Query dataset in the form of a data frame.
String containing the name of the HR Variable by which to split metrics. Note that the prefix 'Organizer_' or equivalent will be required.
String specifying what variable to use for grouping subject words. Valid values include:
"hours"
"days"
NULL (defaults to hrvar)
When the value passed to mode is not NULL, the value passed to hrvar is discarded and overridden by the setting specified in mode.
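The precedence between mode and hrvar described above can be sketched in a few lines of base R. The helper resolve_grouping() is hypothetical and not part of the package; it only mirrors the documented rule and returns a label for the grouping that would apply.

```r
# Hypothetical helper (not a package function): mirrors the documented
# rule that a non-NULL `mode` takes precedence over `hrvar`.
resolve_grouping <- function(hrvar = NULL, mode = NULL) {
  if (is.null(mode)) {
    return(hrvar)  # no mode given: group by the HR variable
  }
  # match.arg() raises an error for anything other than the documented values
  match.arg(mode, choices = c("hours", "days"))
}

resolve_grouping(hrvar = "Organizer_Organization")
resolve_grouping(hrvar = "Organizer_Organization", mode = "days")
```

In the second call the hrvar value is ignored, which is why the grouped-by-hours and grouped-by-days examples can leave hrvar out.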
Numeric value specifying the top number of words to show.
A character vector accepting either "words" or "ngrams", determining the type of tokenisation to return.
String specifying what to return. This must be one of the following strings:
"plot"
"table"
"data"
See Value for more information.
String specifying the column name of a numeric variable for weighting data, such as "Invitees". The column must contain positive integers. Defaults to NULL, where no weighting is applied.
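One way to picture weighting by a positive integer column is to count each meeting once per unit of weight. The base R sketch below does this on made-up data. It is an assumption about the intent of the argument, not code taken from the package, and the Subject and Invitees values are invented for illustration.

```r
# Made-up meeting data; `Invitees` plays the role of the weight column
meetings <- data.frame(
  Subject  = c("budget review", "team lunch"),
  Invitees = c(3L, 1L),
  stringsAsFactors = FALSE
)

# Repeat each row as many times as its weight (assumed interpretation)
idx <- rep(seq_len(nrow(meetings)), times = meetings$Invitees)
weighted <- meetings[idx, ]

# Subjects from larger meetings now contribute more to the word counts
table(weighted$Subject)
```

Under this reading, a weight of zero or a non-integer weight has no sensible meaning, which would explain the requirement for positive integers.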
A character vector or a single-column data frame labelled 'word' containing custom stopwords to remove.
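The two accepted shapes for custom stopwords can be built as follows. The filtering step is a base R illustration on made-up tokens, not the package's internal code; it only shows that both shapes carry the same information.

```r
# Custom stopwords as a character vector
sw_vec <- c("weekly", "meeting", "update")

# The same stopwords as a single-column data frame labelled 'word'
sw_df <- data.frame(word = sw_vec, stringsAsFactors = FALSE)

# Made-up tokens standing in for cleaned subject-line words
tokens <- c("weekly", "budget", "review", "meeting", "budget")

# Either shape removes the same tokens
kept_vec <- tokens[!tokens %in% sw_vec]
kept_df  <- tokens[!tokens %in% sw_df$word]
kept_vec
```

Either sw_vec or sw_df could then be supplied as the stopwords argument.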
Additional parameters to pass to tm_clean().
A different output is returned depending on the value passed to the return argument:
"plot": 'ggplot' object. A heatmapped grid.
"table": data frame. A summary table for the metric.
"data": data frame.
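To make the shape of the "table" output concrete, the sketch below builds a comparable table in base R: one column per group and one row per rank, each cell holding a word. The data and group names are made up, and this is an illustration of the layout rather than the package's implementation.

```r
# Made-up tokenised subject lines: one row per (group, word) occurrence
toy <- data.frame(
  group = c("Finance", "Finance", "Finance", "Sales", "Sales", "Sales"),
  word  = c("budget", "budget", "review", "pipeline", "review", "pipeline"),
  stringsAsFactors = FALSE
)

top_n <- 2

# For each group, count the words and keep the `top_n` most frequent
top_words <- lapply(split(toy$word, toy$group), function(w) {
  counts <- sort(table(w), decreasing = TRUE)
  names(counts)[seq_len(min(top_n, length(counts)))]
})

# One column per group, one row per rank
top_table <- as.data.frame(top_words, stringsAsFactors = FALSE)
top_table
```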
# return a heatmap table for words
mt_data %>% subject_scan(hrvar = "Organizer_Organization")
# return a heatmap table for ngrams
mt_data %>%
  subject_scan(
    hrvar = "Organizer_Organization",
    token = "ngrams",
    n = 2
  )
# return raw table format
mt_data %>% subject_scan(hrvar = "Organizer_Organization", return = "table")
#> # A tibble: 10 × 16
#> Biz D…¹ CEO Custo…² Facil…³ Finan…⁴ Finan…⁵ Finan…⁶ Finan…⁷ Finan…⁸ G&A C…⁹
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 update annu… updated update weekly weekly weekly volome… weekly volome…
#> 2 project board update updated updated transi… meeting weekly interv… update
#> 3 review disc… meeting interv… meeting visit updated update updated discus…
#> 4 weekly netw… interv… team interv… review volome… interv… messag… enterp…
#> 5 meeting nick plan volo volome… update plan project staff report
#> 6 plan plan volome… weekly project updates test confer… update updated
#> 7 status pred… todd recurr… update interv… update discus… volome… weekly
#> 8 updated repo… transi… review review lunch visit product review product
#> 9 visit spar… visit sales traini… messag… volo traini… direct board
#> 10 volome… stra… apple testing chris service market… visit extrac… confer…
#> # … with 6 more variables: `G&A East` <chr>, `G&A South` <chr>,
#> # `Human Resources` <chr>, `IT-Corporate` <chr>, `IT-East` <chr>,
#> # `Inventory Management` <chr>, and abbreviated variable names ¹`Biz Dev`,
#> # ²`Customer Service`, ³Facilities, ⁴`Finance-Corporate`, ⁵`Finance-East`,
#> # ⁶`Finance-South`, ⁷`Finance-West`, ⁸`Financial Planning`, ⁹`G&A Central`
# grouped by hours
mt_data %>% subject_scan(mode = "hours")
# grouped by days
mt_data %>% subject_scan(mode = "days")