Anonymise a categorical variable by replacing values

Anonymise categorical variables, such as HR variables, by replacing their values with dummy team names such as 'Team A'. By default, each unique value is replaced one to one, so the grouping structure is preserved. Setting scramble = TRUE instead randomises the values in the categorical variable.

anonymise(x, scramble = FALSE, replacement = NULL)

anonymize(x, scramble = FALSE, replacement = NULL)

Arguments

x: Character vector containing the categorical variable to anonymise.
scramble: Logical value determining whether to randomise values in the categorical variable. Defaults to FALSE, where values are replaced one to one.
replacement: Character vector containing the values that replace the original values in the categorical variable. Its length must be at least the number of unique values in the original variable. Defaults to NULL, in which case the replacements are "Team A", "Team B", etc.

Examples

unique(anonymise(sq_data$Organization))
#>  [1] "Team A" "Team B" "Team C" "Team D" "Team E" "Team F" "Team G" "Team H"
#>  [9] "Team I" "Team J" "Team K" "Team L" "Team M" "Team N" "Team O"

# Supply custom replacement values; note that `replacement` is an
# argument of anonymise(), not of unique()
rep <- c("Manager+", "Manager", "IC")
unique(anonymise(sq_data$Layer, replacement = rep))
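
The one-to-one replacement described above can be illustrated without the package or its sample data. The following is only a minimal sketch in base R: anonymise_sketch() and the org vector are hypothetical names introduced here, and the code is not the package's actual implementation. It assumes that replacement labels are assigned in order of first appearance and leaves out the scramble option.

```r
# Hypothetical helper for illustration only; NOT the package's implementation.
# Assumption: labels are assigned to unique values in order of first appearance.
anonymise_sketch <- function(x, replacement = NULL) {
  original <- unique(x)
  n_unique <- length(original)

  if (is.null(replacement)) {
    # Default dummy labels; LETTERS only covers up to 26 unique values
    replacement <- paste("Team", LETTERS[seq_len(n_unique)])
  }

  # The replacement vector must cover every unique value
  stopifnot(length(replacement) >= n_unique)

  # match() gives each element's position among the unique values,
  # which is then used to look up the corresponding label
  replacement[match(x, original)]
}

# Hypothetical input vector
org <- c("Sales", "HR", "Sales", "Finance", "HR")

anonymise_sketch(org)
#> [1] "Team A" "Team B" "Team A" "Team C" "Team B"

anonymise_sketch(org, replacement = c("X", "Y", "Z"))
#> [1] "X" "Y" "X" "Z" "Y"
```

Because identical inputs always receive the same label, group sizes and group membership stay intact after anonymisation.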
