mutations.Rmd
{datamations} offers limited support for mutations in a pipeline. It can show a single mutation involving multiple variables, displayed as a scatterplot or as a grid.
To set up some example mutations, we first add two new columns to the small_salary data.
library(dplyr)
library(datamations)
small_salary <- dplyr::mutate(
  small_salary,
  # uniform draws between 60 and 110
  supplementalIncome = runif(nrow(small_salary), min = 60, max = 110),
  # log-normal draws (mean 0 and sd 1 on the log scale)
  logNorm = rlnorm(nrow(small_salary), meanlog = 0, sdlog = 1)
)
{datamations} can visualize mutations to help one understand mathematical distributions, scales, and relationships.
"small_salary %>%
mutate(logged = log10(logNorm)) %>%
group_by(Degree) %>%
summarize(mean = mean(logged))" %>%
datamation_sanddance()
#> Warning: Returning more (or less) than 1 row per `summarise()` group was deprecated in
#> dplyr 1.1.0.
#> ℹ Please use `reframe()` instead.
#> ℹ When switching from `summarise()` to `reframe()`, remember that `reframe()`
#> always returns an ungrouped data frame and adjust accordingly.
#> ℹ The deprecated feature was likely used in the datamations package.
#> Please report the issue to the authors.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
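To make the distribution point concrete, here is a minimal base R sketch of what the log10 mutation above does to log-normal data. It does not call {datamations}; the sample size `n`, the seed, and the names `log_norm`, `logged`, and `expected_sd` are assumptions chosen for illustration.

```r
# Base R only; no datamations call is made here.
set.seed(42)  # arbitrary seed so the draws are repeatable

n <- 10000                                     # hypothetical sample size
log_norm <- rlnorm(n, meanlog = 0, sdlog = 1)  # same parameters as logNorm above
logged <- log10(log_norm)                      # the mutation used in the pipeline

# log10(x) = log(x) / log(10), so `logged` is normally distributed
# with mean 0 and standard deviation 1 / log(10).
expected_sd <- 1 / log(10)

# Compare the sample statistics with the theoretical standard deviation.
c(mean = mean(logged), sd = sd(logged), expected_sd = expected_sd)
```

The sample mean should sit close to zero and the sample standard deviation close to `expected_sd`, which is why the logged values look symmetric while the raw logNorm values are skewed.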
"small_salary %>%
mutate(salarySquared = Salary^2) %>%
group_by(Degree) %>%
summarize(mean = mean(salarySquared))" %>%
datamation_sanddance()
{datamations} can also showcase the relationship between more than one variable in your data pipelines. The example below adds the existing Salary variable to the new supplementalIncome variable; the resulting column can then be used in grouping, filtering, and summarization.
"small_salary %>%
mutate(totalIncome = Salary + supplementalIncome) %>%
group_by(Degree) %>%
summarize(mean = mean(totalIncome))" %>%
datamation_sanddance()
{datamations} currently supports only a single mutation per pipeline. A mutate statement that defines two variables is not visualized: as the output below shows, it results in an error asking you to keep only the mutation needed for the visualization.
"small_salary %>%
mutate(totalIncome = Salary + supplementalIncome, squaredIncome = Salary^2) %>%
group_by(Degree) %>%
summarize(mean = mean(totalIncome))" %>%
datamation_sanddance()
#> Error in generate_mapping(data_states, tidy_function_args, plot_mapping): Datamations currently only supports a single mutation call for visualization. Edit your pipeline to only include a single mutation necessary for the visualization.
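One possible way to stay within this limit, sketched here as an assumption rather than documented behaviour, is to compute any extra column ahead of time with a regular `dplyr::mutate()` call (the same pattern used for supplementalIncome at the start of this article) and keep a single mutation inside the pipeline string. The sketch assumes that columns added outside the string do not count toward the single-mutation limit.

```r
library(dplyr)
library(datamations)

# Precompute the extra variable outside the datamation pipeline,
# mirroring how supplementalIncome was added earlier.
small_salary <- dplyr::mutate(
  small_salary,
  supplementalIncome = runif(nrow(small_salary), min = 60, max = 110),
  squaredIncome = Salary^2
)

# Only one variable is defined inside the pipeline's mutate().
"small_salary %>%
  mutate(totalIncome = Salary + supplementalIncome) %>%
  group_by(Degree) %>%
  summarize(mean = mean(totalIncome))" %>%
  datamation_sanddance()
```

The precomputed squaredIncome column is available in the data but is not animated as a mutation step; only totalIncome is.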