Tools for finding heterogeneous treatment effects (and means) by partitioning the covariate/feature space with full cross-cuts, fitted via greedy search. A typical use is analyzing an experiment to find high-level subgroups (a coarse partition that is useful to humans) that differ in their estimated treatment effects.

Details

This package is inspired by, and uses ideas from, Causal Tree, but aims to produce a more interpretable partition with better accuracy. It is slower, though for high-level partitions this is usually not an issue.

Subgroups are constructed as a grid over the features/covariates X. For example, a single feature ranging from 0 to 1 may be split at values c1 and c2, resulting in segments [0, c1], (c1, c2], (c2, 1]. A split at value c separates observations with x <= c from those with x > c. The segments may be of uneven sizes. Splits along several features result in a grid formed as the Cartesian product of the feature-specific splits. Not all features will necessarily be split, or split the same number of times.
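
The grid construction described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the split points are fixed by hand rather than found by greedy search, the data are simulated, and all variable names (splits_x1, cell, cell_effect, ...) are hypothetical.

```r
# Toy data: two features on [0, 1], a binary treatment d, and an outcome y
# whose treatment effect is 2 when x1 > 0.5 and 1 otherwise.
set.seed(1)
n  <- 400
x1 <- runif(n)
x2 <- runif(n)
d  <- rep(c(0, 1), length.out = n)
y  <- d * ifelse(x1 > 0.5, 2, 1) + rnorm(n, sd = 0.1)

# Hand-picked, feature-specific split points (x1 split twice, x2 once).
splits_x1 <- c(0.3, 0.5)
splits_x2 <- c(0.6)

# cut() with right = TRUE gives intervals closed on the right, so a split
# at c separates x <= c from x > c.
seg_x1 <- cut(x1, breaks = c(-Inf, splits_x1, Inf), right = TRUE)
seg_x2 <- cut(x2, breaks = c(-Inf, splits_x2, Inf), right = TRUE)

# The grid is the Cartesian product of the per-feature segments.
cell <- interaction(seg_x1, seg_x2, drop = FALSE)

# Treated-minus-control difference in mean outcomes within each cell.
cell_effect <- sapply(split(data.frame(y = y, d = d), cell), function(df) {
  mean(df$y[df$d == 1]) - mean(df$y[df$d == 0])
})
```

Here three segments for x1 and two for x2 give six cells, even though the treatment effect only varies along x1. This is what a full cross-cut implies: every split extends across the whole feature space.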

The main entry point is fit_estimate_partition.

Randomization: This package is designed so that it can be run with no randomness. With default/simple parameters the following steps use randomization, but each can be overridden.

  • Generating train/est splits. Can be overridden by providing tr_split

  • Generating trtr/trcv splits. Can be overridden by providing cv_folds

  • Bumping samples. Can be overridden by providing a list of samples for bump_samples

  • Estimation plans: The provided ones (lm_est(lasso=TRUE, ...) and grid_rf) use cv_folds. User-made ones should too.
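
As a rough sketch of how such inputs might be prepared ahead of time, the base R code below builds a train/estimation indicator, fold assignments, and resampled index sets under a fixed seed. The exact formats that tr_split, cv_folds and bump_samples accept are not stated here, so the object shapes and names below (in_train, fold_id, bump_idx) are assumptions; bumping is taken to use bootstrap resamples. Check the help page for fit_estimate_partition before passing anything like this in.

```r
n <- 100  # number of observations (illustrative)

# Fix the seed once so every draw below is reproducible.
set.seed(42)

# Train/estimation split: a logical vector marking the training rows.
in_train <- sample(rep(c(TRUE, FALSE), length.out = n))

# Fold ids 1..k for cross-validation within the training sample.
k <- 5
n_train <- sum(in_train)
fold_id <- sample(rep(seq_len(k), length.out = n_train))

# Index sets drawn with replacement from the training rows, one per bump.
n_bump <- 3
bump_idx <- lapply(seq_len(n_bump), function(b) sample(n_train, replace = TRUE))
```

Building these objects once and reusing them across runs keeps results comparable when other parameters change.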