AggregateBalanceMeasure
These measures look at the distribution of records across all value combinations of the sensitive feature columns. For example, if sex and race are specified as sensitive features, the API tries to quantify the imbalance across all combinations of the specified features (e.g., [Male, Black], [Female, White], [Male, Asian Pacific Islander]).
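To make "combinations of sensitive features" concrete, the following is a minimal sketch using plain pandas rather than the library itself. The toy data and the variable names (`combo_counts`, `combo_props`) are assumptions made for illustration. Note that `groupby` only reports combinations that actually occur in the data; whether the library also accounts for unobserved combinations is not stated here.

```python
import pandas as pd

# Toy data; the rows are illustrative only and not taken from any real dataset.
df = pd.DataFrame(
    {
        "sex": ["Male", "Male", "Female", "Female", "Male", "Female"],
        "race": ["Black", "Black", "White", "White", "Asian Pacific Islander", "White"],
    }
)

sensitive_cols = ["sex", "race"]

# Number of records for each observed combination of the sensitive features.
combo_counts = df.groupby(sensitive_cols).size()

# Share of the dataset that falls into each combination.
combo_props = combo_counts / combo_counts.sum()

print(combo_props)
```

An aggregate measure then summarizes how far these per-combination shares are from being equal.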
Measure | Description | Interpretation |
---|---|---|
Atkinson Index | The Atkinson index presents the … | Range … |
Theil T Index | | If everyone has the same income, … |
Theil L Index | GE(0) = Theil L, which is more … | Same interpretation as … |
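As a rough illustration of the three measures, the sketch below implements the textbook definitions of the Theil T, Theil L, and Atkinson indices with NumPy. The function names are hypothetical and are not the library's `get_*` functions. The `epsilon` parameter and its default of 1 are assumptions, since the library's inequality-aversion setting is not given here. All inputs must be strictly positive.

```python
import numpy as np


def theil_t_index(values):
    """Theil T, i.e. GE(1): mean of (x / mu) * ln(x / mu)."""
    x = np.asarray(values, dtype=float)
    ratio = x / x.mean()
    return float(np.mean(ratio * np.log(ratio)))


def theil_l_index(values):
    """Theil L, i.e. GE(0): mean of ln(mu / x)."""
    x = np.asarray(values, dtype=float)
    return float(np.mean(np.log(x.mean() / x)))


def atkinson_index(values, epsilon=1.0):
    """Atkinson index for inequality-aversion parameter epsilon (assumed default: 1)."""
    x = np.asarray(values, dtype=float)
    mean = x.mean()
    if epsilon == 1.0:
        # For epsilon = 1 the index is 1 - (geometric mean / arithmetic mean).
        geometric_mean = np.exp(np.mean(np.log(x)))
        return float(1.0 - geometric_mean / mean)
    power_mean = np.mean(x ** (1.0 - epsilon)) ** (1.0 / (1.0 - epsilon))
    return float(1.0 - power_mean / mean)


# Shares of records per feature combination (illustrative values).
balanced = [0.25, 0.25, 0.25, 0.25]
skewed = [0.70, 0.10, 0.10, 0.10]

for name, fn in [
    ("atkinson", atkinson_index),
    ("theil_l", theil_l_index),
    ("theil_t", theil_t_index),
]:
    print(name, fn(balanced), fn(skewed))
```

Under these definitions, all three measures are zero for a perfectly balanced distribution and grow as the distribution becomes more skewed. With `epsilon = 1`, the Atkinson index equals `1 - exp(-theil_l)`.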
- class raimitigations.databalanceanalysis.aggregate_measures.AggregateBalanceMeasure(sensitive_cols: List[str])
Bases:
BalanceMeasure
- AGGREGATE_METRICS: Dict[Measures, Callable[[array], float]] = {Measures.ATKINSON_INDEX: <function get_atkinson_index>, Measures.THEIL_L_INDEX: <function get_theil_l_index>, Measures.THEIL_T_INDEX: <function get_theil_t_index>}
- measures(df: DataFrame) → DataFrame
- The output is a dataframe that maps the name of each aggregate measure to its value. The following measures are computed:
Atkinson Index - https://en.wikipedia.org/wiki/Atkinson_index
Theil Index (L and T) - https://en.wikipedia.org/wiki/Theil_index
- Parameters
df (pd.DataFrame) – the dataframe on which to calculate the aggregate measures
- Returns
a dataframe with two columns: the first holds the name of each aggregate measure and the second holds the corresponding value.
- Return type
pd.DataFrame
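A minimal usage sketch based only on the signatures listed above. The helper `aggregate_balance` and the toy dataframe are hypothetical and exist purely for illustration. The import is placed inside the helper so that the snippet can be loaded even where `raimitigations` is not installed.

```python
import pandas as pd


def aggregate_balance(df, sensitive_cols):
    """Hypothetical helper: compute the aggregate balance measures for df."""
    # Import path taken from the class signature documented above.
    from raimitigations.databalanceanalysis.aggregate_measures import (
        AggregateBalanceMeasure,
    )

    measure = AggregateBalanceMeasure(sensitive_cols=sensitive_cols)
    # measures() takes the dataframe and returns a pd.DataFrame of results.
    return measure.measures(df)


# Toy data; the rows are illustrative only.
toy_df = pd.DataFrame(
    {
        "sex": ["Male", "Male", "Female", "Female", "Male", "Female"],
        "race": ["Black", "Black", "White", "White", "Asian Pacific Islander", "White"],
    }
)
```

With the package installed, `aggregate_balance(toy_df, ["sex", "race"])` should return the dataframe of Atkinson and Theil values described above.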