FeatureBalanceMeasure

Feature balance measures indicate whether each combination of sensitive feature values receives the positive outcome (true prediction) with balanced probability. Many of these metrics were influenced by the paper Measuring Model Biases in the Absence of Ground Truth (Osman Aka, Ken Burke, Alex Bäuerle, Christina Greer, Margaret Mitchell).

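+------------------+--------------+-------------------------------+-----------------------------+
| Association      | Family       | Description                   | Interpretation /            |
| Metric           |              |                               | Formula                     |
+==================+==============+===============================+=============================+
| Statistical      | Fairness     | Each segment of a protected   | Parity increases with       |
| parity           |              | class (e.g., gender) should   | proximity to 0.             |
|                  |              | receive the positive outcome  |                             |
|                  |              | at equal rates.               | DP = P(Y|A=“Male”) -        |
|                  |              |                               | P(Y|A=“Female”)             |
+------------------+--------------+-------------------------------+-----------------------------+
| Pointwise mutual | Entropy      | The PMI of a pair of feature  | Range (normalized): [-1, 1] |
| information      |              | values (e.g., Gender=Male and |                             |
| (PMI),           |              | Gender=Female) quantifies the | -1 for no co-occurrences    |
| normalized PMI   |              | discrepancy between the       |                             |
|                  |              | probability of their          | 0 for co-occurrences at     |
|                  |              | coincidence under their joint | random                      |
|                  |              | distribution and under their  |                             |
|                  |              | individual distributions      | 1 for complete              |
|                  |              | (assuming independence).      | co-occurrence               |
+------------------+--------------+-------------------------------+-----------------------------+
| Sørensen-Dice    | Intersection | The SDC is used to gauge the  | Equals twice the number of  |
| coefficient      | over union   | similarity of two samples and | elements common to both     |
| (SDC)            |              | is related to the F1 score.   | sets divided by the sum of  |
|                  |              |                               | the number of elements in   |
|                  |              |                               | each set.                   |
+------------------+--------------+-------------------------------+-----------------------------+
| Jaccard index    | Intersection | Similar to SDC, the Jaccard   | Equals the size of the      |
|                  | over union   | index gauges the similarity   | intersection divided by the |
|                  |              | and diversity of sample sets. | size of the union of the    |
|                  |              |                               | sample sets.                |
+------------------+--------------+-------------------------------+-----------------------------+
| Kendall rank     | Correlation  | This is used to measure the   | High when observations have |
| correlation      | and          | ordinal association between   | a similar rank between the  |
|                  | statistical  | two measured quantities.      | two variables and low when  |
|                  | tests        |                               | observations have a         |
|                  |              |                               | dissimilar rank.            |
+------------------+--------------+-------------------------------+-----------------------------+
| Log-likelihood   | Correlation  | This metric calculates the    | If likelihoods are similar, |
| ratio            | and          | degree to which data supports | it should be close to 0.    |
|                  | statistical  | one variable versus another.  |                             |
|                  | tests        | The log-likelihood ratio      |                             |
|                  |              | gives the ratio of the        |                             |
|                  |              | probability of correctly      |                             |
|                  |              | predicting the label to the   |                             |
|                  |              | probability of incorrectly    |                             |
|                  |              | predicting it.                |                             |
+------------------+--------------+-------------------------------+-----------------------------+
| T-test           | Correlation  | The t-test is used to compare | The value that is being     |
|                  | and          | the means of two groups       | assessed for statistical    |
|                  | statistical  | (pairwise).                   | significance in the         |
|                  | tests        |                               | t-distribution.             |
+------------------+--------------+-------------------------------+-----------------------------+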
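
To make the formulas in the table concrete, the following is a minimal sketch that computes four of the metrics (statistical parity, normalized PMI, SDC and the Jaccard index) on a small made-up dataset. It uses the textbook definitions and only the Python standard library. The helper names (positive_rate, normalized_pmi, sorensen_dice, jaccard) and the toy data are assumptions for illustration, not part of raimitigations, and the library's own metric functions may compute these values differently.

```python
import math

# Toy data (made up for illustration): (gender, label) pairs, label 1 = positive outcome.
rows = [
    ("Male", 1), ("Male", 1), ("Male", 0), ("Male", 1),
    ("Female", 1), ("Female", 0), ("Female", 0), ("Female", 0),
]


def positive_rate(rows, group):
    """P(Y=1 | A=group): share of the group's rows with a positive label."""
    labels = [label for value, label in rows if value == group]
    return sum(labels) / len(labels)


def normalized_pmi(p_xy, p_x, p_y):
    """Textbook normalized PMI in [-1, 1] for two events with joint probability p_xy."""
    if p_xy == 0:
        return -1.0  # the two events never co-occur
    if p_xy == 1:
        return 1.0  # the two events always occur together
    return math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)


def sorensen_dice(a, b):
    """Twice the number of shared elements over the sum of the set sizes."""
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Size of the intersection over the size of the union."""
    return len(a & b) / len(a | b)


# Statistical parity gap: P(Y|A="Male") - P(Y|A="Female")
dp = positive_rate(rows, "Male") - positive_rate(rows, "Female")

# Row indices of the "Male" group and of the positive outcomes, as sets.
male = {i for i, (value, _) in enumerate(rows) if value == "Male"}
positive = {i for i, (_, label) in enumerate(rows) if label == 1}

n = len(rows)
npmi = normalized_pmi(len(male & positive) / n, len(male) / n, len(positive) / n)
sdc = sorensen_dice(male, positive)
ji = jaccard(male, positive)

print(f"DP={dp:.2f} NPMI={npmi:.3f} SDC={sdc:.2f} JI={ji:.2f}")
```

In this toy data the "Male" group receives the positive outcome three times out of four and the "Female" group once out of four, so the parity gap is far from 0, and the positive NPMI indicates that "Male" and the positive label co-occur more often than they would under independence.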

class raimitigations.databalanceanalysis.feature_measures.FeatureBalanceMeasure(sensitive_cols: List[str], label_col: str)

Bases: BalanceMeasure

CLASS_A = 'ClassA'
CLASS_B = 'ClassB'
FEATURE_METRICS: Dict[Measures, Callable[[float, float, float, float], float]] = {Measures.DEMOGRAPHIC_PARITY: <function get_demographic_parity>, Measures.JACCARD_INDEX: <function get_jaccard_index>, Measures.KR_CORRELATION: <function get_kr_correlation>, Measures.LOG_LIKELIHOOD: <function get_log_likelihood_ratio>, Measures.POINTWISE_MUTUAL_INFO: <function get_point_mutual>, Measures.SD_COEF: <function get_sorenson_dice>, Measures.TTEST: <function get_t_test_stat>}
OVERALL_METRICS: Dict[Tuple[Measures, Measures], Callable[[float, int], float]] = {(<Measures.TTEST_PVALUE: 'ttest_pvalue'>, <Measures.TTEST: 't_test'>): <function get_t_test_p_value>}
measures(df: DataFrame) DataFrame

The output is a dictionary that maps the sensitive column table to Pandas dataframe containing the following

This output dataframe contains a row per combination of feature values for each sensitive feature.

Parameters

df (pd.DataFrame) – the dataframe on which to calculate all of the feature balance measures

Returns

a dataframe that contains four columns: the first is the sensitive feature’s name, the second is one possible value of that sensitive feature, the third is a different possible value of that feature, and the last is a dictionary which indicates

Return type

pd.DataFrame
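
The row layout described above (one row per pair of values of each sensitive feature) can be sketched as follows. This is a dependency-free illustration under stated assumptions, not the library's implementation: it works on a list of plain dicts rather than a pd.DataFrame, the function names feature_balance_rows and positive_rate are hypothetical, the positive label is assumed to be 1, and the measures dictionary holds only a statistical parity gap under the made-up key "dp".

```python
from itertools import combinations


def positive_rate(records, col, value, label_col):
    """Share of records with record[col] == value whose label is 1 (assumed positive)."""
    labels = [r[label_col] for r in records if r[col] == value]
    return sum(labels) / len(labels)


def feature_balance_rows(records, sensitive_cols, label_col):
    """Build (feature name, class A, class B, measures) rows for every value pair."""
    rows = []
    for col in sensitive_cols:
        values = sorted({r[col] for r in records})
        # One row per unordered pair of distinct values of this sensitive feature.
        for class_a, class_b in combinations(values, 2):
            gap = (positive_rate(records, col, class_a, label_col)
                   - positive_rate(records, col, class_b, label_col))
            rows.append((col, class_a, class_b, {"dp": gap}))
    return rows


# Made-up records for illustration only.
records = [
    {"gender": "Male", "age_group": "young", "label": 1},
    {"gender": "Male", "age_group": "old", "label": 1},
    {"gender": "Male", "age_group": "old", "label": 0},
    {"gender": "Male", "age_group": "young", "label": 1},
    {"gender": "Female", "age_group": "young", "label": 1},
    {"gender": "Female", "age_group": "old", "label": 0},
    {"gender": "Female", "age_group": "young", "label": 0},
    {"gender": "Female", "age_group": "old", "label": 0},
]

result = feature_balance_rows(records, ["gender", "age_group"], "label")
for row in result:
    print(row)
```

With two sensitive columns that each have two distinct values, the sketch yields two rows. A feature with k distinct values would contribute k * (k - 1) / 2 rows.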