pe.callback.tabular.compute_wsd module
- class pe.callback.tabular.compute_wsd.ComputeWSD(priv_data, degree, num_samples=None, seed=42, filter_criterion=None)
Bases: Callback
The callback that computes the Wasserstein Distance (WSD) between the private and synthetic data.
- __call__(syn_data)
This function is called after each PE iteration and computes the multiple-way WSD between the private and synthetic data.
- Parameters:
syn_data (pe.data.Data) – The synthetic data
- Returns:
The multiple-way WSD between the private and synthetic data
- Return type:
- __init__(priv_data, degree, num_samples=None, seed=42, filter_criterion=None)
Constructor.
- Parameters:
priv_data (pe.data.Data) – The private data
degree (int) – The degree of the WSD (e.g., 2 for 2-way WSD)
num_samples (int, optional) – The number of samples to draw from both the private and synthetic data when computing the WSD, for computational efficiency. If None, all samples are used.
seed (int, optional) – The seed to use for sampling the data.
filter_criterion (dict, optional) – Only computes the metric based on samples satisfying the criterion. None means no filtering. Defaults to None.
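The interaction of num_samples, seed, and filter_criterion can be pictured with a small standalone sketch. It does not use the pe library and is not its implementation: the helper name filter_and_subsample, the row-as-dict representation, and the reading of filter_criterion as a column-to-required-value mapping are all assumptions made for illustration.

```python
import random


def filter_and_subsample(rows, num_samples=None, seed=42, filter_criterion=None):
    """Hypothetical helper (not part of pe): filter rows, then subsample.

    Assumption: filter_criterion maps a column name to the exact value a
    row must have in that column in order to be kept.
    """
    if filter_criterion is not None:
        rows = [
            row
            for row in rows
            if all(row.get(col) == val for col, val in filter_criterion.items())
        ]
    # None (or a request at least as large as the data) means "use all samples".
    if num_samples is None or num_samples >= len(rows):
        return list(rows)
    # A dedicated generator seeded with `seed` makes the draw reproducible.
    rng = random.Random(seed)
    return rng.sample(rows, num_samples)


# Toy data: ten rows with a binary label and an age column.
rows = [{"label": i % 2, "age": 20 + i} for i in range(10)]
subset = filter_and_subsample(
    rows, num_samples=3, seed=42, filter_criterion={"label": 1}
)
```

With a fixed seed, repeated calls on the same data select the same rows, so metric values stay comparable across PE iterations.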
- _compute_wsd(syn_features_df, priv_features_df)
Compute the multiple-way WSD between the synthetic and private features.
- Parameters:
syn_features_df (pandas.DataFrame) – The synthetic features DataFrame
priv_features_df (pandas.DataFrame) – The private features DataFrame
- Returns:
The multiple-way WSD
- Return type:
float
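As a rough intuition for what a per-feature WSD measures, the sketch below computes a 1-way score in plain Python. It is not the library's code: the function names, the restriction to degree 1, the equal-sample-size requirement, and the choice to average over columns are assumptions. It relies on the standard fact that the 1-D Wasserstein-1 distance between two equal-size empirical samples is the mean absolute difference of their sorted values.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # Sorting pairs the i-th smallest value of one sample with the
    # i-th smallest of the other, which is the optimal matching in 1-D.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


def mean_one_way_wsd(syn_columns, priv_columns):
    """Hypothetical aggregate: average the 1-D distance over shared columns."""
    names = sorted(set(syn_columns) & set(priv_columns))
    distances = [wasserstein_1d(syn_columns[n], priv_columns[n]) for n in names]
    return sum(distances) / len(distances)


# Toy feature columns, keyed by column name.
priv = {"age": [20, 30, 40], "income": [1.0, 2.0, 3.0]}
syn = {"age": [40, 20, 33], "income": [1.0, 2.0, 3.0]}

score = mean_one_way_wsd(syn, priv)  # age contributes 1.0, income 0.0
```

A higher degree would compare joint distributions over groups of columns, which requires a general optimal-transport solver and is outside the scope of this sketch.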
- _get_features_df(data)
Get the features DataFrame from the data.
- Parameters:
data (pe.data.Data) – The data
- Returns:
The features DataFrame
- Return type:
pandas.DataFrame