pe.population package
- class pe.population.PEPopulation(api, histogram_threshold=None, initial_variation_api_fold=0, next_variation_api_fold=1, keep_selected=False, selection_mode='sample')
Bases: Population
The default population algorithm for Private Evolution.
- __init__(api, histogram_threshold=None, initial_variation_api_fold=0, next_variation_api_fold=1, keep_selected=False, selection_mode='sample')
Constructor.
- Parameters:
api (pe.api.API) – The API object that contains the random and variation APIs
histogram_threshold (float, optional) – The threshold for clipping the histogram. None means no clipping. Defaults to None
initial_variation_api_fold (int, optional) – The number of variations to apply to the initial synthetic data. Defaults to 0
next_variation_api_fold (int, optional) – The number of variations to apply to the next synthetic data. Defaults to 1
keep_selected (bool, optional) – Whether to keep the selected data in the next synthetic data. Defaults to False
selection_mode (str, optional) – The mode for selecting the data. It should be one of the following: “sample” (random sampling proportional to the histogram), “rank” (select the top samples according to the histogram). Defaults to “sample”
- Raises:
ValueError – If next_variation_api_fold is 0 and keep_selected is False
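The documented error condition can be illustrated with a small standalone check. This is a sketch only, not the library's code: `check_population_config` is a hypothetical name, and the rationale in the comment (that the next population would have nothing to draw from) is an assumption based on the parameter descriptions above.

```python
def check_population_config(next_variation_api_fold, keep_selected):
    """Hypothetical helper mirroring the documented ValueError condition."""
    # Assumption: with zero variations and the selected data discarded,
    # there would be nothing left to form the next synthetic data.
    if next_variation_api_fold == 0 and not keep_selected:
        raise ValueError(
            "next_variation_api_fold is 0 and keep_selected is False"
        )
    return True


# Either setting on its own is accepted by this check.
ok_with_variations = check_population_config(1, False)
ok_with_kept_data = check_population_config(0, True)
```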
- _post_process_histogram(syn_data)
Post-process the histogram of the synthetic data (e.g., clipping).
- Parameters:
syn_data (pe.data.Data) – The synthetic data
- Returns:
The synthetic data with the post-processed histogram in the column pe.constant.data.POST_PROCESSED_DP_HISTOGRAM_COLUMN_NAME
- Return type:
pe.data.Data
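One plausible reading of histogram clipping is sketched below on a plain list of counts. The rule used here (subtract the threshold, then floor at zero) is an assumption for illustration, and `clip_histogram` is a hypothetical helper; the library's exact post-processing may differ.

```python
def clip_histogram(counts, threshold=None):
    """Hypothetical clipping rule: subtract the threshold and floor at zero."""
    if threshold is None:
        # None means no clipping, as documented for histogram_threshold.
        return list(counts)
    return [max(count - threshold, 0.0) for count in counts]


noisy_counts = [3.0, -0.5, 1.0, 8.0]
clipped = clip_histogram(noisy_counts, threshold=2.0)
unclipped = clip_histogram(noisy_counts)
```

Under this assumed rule, entries at or below the threshold drop to zero, so they carry no weight in a later selection step.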
- _select_data(syn_data, num_samples)
Select data from the synthetic data according to selection_mode.
- Parameters:
syn_data (pe.data.Data) – The synthetic data
num_samples (int) – The number of samples to select
- Raises:
ValueError – If the selection mode is not supported
- Returns:
The selected data
- Return type:
pe.data.Data
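The two documented selection modes can be sketched on a plain list of histogram values. This is an illustration under stated assumptions, not the library's implementation: `select_indices` and its `seed` argument are hypothetical, "sample" is assumed to draw with replacement, and the histogram is assumed to be non-negative.

```python
import random


def select_indices(histogram, num_samples, selection_mode="sample", seed=0):
    """Hypothetical helper returning the indices of the selected samples."""
    indices = range(len(histogram))
    if selection_mode == "sample":
        # Random sampling proportional to the histogram (with replacement).
        rng = random.Random(seed)
        return rng.choices(indices, weights=histogram, k=num_samples)
    if selection_mode == "rank":
        # Take the samples with the highest histogram values.
        order = sorted(indices, key=lambda i: histogram[i], reverse=True)
        return order[:num_samples]
    raise ValueError(f"Unsupported selection mode: {selection_mode}")


sampled = select_indices([0.0, 5.0, 1.0, 0.0], num_samples=10)
ranked = select_indices([0.0, 5.0, 1.0, 3.0], num_samples=2, selection_mode="rank")
```

In "sample" mode a zero-weight entry is never drawn, while "rank" mode is deterministic and always returns the same top entries.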
- initial(label_info, num_samples)
Generate the initial synthetic data.
- Parameters:
label_info (omegaconf.dictconfig.DictConfig) – The label info
num_samples (int) – The number of samples to generate
- Returns:
The initial synthetic data
- Return type:
pe.data.Data
- next(syn_data, num_samples)
Generate the next synthetic data.
- Parameters:
syn_data (pe.data.Data) – The synthetic data
num_samples (int) – The number of samples to generate
- Returns:
The next synthetic data
- Return type:
pe.data.Data
- class pe.population.Population
Bases: ABC
The abstract class that generates synthetic data.
- abstract initial(label_info, num_samples)
Generate the initial synthetic data.
- Parameters:
label_info (omegaconf.dictconfig.DictConfig) – The label info
num_samples (int) – The number of samples to generate
- abstract next(syn_data, num_samples)
Generate the next synthetic data.
- Parameters:
syn_data (pe.data.Data) – The synthetic data
num_samples (int) – The number of samples to generate
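The shape of the abstract interface can be sketched with a self-contained stand-in. Nothing below imports the library: `PopulationSketch` and `RepeatPopulation` are hypothetical names, plain lists stand in for pe.data.Data, and a plain dict with a hypothetical "name" key stands in for the DictConfig label info.

```python
from abc import ABC, abstractmethod


class PopulationSketch(ABC):
    """Stand-in for the documented interface; not the library class."""

    @abstractmethod
    def initial(self, label_info, num_samples):
        """Generate the initial synthetic data."""

    @abstractmethod
    def next(self, syn_data, num_samples):
        """Generate the next synthetic data."""


class RepeatPopulation(PopulationSketch):
    """Toy subclass that uses plain lists in place of pe.data.Data."""

    def initial(self, label_info, num_samples):
        # Fill the initial population with copies of the label name.
        return [label_info["name"]] * num_samples

    def next(self, syn_data, num_samples):
        # Cycle through the existing items until num_samples are produced.
        return [syn_data[i % len(syn_data)] for i in range(num_samples)]


population = RepeatPopulation()
first = population.initial({"name": "cat"}, 2)
second = population.next(first, 3)
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated, which is the usual reason for defining such a base class with ABC.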