pe.population package
Submodules
pe.population.pe_population module
- class pe.population.pe_population.PEPopulation(api, histogram_threshold, initial_variation_api_fold=0, next_variation_api_fold=1, keep_selected=False, selection_mode='sample')[source]
Bases:
Population
The default population algorithm for Private Evolution.
- __init__(api, histogram_threshold, initial_variation_api_fold=0, next_variation_api_fold=1, keep_selected=False, selection_mode='sample')[source]
Constructor.
- Parameters:
api (pe.api.api.API) – The API object that contains the random and variation APIs
histogram_threshold (float) – The threshold for clipping the histogram
initial_variation_api_fold (int, optional) – The number of variations to apply to the initial synthetic data, defaults to 0
next_variation_api_fold (int, optional) – The number of variations to apply to the next synthetic data, defaults to 1
keep_selected (bool, optional) – Whether to keep the selected data in the next synthetic data, defaults to False
selection_mode (str, optional) – The mode used to select the data. The only option listed is “sample” (random sampling proportional to the histogram). Defaults to “sample”
- Raises:
ValueError – If next_variation_api_fold is 0 and keep_selected is False
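The documented Raises condition can be illustrated in isolation. The sketch below is a minimal stand-in, not the library's constructor: `check_population_config` is a hypothetical helper, and the error message is invented for illustration. Presumably the combination is rejected because, with no variations and no selected samples kept, the next step would have nothing to return.

```python
def check_population_config(next_variation_api_fold: int, keep_selected: bool) -> None:
    """Hypothetical stand-in for the validation described under Raises."""
    if next_variation_api_fold == 0 and not keep_selected:
        # Message text is an assumption; only the exception type is documented.
        raise ValueError(
            "next_variation_api_fold must be positive when keep_selected is False"
        )


# The documented defaults (fold=1, keep_selected=False) are accepted.
check_population_config(next_variation_api_fold=1, keep_selected=False)
# Zero variations is accepted as long as the selected data is kept.
check_population_config(next_variation_api_fold=0, keep_selected=True)

try:
    check_population_config(next_variation_api_fold=0, keep_selected=False)
    rejected = False
except ValueError:
    rejected = True
```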
- _post_process_histogram(syn_data)[source]
Post-process the histogram of synthetic data (e.g., clipping).
- Parameters:
syn_data (pe.data.data.Data) – The synthetic data
- Returns:
The synthetic data with post-processed histogram in the column pe.constant.data.POST_PROCESSED_DP_HISTOGRAM_COLUMN_NAME
- Return type:
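As a rough illustration of what clipping a histogram with `histogram_threshold` could look like, the sketch below subtracts the threshold from each count and floors the result at zero. This rule is an assumption, since this page does not spell out the exact formula, and `clip_histogram` is a hypothetical helper that works on a plain list rather than on a `pe.data.data.Data` object.

```python
def clip_histogram(counts, histogram_threshold):
    """Assumed clipping rule: subtract the threshold, then floor at zero."""
    return [max(count - histogram_threshold, 0.0) for count in counts]


# Noisy (differentially private) counts can be fractional or negative.
noisy_counts = [5.5, 1.0, -0.5, 3.0]
clipped = clip_histogram(noisy_counts, histogram_threshold=2.0)
```

Under this assumed rule, entries at or below the threshold end up with zero weight.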
- _select_data(syn_data, num_samples)[source]
Select data from the synthetic data according to selection_mode.
- Parameters:
syn_data (pe.data.data.Data) – The synthetic data
num_samples (int) – The number of samples to select
- Raises:
ValueError – If the selection mode is not supported
- Returns:
The selected data
- Return type:
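The “sample” mode is described as random sampling proportional to the histogram. A minimal sketch of that idea using only the standard library follows. `select_proportional`, its `seed` argument, and the use of plain lists are assumptions made for illustration; the library itself operates on `pe.data.data.Data`, and whether it samples with replacement is not stated here.

```python
import random


def select_proportional(items, histogram, num_samples, selection_mode="sample", seed=0):
    """Hypothetical helper: draw items with probability proportional to histogram."""
    if selection_mode != "sample":
        # Mirrors the documented ValueError for unsupported modes.
        raise ValueError(f"Unsupported selection mode: {selection_mode}")
    if sum(histogram) <= 0:
        raise ValueError("histogram must contain at least one positive count")
    rng = random.Random(seed)
    # random.choices draws with replacement, weighted by the given counts.
    return rng.choices(items, weights=histogram, k=num_samples)


# "b" has zero weight, so it is never drawn.
selected = select_proportional(["a", "b", "c"], [3.0, 0.0, 1.0], num_samples=8)
```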
- initial(label_info, num_samples)[source]
Generate the initial synthetic data.
- Parameters:
label_info (dict) – The label info
num_samples (int) – The number of samples to generate
- Returns:
The initial synthetic data
- Return type:
- next(syn_data, num_samples)[source]
Generate the next synthetic data.
- Parameters:
syn_data (pe.data.data.Data) – The synthetic data
num_samples (int) – The number of samples to generate
- Returns:
The next synthetic data
- Return type:
pe.population.population module
- class pe.population.population.Population[source]
Bases:
ABC
The abstract class that generates synthetic data.
- abstract initial(label_info, num_samples)[source]
Generate the initial synthetic data.
- Parameters:
label_info (dict) – The label info
num_samples (int) – The number of samples to generate
- abstract next(syn_data, num_samples)[source]
Generate the next synthetic data.
- Parameters:
syn_data (pe.data.data.Data) – The synthetic data
num_samples (int) – The number of samples to generate
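To show how the abstract interface is meant to be filled in, here is a self-contained sketch. The `Population` class below is a local stand-in that mirrors the two documented abstract methods, not an import from the library. `RepeatPopulation` is a hypothetical toy subclass, plain lists stand in for `pe.data.data.Data`, and the `"name"` key in `label_info` is invented for the example.

```python
from abc import ABC, abstractmethod


class Population(ABC):
    """Local stand-in mirroring the documented abstract interface."""

    @abstractmethod
    def initial(self, label_info, num_samples):
        """Generate the initial synthetic data."""

    @abstractmethod
    def next(self, syn_data, num_samples):
        """Generate the next synthetic data."""


class RepeatPopulation(Population):
    """Hypothetical toy subclass; lists stand in for pe.data.data.Data."""

    def initial(self, label_info, num_samples):
        # "name" is an assumed key, used only for this illustration.
        return [label_info["name"]] * num_samples

    def next(self, syn_data, num_samples):
        # Keep the first num_samples items without any variation.
        return syn_data[:num_samples]


population = RepeatPopulation()
first = population.initial({"name": "cat"}, num_samples=3)
second = population.next(first, num_samples=2)

# The abstract base cannot be instantiated until both methods are implemented.
try:
    Population()
    instantiable = True
except TypeError:
    instantiable = False
```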