pe.population package

Submodules

pe.population.pe_population module

class pe.population.pe_population.PEPopulation(api, histogram_threshold, initial_variation_api_fold=0, next_variation_api_fold=1, keep_selected=False, selection_mode='sample')

Bases: Population

The default population algorithm for Private Evolution.

__init__(api, histogram_threshold, initial_variation_api_fold=0, next_variation_api_fold=1, keep_selected=False, selection_mode='sample')

Constructor.

Parameters:
  • api (pe.api.api.API) – The API object that contains the random and variation APIs

  • histogram_threshold (float) – The threshold for clipping the histogram

  • initial_variation_api_fold (int, optional) – The number of variations to apply to the initial synthetic data, defaults to 0

  • next_variation_api_fold (int, optional) – The number of variations to apply to the next synthetic data, defaults to 1

  • keep_selected (bool, optional) – Whether to keep the selected data in the next synthetic data, defaults to False

  • selection_mode (str, optional) – The mode used to select the data. It must be one of the following: “sample” (random sampling proportional to the histogram). Defaults to “sample”

Raises:

ValueError – If next_variation_api_fold is 0 and keep_selected is False
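
The documented ValueError condition can be illustrated with a minimal sketch. The helper names check_population_config and is_valid_config are hypothetical stand-ins, not part of the pe package; only the condition itself (next_variation_api_fold is 0 and keep_selected is False) comes from the reference above.

```python
def check_population_config(next_variation_api_fold: int, keep_selected: bool) -> None:
    """Hypothetical stand-in for the documented constructor check."""
    # Documented rule: with no variations and no kept samples, the
    # combination is rejected with a ValueError.
    if next_variation_api_fold == 0 and not keep_selected:
        raise ValueError(
            "next_variation_api_fold is 0 and keep_selected is False"
        )


def is_valid_config(next_variation_api_fold: int, keep_selected: bool) -> bool:
    """Return True if the combination passes the check above."""
    try:
        check_population_config(next_variation_api_fold, keep_selected)
    except ValueError:
        return False
    return True


# The defaults (fold=1, keep_selected=False) pass the check.
print(is_valid_config(1, False))
# Keeping the selected samples makes fold=0 acceptable.
print(is_valid_config(0, True))
# The rejected combination.
print(is_valid_config(0, False))
```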

_post_process_histogram(syn_data)

Post-process the histogram of the synthetic data (e.g., clipping).

Parameters:

syn_data (pe.data.data.Data) – The synthetic data

Returns:

The synthetic data, with the post-processed histogram stored in the column pe.constant.data.POST_PROCESSED_DP_HISTOGRAM_COLUMN_NAME

Return type:

pe.data.data.Data
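
The reference only states that the histogram is clipped using histogram_threshold. The following is a sketch of one common thresholding rule: subtract the threshold and floor the result at zero. Whether PEPopulation applies exactly this rule is an assumption, and clip_histogram is a hypothetical helper operating on a plain list rather than on a pe.data.data.Data object.

```python
from typing import List, Sequence


def clip_histogram(counts: Sequence[float], threshold: float) -> List[float]:
    """Assumed clipping rule: subtract the threshold, then floor at zero."""
    return [max(c - threshold, 0.0) for c in counts]


# Illustrative noisy counts; a noisy histogram can contain negative entries.
noisy_counts = [3.5, -0.5, 0.5, 5.0]
clipped = clip_histogram(noisy_counts, threshold=1.0)
print(clipped)
```

Under this rule, entries at or below the threshold end up with zero mass and therefore cannot be picked by a selection step that samples proportionally to the histogram.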

_select_data(syn_data, num_samples)

Select data from the synthetic data according to selection_mode.

Parameters:
  • syn_data (pe.data.data.Data) – The synthetic data

  • num_samples (int) – The number of samples to select

Raises:

ValueError – If the selection mode is not supported

Returns:

The selected data

Return type:

pe.data.data.Data
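
As an illustration of the “sample” mode (random sampling proportional to the histogram), here is a minimal standard-library sketch. The function sample_proportional and its seed argument are hypothetical, sampling with replacement is an assumption, and plain lists stand in for pe.data.data.Data.

```python
import random
from typing import List, Sequence


def sample_proportional(
    items: Sequence[str],
    weights: Sequence[float],
    num_samples: int,
    seed: int = 0,
) -> List[str]:
    """Draw num_samples items with probability proportional to weights."""
    if sum(weights) <= 0:
        raise ValueError("histogram has no positive mass to sample from")
    rng = random.Random(seed)
    # random.choices samples with replacement; zero-weight items are never drawn.
    return rng.choices(items, weights=weights, k=num_samples)


picked = sample_proportional(["a", "b", "c"], [2.5, 0.0, 4.0], num_samples=10)
print(picked)
```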

initial(label_info, num_samples)

Generate the initial synthetic data.

Parameters:
  • label_info (dict) – The label info

  • num_samples (int) – The number of samples to generate

Returns:

The initial synthetic data

Return type:

pe.data.data.Data

next(syn_data, num_samples)

Generate the next synthetic data.

Parameters:
  • syn_data (pe.data.data.Data) – The synthetic data

  • num_samples (int) – The number of samples to generate

Returns:

The next synthetic data

Return type:

pe.data.data.Data
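
One plausible reading of how next_variation_api_fold and keep_selected shape the next generation is sketched below. This is an assumption based on the parameter descriptions, not the library's confirmed logic: next_generation and vary are hypothetical names, strings stand in for samples, and the way num_samples is split between kept and varied samples is left out because the reference does not specify it.

```python
from typing import Callable, List, Sequence


def next_generation(
    selected: Sequence[str],
    vary: Callable[[str], str],
    next_variation_api_fold: int,
    keep_selected: bool,
) -> List[str]:
    """Combine optionally kept samples with `fold` rounds of variations."""
    # Start from the selected samples only if they are to be kept.
    out: List[str] = list(selected) if keep_selected else []
    # Each fold adds one variation of every selected sample.
    for _ in range(next_variation_api_fold):
        out.extend(vary(s) for s in selected)
    return out


def add_prime(sample: str) -> str:
    """Toy variation: mark a sample as varied."""
    return sample + "'"


varied_only = next_generation(["x", "y"], add_prime, 1, keep_selected=False)
kept_only = next_generation(["x", "y"], add_prime, 0, keep_selected=True)
empty = next_generation(["x", "y"], add_prime, 0, keep_selected=False)
print(varied_only, kept_only, empty)
```

Under this reading, a fold of 0 with keep_selected set to False would yield an empty generation, which is consistent with the constructor rejecting that combination.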

pe.population.population module

class pe.population.population.Population

Bases: ABC

The abstract base class for algorithms that generate synthetic data.

abstract initial(label_info, num_samples)

Generate the initial synthetic data.

Parameters:
  • label_info (dict) – The label info

  • num_samples (int) – The number of samples to generate

abstract next(syn_data, num_samples)

Generate the next synthetic data.

Parameters:
  • syn_data (pe.data.data.Data) – The synthetic data

  • num_samples (int) – The number of samples to generate
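
A custom population algorithm would implement both abstract methods. The sketch below uses a local stand-in for Population that mirrors the documented interface so it runs without the pe package. EchoPopulation, the "name" key in label_info, and the use of plain lists instead of pe.data.data.Data are all illustrative assumptions.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class Population(ABC):
    """Local stand-in mirroring the documented interface; not the real pe class."""

    @abstractmethod
    def initial(self, label_info: dict, num_samples: int) -> List[str]:
        """Generate the initial synthetic data."""

    @abstractmethod
    def next(self, syn_data: Sequence[str], num_samples: int) -> List[str]:
        """Generate the next synthetic data."""


class EchoPopulation(Population):
    """Toy subclass: names initial samples, then passes data through unchanged."""

    def initial(self, label_info: dict, num_samples: int) -> List[str]:
        # "name" is a hypothetical key used only for this example.
        return [f"{label_info['name']}-{i}" for i in range(num_samples)]

    def next(self, syn_data: Sequence[str], num_samples: int) -> List[str]:
        # Keep at most num_samples of the existing samples.
        return list(syn_data)[:num_samples]


population = EchoPopulation()
first = population.initial({"name": "cat"}, num_samples=3)
second = population.next(first, num_samples=2)
print(first, second)
```

Because both methods are declared abstract, instantiating the base class directly raises a TypeError; only subclasses that implement initial and next can be constructed.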