Opacus, based on PyTorch.
DP-SGD adds noise to the gradients during training to preserve the privacy of individual training data records. More noise implies better privacy, but this comes at a utility cost: the model will be less accurate. Privacy is typically evaluated in terms of (ε, δ)-DP values; roughly speaking, ε bounds the amount of information leakage, and δ bounds the probability with which this guarantee may be violated. These values do not assume any specific threat (i.e., the attacker's objective).
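For concreteness, the sketch below shows how DP-SGD training is typically set up with Opacus. It is illustrative only: the model, data, and parameter values are placeholders, not recommendations.

# Illustrative sketch: a minimal DP-SGD training setup with Opacus.
# The model, data loader, and parameter values are placeholders.
import torch
from opacus import PrivacyEngine

model = torch.nn.Linear(20, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
dataset = torch.utils.data.TensorDataset(torch.randn(1000, 20),
                                         torch.randint(0, 2, (1000,)))
data_loader = torch.utils.data.DataLoader(dataset, batch_size=64)

privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.0,   # more noise: better privacy, lower accuracy
    max_grad_norm=1.0,      # per-sample gradient clipping bound
)

criterion = torch.nn.CrossEntropyLoss()
for x, y in data_loader:
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

# (ε, δ)-DP spent so far, for a chosen δ
print(privacy_engine.get_epsilon(delta=1e-5))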
The interactive tool below gives guidance on how to select the DP-SGD parameters (e.g., amount of noise) to achieve a certain level of resilience against a specific threat: membership inference. It has been shown that, if a system is resilient against membership inference, it is also resilient against all other privacy attacks targeting individual records. This tool is based on recent research by G. Cherubin, B. Köpf, A. Paverd, S. Tople, L. Wutschitz, and S. Zanella-Béguelin, which shows that the level of privacy achieved by training with DP-SGD can be approximated via a closed-form expression.
The tool below calculates the resilience of a model against membership attacks in terms of Bayes security.
This metric indicates how much information an attacker can gain by using the model, compared to an attacker who has no access to the model and relies only on their prior knowledge.
It takes values in [0,1], where 1 (perfect security) means that the model does not provide any additional information to the attacker.
This metric can thus be seen as describing the risk of releasing a model, with respect to the chosen threat.
This tool is offered as a research prototype, and it should be used only as guidance.
The colored bands in the above calculator provide an example of how the results could be interpreted for a typical scenario. In this example, the bands are defined as follows:
Note: this interpretation is given as an example; users must determine the acceptable level of risk for their own scenarios.
The above interactive tool is based on the following assumptions:
The above analysis is based on an approximate expression for the Bayes security of DP-SGD, which we observed to work well (up to 0.015 absolute error on Bayes security) for the parameter ranges that the above tool allows. For a more accurate analysis, you may want to use a numerical privacy accountant (e.g., a PRV or PLD accountant).
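As a hedged illustration of the accountant route (not part of the calculator above), the snippet below computes ε for a fixed δ and a given DP-SGD configuration using Opacus' PRV accountant; it assumes a recent Opacus version that exposes PRVAccountant, and all parameter values are placeholders.

# Sketch: accounting for T DP-SGD steps with Opacus' PRV accountant.
# Assumes a recent Opacus release providing opacus.accountants.PRVAccountant.
from opacus.accountants import PRVAccountant

noise_multiplier = 1.0        # sigma: noise std relative to the clipping norm
sample_rate = 256 / 50_000    # expected batch size / dataset size
steps = (50_000 // 256) * 10  # e.g. 10 epochs

accountant = PRVAccountant()
for _ in range(steps):
    accountant.step(noise_multiplier=noise_multiplier, sample_rate=sample_rate)

print(accountant.get_epsilon(delta=1e-5))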
Bayes security measures the advantage of an attacker performing an attack with access to the trained model, versus an attacker who has no access to the model and relies only on their prior knowledge. This is based on the concept of advantage, which is widely used both in cryptography and in privacy-preserving ML. Bayes security can be converted to other commonly-used metrics:
True positive rate at chosen false positive rate (TPR @ FPR). One way of interpreting Bayes security is by matching it to the maximum true positive rate an attacker can achieve when aiming for a (low) false positive rate; this metric is called TPR @ FPR. Intuitively, a good attack is one that makes confident, correct predictions. The following tool shows the maximum achievable TPR for various choices of FPR, for given values of Bayes security.
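As a rough sketch of this conversion (an assumption based on reading 1 − β as a bound on the total variation distance between the attacker's two hypotheses, not necessarily the exact computation used by the tool), the achievable TPR at a fixed FPR is bounded by FPR + (1 − β):

# Hedged sketch: upper bound on TPR at a fixed FPR, assuming that for Bayes
# security beta the attacker's distinguishing advantage (TPR - FPR) is at most
# the total variation distance 1 - beta.
def max_tpr_at_fpr(bayes_security: float, fpr: float) -> float:
    return min(1.0, fpr + (1.0 - bayes_security))

for fpr in (0.001, 0.01, 0.1):
    print(f"FPR={fpr:.3f}  max TPR={max_tpr_at_fpr(0.95, fpr):.3f}")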
Attacker success rate. Another useful way to interpret a value of Bayes security is by matching it to the probability that an attack will succeed (the attacker success rate). Because Bayes security measures the relation between an attacker with and without access to the model, computing the success rate requires assuming a prior; that is, the probability that an attacker would succeed in the attack without access to the model.
Given a Bayes security value and the prior probability that a record is a member, the tool reports an upper bound on how likely the attacker is to correctly guess whether a data record is a member of the training dataset.
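The sketch below illustrates one way to derive such a bound, assuming the definition of Bayes security as the ratio between the attacker's Bayes error with and without access to the model; it is an approximation for illustration, not the tool's exact code.

# Hedged sketch: with Bayes security beta, the attacker's error with access to
# the model is at least beta times the error of the best prior-only guess.
def max_success_rate(bayes_security: float, prior_member: float) -> float:
    prior_error = min(prior_member, 1.0 - prior_member)  # best guess without the model
    return 1.0 - bayes_security * prior_error

print(max_success_rate(0.95, prior_member=0.5))  # ~0.525 for a uniform prior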
The analysis above considers membership inference, in which the attacker is assumed to have access to a complete data record and uses the model to ascertain whether that record was included in the training dataset. This is depicted graphically below:
It has been shown that resilience against membership inference implies an equivalent level of resilience against all other privacy attacks that target individual data records. However, there may be scenarios in which membership inference is not a concern. In these scenarios it may be possible to obtain stronger resilience bounds for other types of privacy attacks.
For example, in attribute inference, the attacker is assumed to have partial knowledge of a data record and uses the model to infer the remainder of the data record. This is depicted graphically below:
We developed a tool that enables quantifying resilience against attribute inference. This analysis is data-dependent, meaning that the results depend on both the DP-SGD training parameters and the specific dataset used for training. If the dataset itself is inherently resilient against attribute inference, this will be reflected in the results.
The tool is based on the Opacus implementation of DP-SGD, and is available at: https://github.com/microsoft/dpsgd-calculator.
If you have a usage question, have found a bug, or have a suggestion for improvement, please file a GitHub issue or get in touch at gcherubin [at] microsoft.com.
This calculator is based on the closed-form expression of the Bayes security metric for DP-SGD. Please cite as:
@inproceedings{cherubin2024closed,
  title     = {Closed-Form Bounds for DP-SGD against Record-level Inference},
  author    = {Cherubin, Giovanni and Köpf, Boris and Paverd, Andrew and Tople, Shruti and Wutschitz, Lukas and Zanella-Béguelin, Santiago},
  booktitle = {33rd USENIX Security Symposium (USENIX Security 24)},
  year      = {2024}
}