Skip to content

Example 11: Custom anonymization

Presidio-anonymizer can accept arbitrary operations to perform on identified entities. These operations can be passed in the form of a lambda function.

In the following example, we use fake values to perform pseudonymization. First, let's look at the operator:

from faker import Faker
from presidio_anonymizer.entities import OperatorConfig

fake = Faker()

# Create faker function (note that it has to receive a value)
def fake_name(x):
    return fake.name()


# Create custom operator for the PERSON entity
operators = {"PERSON": OperatorConfig("custom", {"lambda": fake_name})}

Full example:

from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import OperatorConfig, EngineResult, RecognizerResult
from faker import Faker


fake = Faker()

# Create faker function (note that it has to receive a value)
def fake_name(x):
    return fake.name()


# Create custom operator for the PERSON entity
operators = {"PERSON": OperatorConfig("custom", {"lambda": fake_name})}

# Analyzer output
analyzer_results = [RecognizerResult(entity_type="PERSON", start=11, end=18, score=0.8)]

text_to_anonymize = "My name is Raphael and I like to fish."

anonymizer = AnonymizerEngine()

anonymized_results = anonymizer.anonymize(
    text=text_to_anonymize, analyzer_results=analyzer_results, operators=operators
)

print(anonymized_results.text)

This is a simple example, but here are some examples for more advanced anonymization options:

  • Identify the gender and create a random value from the same gender (e.g., Laura -> Pam)
  • Identifying the date pattern and perform date shift (01-01-2020 -> 05-01-2020)
  • Identify the age and bucket by decade (89 -> 80-90)