In [ ]:
# download presidio
!pip install presidio_analyzer presidio_anonymizer
!python -m spacy download en_core_web_lg
Keeping some PII entities from being anonymized
This sample shows how to use Presidio's keep operator to leave some of the identified PII entities unchanged in the output string.
Set up imports
In [1]:
from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import RecognizerResult, OperatorConfig
Presidio Anonymizer: Keep person names
This example input contains two PII entities: a person name and a location. We configure the anonymizer to replace the location with a placeholder but keep the person name unmodified.
In [2]:
engine = AnonymizerEngine()

# Invoke the anonymize function with the text,
# analyzer results (potentially coming from presidio-analyzer)
# and 'keep' operator on <PERSON> PIIs
anonymize_result = engine.anonymize(
    text="My name is James Bond, I live in London",
    analyzer_results=[
        RecognizerResult(entity_type="PERSON", start=11, end=21, score=0.8),
        RecognizerResult(entity_type="LOCATION", start=33, end=39, score=0.8),
    ],
    operators={
        "PERSON": OperatorConfig("keep"),
        "DEFAULT": OperatorConfig("replace"),
    },
)
Result: Name unmodified, but tracked
The person name is preserved in the result text, but remains tracked in the items list.
In [3]:
anonymize_result
Out[3]:
text: My name is James Bond, I live in <LOCATION>
items:
[
    {'start': 33, 'end': 43, 'entity_type': 'LOCATION', 'text': '<LOCATION>', 'operator': 'replace'},
    {'start': 11, 'end': 21, 'entity_type': 'PERSON', 'text': 'James Bond', 'operator': 'keep'}
]
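To see why the LOCATION item ends at 43 while the PERSON item keeps its original offsets, the behaviour can be modelled in plain Python. This is only an illustrative sketch, not Presidio's implementation: the function `apply_operators`, its tuple-based span format, and the string operator names are hypothetical, and it assumes non-overlapping spans. It also returns items in left-to-right order, unlike the output shown above.

```python
# Illustrative model of 'keep' vs 'replace'; NOT Presidio's actual code.
# apply_operators and its argument shapes are hypothetical helpers.

def apply_operators(text, spans, operators):
    """Apply 'keep' or 'replace' to non-overlapping (entity_type, start, end) spans."""
    pieces, items = [], []
    cursor = 0  # position reached in the original text
    shift = 0   # how much longer or shorter the output is so far
    for entity_type, start, end in sorted(spans, key=lambda s: s[1]):
        operator = operators.get(entity_type) or operators.get("DEFAULT", "replace")
        original = text[start:end]
        # 'keep' leaves the original value; 'replace' inserts a placeholder
        new_value = original if operator == "keep" else f"<{entity_type}>"
        pieces.append(text[cursor:start])
        pieces.append(new_value)
        new_start = start + shift
        items.append({
            "start": new_start,
            "end": new_start + len(new_value),
            "entity_type": entity_type,
            "text": new_value,
            "operator": operator,
        })
        shift += len(new_value) - (end - start)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces), items


new_text, items = apply_operators(
    "My name is James Bond, I live in London",
    [("PERSON", 11, 21), ("LOCATION", 33, 39)],
    {"PERSON": "keep", "DEFAULT": "replace"},
)
print(new_text)
for item in items:
    print(item)
```

In this model the kept entity still produces an item, which mirrors the point above: the name is unmodified in the text but remains tracked. The placeholder `<LOCATION>` is 10 characters long, so the replaced span runs from 33 to 43 rather than 33 to 39.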