Presidio Revamp (aka V2)
In March 2021, Presidio was revamped into a new version, referred to as V2.
The main changes introduced in V2 are:
- gRPC replaced with HTTP to allow more customizable APIs and easier debugging.
- Focus on the Analyzer and Anonymizer services.
    - Presidio Anonymizer is now Python-based and pip-installable.
    - Presidio Analyzer no longer uses templates or an external recognizer store.
    - Image Redactor (formerly presidio-image-anonymizer) is in early beta and is Python-based and pip-installable.
    - Other services are deprecated and may be migrated to V2 over time with the help of the community.
- Improved documentation, sample code, and build workflows.
- Format-Preserving Encryption replaced with Advanced Encryption Standard (AES).
V1 Availability
Version V1 (legacy) is still available for download. To continue using the previous version:
- For Docker containers, use tag=v1.
- For Python packages, download a version < 2 (e.g. pip install presidio-analyzer==0.95).
Note
The legacy V1 code base will continue to be available under branch V1 but will no longer be officially supported.
API Changes
The move from gRPC to HTTP-based APIs included changes to the API requests.
- Changed payload format – moving from structured objects to JSON.
- Removed templates from the API, including flattening the JSON structure.
- Using snake_case instead of camelCase.
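The changes listed above amount to sending a flat, snake_case JSON body over plain HTTP. The following is a minimal sketch using only the Python standard library. The URL `http://localhost:5002/analyze` is an assumption about a local deployment and is not specified on this page, and `build_analyze_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumed address of a locally running V2 Analyzer service; adjust to your deployment.
ANALYZER_URL = "http://localhost:5002/analyze"


def build_analyze_request(text, language="en", url=ANALYZER_URL):
    """Build (but do not send) an HTTP POST carrying a flat, snake_case JSON body."""
    payload = {"text": text, "language": language}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_analyze_request("My phone number is 212-555-5555")

# Sending the request requires a running service, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     results = json.loads(response.read())
```

Because the body is ordinary JSON, the same request can be reproduced with any HTTP client, which is part of what makes debugging easier than with gRPC.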
Below is a detailed outline of all changes made to the Analyzer and Anonymizer.
Analyzer API Changes
Legacy json request (gRPC)
{
  "text": "My phone number is 212-555-5555",
  "AnalyzeTemplateId": "1234",
  "AnalyzeTemplate": {
    "Fields": [
      {
        "Name": "PHONE_NUMBER",
        "MinScore": "0.5"
      }
    ],
    "AllFields": true,
    "Description": "template description",
    "CreateTime": "template creation time",
    "ModifiedTime": "template modification time",
    "Language": "fr",
    "ResultsScoreThreshold": 0.5
  }
}
V2 json request (HTTP)
{
  "text": "My phone number is 212-555-5555",
  "entities": ["PHONE_NUMBER"],
  "language": "en",
  "correlation_id": "213",
  "score_threshold": 0.5,
  "trace": true,
  "return_decision_process": true
}
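To make the flattening concrete, here is a hedged sketch of a migration helper that turns a legacy analyzer request into the V2 shape. The function name `legacy_to_v2_analyze_request` is hypothetical, and the field mapping (`Fields[].Name` to `entities`, `Language` to `language`, `ResultsScoreThreshold` to `score_threshold`) is an assumption inferred from the two examples above, not an official conversion rule. Template metadata and the per-field `MinScore` have no counterpart in the V2 request shown here, so the sketch drops them.

```python
def legacy_to_v2_analyze_request(legacy):
    """Flatten a legacy analyzer request into the V2 shape (assumed mapping)."""
    template = legacy.get("AnalyzeTemplate", {})
    v2 = {"text": legacy["text"]}

    # Assumption: each legacy field name corresponds directly to a V2 entity name.
    entities = [field["Name"] for field in template.get("Fields", [])]
    if entities:
        v2["entities"] = entities

    # Assumption: these template keys map one-to-one onto flat V2 keys.
    if "Language" in template:
        v2["language"] = template["Language"]
    if "ResultsScoreThreshold" in template:
        v2["score_threshold"] = template["ResultsScoreThreshold"]
    return v2


legacy_request = {
    "text": "My phone number is 212-555-5555",
    "AnalyzeTemplateId": "1234",
    "AnalyzeTemplate": {
        "Fields": [{"Name": "PHONE_NUMBER", "MinScore": "0.5"}],
        "AllFields": True,
        "Language": "fr",
        "ResultsScoreThreshold": 0.5,
    },
}

v2_request = legacy_to_v2_analyze_request(legacy_request)
```

Optional V2 keys such as `correlation_id`, `trace`, and `return_decision_process` have no legacy source in this example, so the caller would add them explicitly if needed.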
Anonymizer API Changes
Legacy json request (gRPC)
{
  "text": "hello world, my name is Jane Doe. My number is: 034453334",
  "template": {
    "description": "DEPRECATED",
    "create_time": "DEPRECATED",
    "modified_time": "DEPRECATED",
    "default_transformation": {
      "replace_value": {...},
      "redact_value": {...},
      "hash_value": {...},
      "mask_value": {...},
      "fpe_value": {...}
    },
    "field_type_transformations": [
      {
        "fields": [
          {
            "name": "FIRST_NAME",
            "min_score": "0.2"
          }
        ],
        "transformation": {
          "replace_value": {...},
          "redact_value": {...},
          "hash_value": {...},
          "mask_value": {...},
          "fpe_value": {...}
        }
      }
    ],
    "analyze_results": [
      {
        "text": "Jane",
        "field": {
          "name": "FIRST_NAME",
          "min_score": "0.5"
        },
        "location": {
          "start": 24,
          "end": 28,
          "length": 4
        },
        "score": 0.8
      }
    ]
  }
}
V2 json request (HTTP)
{
  "text": "hello world, my name is Jane Doe. My number is: 034453334",
  "anonymizers": {
    "DEFAULT": {
      "type": "replace",
      "new_value": "val"
    },
    "PHONE_NUMBER": {
      "type": "mask",
      "masking_char": "*",
      "chars_to_mask": 4,
      "from_end": true
    }
  },
  "analyzer_results": [
    {
      "start": 24,
      "end": 32,
      "score": 0.8,
      "entity_type": "NAME"
    },
    {
      "start": 24,
      "end": 28,
      "score": 0.8,
      "entity_type": "FIRST_NAME"
    },
    {
      "start": 29,
      "end": 32,
      "score": 0.6,
      "entity_type": "LAST_NAME"
    },
    {
      "start": 48,
      "end": 57,
      "score": 0.95,
      "entity_type": "PHONE_NUMBER"
    }
  ]
}
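The `start` and `end` values in `analyzer_results` are character offsets into `text`. The short check below shows which substring each result selects, assuming `end` is exclusive (as in Python slicing), which is what the offsets in the example imply. The helper name `selected_spans` is hypothetical.

```python
text = "hello world, my name is Jane Doe. My number is: 034453334"

analyzer_results = [
    {"start": 24, "end": 32, "score": 0.8, "entity_type": "NAME"},
    {"start": 24, "end": 28, "score": 0.8, "entity_type": "FIRST_NAME"},
    {"start": 29, "end": 32, "score": 0.6, "entity_type": "LAST_NAME"},
    {"start": 48, "end": 57, "score": 0.95, "entity_type": "PHONE_NUMBER"},
]


def selected_spans(text, results):
    """Map each entity type to the substring its offsets select (end exclusive)."""
    return {r["entity_type"]: text[r["start"]:r["end"]] for r in results}


found = selected_spans(text, analyzer_results)
for entity_type, span in found.items():
    print(f"{entity_type}: {span!r}")
```

A check like this is a cheap way to catch off-by-one offsets before sending results to the Anonymizer.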
Specific for each anonymization type:
Anonymization name | Legacy format (V1) | New json format (V2) |
---|---|---|
Replace | string newValue = 1; | { "new_value": "VALUE" } |
Redact | NONE | NONE |
Mask | string maskingCharacter = 1; | { "masking_char": "*", "chars_to_mask": 4, "from_end": true } |
Hash | NONE | {"hash_type": "VALUE"} |
FPE (now Encrypt) | string key = 3t6w9z$C&F)J@NcR; | {"key": "3t6w9z$C&F)J@NcR"} |
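
To illustrate what the mask parameters mean, here is a small local sketch of the masking behavior. It is not Presidio's implementation. The function name `mask_value` is hypothetical, and the semantics (replace `chars_to_mask` characters with `masking_char`, counting from the end when `from_end` is true) are an assumption based on the parameter names.

```python
def mask_value(value, masking_char, chars_to_mask, from_end):
    """Replace up to chars_to_mask characters of value with masking_char."""
    # Clamp the count so it never exceeds the length of the value.
    count = max(0, min(chars_to_mask, len(value)))
    if count == 0:
        return value
    if from_end:
        return value[: len(value) - count] + masking_char * count
    return masking_char * count + value[count:]


# Parameters taken from the PHONE_NUMBER anonymizer in the V2 request example.
masked_from_end = mask_value("034453334", "*", 4, True)     # "03445****"
masked_from_start = mask_value("034453334", "*", 4, False)  # "****53334"
```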
Note
The V2 API is continuously evolving. Please follow the change log for updates.