Presidio Analyzer

The Presidio Analyzer is a Python-based service for detecting PII entities in text.

During analysis, it runs a set of different PII Recognizers, each one in charge of detecting one or more PII entities using different mechanisms.

Presidio Analyzer ships with a set of predefined recognizers and can be extended with custom ones. Both predefined and custom recognizers use regular expressions, Named Entity Recognition (NER), and other kinds of logic to detect PII in unstructured text.
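
To make the regex-based mechanism concrete, here is a minimal standard-library sketch of what a pattern-based recognizer does: it scans the text with a regular expression and reports each match as an entity type, a character span, and a confidence score. This is not Presidio's implementation. The `Detection` class, the `detect_with_pattern` function, the phone-number pattern, and the 0.4 score are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical result record: what was found, where, and how confidently."""
    entity_type: str
    start: int
    end: int
    score: float


def detect_with_pattern(text, entity_type, regex, score):
    """Return one Detection per regex match in `text` (illustrative only)."""
    return [
        Detection(entity_type, m.start(), m.end(), score)
        for m in re.finditer(regex, text)
    ]


# Assumed, deliberately simple pattern for numbers shaped like 212-555-5555.
PHONE_REGEX = r"\b\d{3}-\d{3}-\d{4}\b"

text = "My phone number is 212-555-5555"
detections = detect_with_pattern(text, "PHONE_NUMBER", PHONE_REGEX, score=0.4)
for d in detections:
    # Slice the original text with the reported span to recover the match.
    print(d.entity_type, text[d.start:d.end], d.score)
```

A fixed score per pattern is the simplest possible choice here. A real recognizer would typically combine several patterns and other signals before settling on a score.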

[Figure: Analyzer design]

Consider installing the Presidio Python packages in a virtual environment such as venv or conda.

To get started with presidio-analyzer, install the package and download the en_core_web_lg spaCy model:

pip install presidio-analyzer
python -m spacy download en_core_web_lg


This requires Docker to be installed. Download Docker.

# Download image from Dockerhub
docker pull

# Run the container with the default port
docker run -d -p 5002:3000

First, clone the Presidio repo. See here for instructions.

Then, build the presidio-analyzer container:

cd presidio-analyzer
docker build . -t presidio/presidio-analyzer

Getting started

Once the Presidio-analyzer package is installed, run this simple analysis script:

from presidio_analyzer import AnalyzerEngine

# Set up the engine; this loads the NLP module (spaCy model by default) and the other PII recognizers
analyzer = AnalyzerEngine()

# Call the analyzer to get results; `language` is a required argument
results = analyzer.analyze(text="My phone number is 212-555-5555",
                           language="en")
print(results)

You can run Presidio Analyzer as an HTTP server using either the Python runtime or a Docker container.

Using a Docker container

cd presidio-analyzer
docker run -p 5002:3000 presidio-analyzer

Using the Python runtime


This requires the Presidio GitHub repository to be cloned.

cd presidio-analyzer
curl -d '{"text":"John Smith drivers license is AC432223", "language":"en"}' -H "Content-Type: application/json" -X POST http://localhost:3000/analyze
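
The same request can be made from Python. The sketch below mirrors the curl call above using only the standard library: it builds a POST to the `/analyze` endpoint with the same JSON body and `Content-Type` header. The function names `build_analyze_request` and `analyze` and the `base_url` default are assumptions for illustration, and the shape of the response is not shown in this document, so the code returns the decoded JSON as-is.

```python
import json
import urllib.request


def build_analyze_request(text, language="en", base_url="http://localhost:3000"):
    """Build a POST request for the /analyze endpoint without sending it."""
    payload = json.dumps({"text": text, "language": language}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/analyze",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(text, language="en", base_url="http://localhost:3000"):
    """Send the request to a running analyzer and return the decoded JSON."""
    request = build_analyze_request(text, language, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request needs no running server; calling analyze() does.
request = build_analyze_request("John Smith drivers license is AC432223")
print(request.get_method(), request.full_url)
```

When the analyzer runs in the Docker container started with `-p 5002:3000`, the service is published on host port 5002, so pass `base_url="http://localhost:5002"` instead.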

Creating PII recognizers

Presidio analyzer can be easily extended to support additional PII entities. See this tutorial on adding new PII recognizers for more information.

Multi-language support

Presidio can detect PII entities in multiple languages. Refer to the multi-language support documentation for more information.

Outputting the analyzer decision process

Presidio Analyzer has a built-in mechanism for tracing each decision it makes. This is useful when you need to understand why a specific PII detection was or was not produced. For more information, see the decision process documentation.

Supported entities

For a list of the currently supported entities, see Supported entities.

API reference

See the API Spec for the Analyzer REST API reference and the Analyzer Python API for the Python API reference.


Samples illustrating the usage of the Presidio Analyzer can be found in the Python samples.