Skip to content

GPU Acceleration

Presidio supports GPU acceleration for NLP models, which can significantly improve performance when processing large volumes of text. GPU is automatically detected and used when available—no code changes required.

Prerequisites

Hardware Requirements

Hardware Requirement
NVIDIA GPU CUDA Toolkit 11.x or 12.x
Apple Silicon MPS (Metal Performance Shaders) on M1/M2/M3
CPU Automatic fallback when GPU is unavailable

Software Dependencies

Install the appropriate CUDA library matching your CUDA version:

pip install "spacy[cuda12x]"  # For CUDA 12.x
# or
pip install "spacy[cuda11x]"  # For CUDA 11.x

Silent Fallback

If cupy is not installed or the version mismatches your CUDA driver, spaCy will silently fall back to CPU. Always verify your setup using the methods below.

No additional dependencies required. MPS is detected automatically.

Verifying GPU Usage

To check if GPU is available and being used:

import torch

if torch.cuda.is_available():
    print(f"GPU available: CUDA ({torch.cuda.get_device_name(0)})")
elif hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
    print("GPU available: MPS (Apple Silicon)")
else:
    print("Using CPU")

You can also monitor GPU usage during processing:

# For NVIDIA GPUs
watch -n 1 nvidia-smi

# For Apple Silicon
sudo powermetrics --samplers gpu_power

GPU-Enabled NLP Engines

Engine GPU Support
TransformersNlpEngine ✅ Full
GLiNER ✅ Full
Stanza ✅ Full
spaCy (transformer models) ✅ Full
spaCy (standard models) ⚠️ Not recommended

Standard spaCy Models

GPU is recommended for Transformers, Stanza, and GLiNER workloads. Standard spaCy models (e.g., en_core_web_lg) may be slower on GPU due to data transfer overhead.

Troubleshooting

GPU not detected

  1. Check CUDA is installed:

    nvidia-smi
    nvcc --version
    

  2. Verify cupy can see your GPU:

    import cupy as cp
    print(cp.cuda.runtime.getDeviceCount())  # Should be > 0
    

  3. Reinstall cupy with correct CUDA version:

    pip uninstall cupy cupy-cuda11x cupy-cuda12x
    pip install cupy-cuda12x  # Match your CUDA version
    

Out of memory errors

If you see RuntimeError: CUDA out of memory:

  • Use a smaller transformer model (e.g., dslim/bert-base-NER instead of larger models)
  • Split longer texts into smaller chunks before processing
  • Reduce the maximum sequence length if using Hugging Face transformers

CPU fallback

Presidio automatically uses CPU if:

  • No GPU is detected
  • CUDA or cupy are not installed
  • GPU initialization fails

This ensures code portability across different environments.

Additional Resources

Related Presidio documentation: