Bioacoustics

PyTorchWildlife's bioacoustics module provides training, inference, and dataset preparation for audio classification. The module lives at PW_Bioacoustics/ and builds on core APIs in PytorchWildlife.data.bioacoustics and PytorchWildlife.models.bioacoustics.

What's included

  • CLI scripts for dataset preparation (prepare_dataset.py), training (train.py), and inference (inference.py)
  • ResNetClassifier — PyTorch Lightning module for spectrogram classification (binary and multiclass)
  • Mel-spectrogram preprocessing with optional GPU acceleration
  • Annotation readers (COCO-like JSON), including support for the PteroSet / Raven Pro format
  • MD_AudioBirds_V1 — a pre-trained bird classifier distributed as ONNX for direct inference

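The mel-spectrogram preprocessing listed above can be illustrated from first principles. The sketch below is a plain-NumPy stand-in, not the module's actual implementation: the filterbank construction, FFT size, hop length, and mel count are illustrative choices only.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, center, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, center):
            fb[i, k] = (k - lo) / max(center - lo, 1)
        for k in range(center, hi):
            fb[i, k] = (hi - k) / max(hi - center, 1)
    return fb

def log_mel_spectrogram(y, sr, n_fft=1024, hop=256, n_mels=64):
    # Frame the signal, apply a Hann window, take the magnitude STFT,
    # then project the power spectrum onto the mel filterbank.
    n_frames = 1 + (len(y) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack(
        [y[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel + 1e-8)  # shape: (n_frames, n_mels)

# Synthetic 1-second clip at 22.05 kHz containing a 2 kHz tone.
sr = 22050
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 2000.0 * t).astype(np.float32)
spec = log_mel_spectrogram(y, sr)
print(spec.shape)  # (83, 64)
```

In practice this per-frame work is what benefits from the module's optional GPU acceleration, since the framing, FFT, and filterbank projection are all batched matrix operations.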
See the Bioacoustics model zoo for the released models.
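Raven Pro selection tables, one of the annotation formats listed above, are tab-separated text files with one row per annotated call. A minimal stand-alone reader might look like the sketch below; the column names are Raven's defaults, but the `Species` column and the output field names are assumptions for illustration, not the module's reader API.

```python
import csv
import io

# Inline example of a Raven Pro selection table (tab-separated).
# Real tables may carry additional project-specific columns.
raven_text = (
    "Selection\tBegin Time (s)\tEnd Time (s)\tLow Freq (Hz)\tHigh Freq (Hz)\tSpecies\n"
    "1\t0.52\t1.87\t1200.0\t4800.0\tspecies_a\n"
    "2\t3.10\t4.02\t900.0\t3500.0\tspecies_b\n"
)

def read_selection_table(fh):
    # Yield one dict per annotated call, with numeric fields parsed.
    for row in csv.DictReader(fh, delimiter="\t"):
        yield {
            "begin_s": float(row["Begin Time (s)"]),
            "end_s": float(row["End Time (s)"]),
            "low_hz": float(row["Low Freq (Hz)"]),
            "high_hz": float(row["High Freq (Hz)"]),
            "label": row.get("Species", ""),
        }

calls = list(read_selection_table(io.StringIO(raven_text)))
print(len(calls), calls[0]["label"])  # 2 species_a
```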

Demo

The end-to-end notebook at PW_Bioacoustics/demo/bioacoustics_demo.ipynb walks through:

  1. Data exploration — annotation counts, species distribution
  2. Inference — run MD_AudioBirds_V1 on real recordings, visualise predictions vs. ground truth
  3. Training — building COCO-style annotations, then binary classification (target vs. noise) and multiclass classification

It uses recordings from the PteroSet dataset.
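The annotation-building step in the demo can be sketched with the standard library alone. The clip list and field names below follow the general COCO convention but are hypothetical, not the module's exact schema.

```python
import json

# Hypothetical clip list: (filename, duration in seconds, label).
clips = [
    ("site1_0001.wav", 3.0, "target"),
    ("site1_0002.wav", 3.0, "noise"),
]

# Assign stable integer ids to the label set.
categories = sorted({label for _, _, label in clips})
cat_ids = {name: i for i, name in enumerate(categories)}

# COCO-style structure: one "image" entry per clip, one annotation per clip.
coco = {
    "images": [
        {"id": i, "file_name": fn, "duration": dur}
        for i, (fn, dur, _) in enumerate(clips)
    ],
    "annotations": [
        {"id": i, "image_id": i, "category_id": cat_ids[label]}
        for i, (_, _, label) in enumerate(clips)
    ],
    "categories": [{"id": i, "name": name} for name, i in cat_ids.items()],
}
print(json.dumps(coco, indent=2)[:80])
```

Serialising the result with `json.dump` yields a file in the COCO-like shape the notebook builds before training.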

Projects using this module

  • PteroSet — Machine-learning pipeline for detecting and classifying tropical bird vocalisations from passive acoustic monitoring, with leave-one-project-out cross-validation.
  • CookInlet_Belugas — Passive acoustic monitoring for endangered Cook Inlet beluga whales. A two-stage pipeline covering cetacean signal detection and multi-species classification (beluga, humpback, killer whale), plus an active-learning loop for domain adaptation.

Install

pip install PytorchWildlife
pip install librosa soundfile pyyaml torchmetrics

See the PW_Bioacoustics README for full configuration options, training arguments, and output formats.