Bioacoustics

PyTorchWildlife's bioacoustics module provides training, inference, and dataset preparation for audio classification. The module lives at PW_Bioacoustics/ and builds on core APIs in PytorchWildlife.data.bioacoustics and PytorchWildlife.models.bioacoustics.

What's included

  • CLI scripts for dataset preparation (prepare_dataset.py), training (train.py), and inference (inference.py)
  • ResNetClassifier — PyTorch Lightning module for spectrogram classification (binary and multiclass)
  • Mel-spectrogram preprocessing with optional GPU acceleration
  • Annotation readers (COCO-like JSON), including support for the PteroSet / Raven Pro format
  • MD_AudioBirds_V1 — a pre-trained bird classifier distributed as ONNX for direct inference

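The mel-spectrogram preprocessing listed above can be illustrated from first principles. The sketch below is a plain-NumPy stand-in, not the module's actual implementation: the filterbank construction, FFT size, hop length, and mel count are illustrative choices only.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, center, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, center):
            fb[i, k] = (k - lo) / max(center - lo, 1)
        for k in range(center, hi):
            fb[i, k] = (hi - k) / max(hi - center, 1)
    return fb

def log_mel_spectrogram(y, sr, n_fft=1024, hop=256, n_mels=64):
    # Frame the signal, apply a Hann window, take the magnitude STFT,
    # then project the power spectrum onto the mel filterbank.
    n_frames = 1 + (len(y) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack(
        [y[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel + 1e-8)  # shape: (n_frames, n_mels)

# Synthetic 1-second clip at 22.05 kHz containing a 2 kHz tone.
sr = 22050
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 2000.0 * t).astype(np.float32)
spec = log_mel_spectrogram(y, sr)
print(spec.shape)  # (83, 64)
```

In practice this per-frame work is what benefits from the module's optional GPU acceleration, since the framing, FFT, and filterbank projection are all batched matrix operations.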
See the Bioacoustics model zoo for the released models.
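Raven Pro selection tables, one of the annotation formats listed above, are tab-separated text files with one row per annotated call. A minimal stand-alone reader might look like the sketch below; the column names are Raven's defaults, but the `Species` column and the output field names are assumptions for illustration, not the module's reader API.

```python
import csv
import io

# Inline example of a Raven Pro selection table (tab-separated).
# Real tables may carry additional project-specific columns.
raven_text = (
    "Selection\tBegin Time (s)\tEnd Time (s)\tLow Freq (Hz)\tHigh Freq (Hz)\tSpecies\n"
    "1\t0.52\t1.87\t1200.0\t4800.0\tspecies_a\n"
    "2\t3.10\t4.02\t900.0\t3500.0\tspecies_b\n"
)

def read_selection_table(fh):
    # Yield one dict per annotated call, with numeric fields parsed.
    for row in csv.DictReader(fh, delimiter="\t"):
        yield {
            "begin_s": float(row["Begin Time (s)"]),
            "end_s": float(row["End Time (s)"]),
            "low_hz": float(row["Low Freq (Hz)"]),
            "high_hz": float(row["High Freq (Hz)"]),
            "label": row.get("Species", ""),
        }

calls = list(read_selection_table(io.StringIO(raven_text)))
print(len(calls), calls[0]["label"])  # 2 species_a
```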

Demo

The end-to-end notebook at PW_Bioacoustics/demo/bioacoustics_demo.ipynb walks through:

  1. Data exploration — annotation counts, species distribution
  2. Inference — run MD_AudioBirds_V1 on real recordings, visualise predictions vs. ground truth
  3. Training — building COCO-style annotations, then binary classification (target vs. noise) and multiclass classification

It uses recordings from the PteroSet dataset.
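The annotation-building step in the demo can be sketched with the standard library alone. The clip list and field names below follow the general COCO convention but are hypothetical, not the module's exact schema.

```python
import json

# Hypothetical clip list: (filename, duration in seconds, label).
clips = [
    ("site1_0001.wav", 3.0, "target"),
    ("site1_0002.wav", 3.0, "noise"),
]

# Assign stable integer ids to the label set.
categories = sorted({label for _, _, label in clips})
cat_ids = {name: i for i, name in enumerate(categories)}

# COCO-style structure: one "image" entry per clip, one annotation per clip.
coco = {
    "images": [
        {"id": i, "file_name": fn, "duration": dur}
        for i, (fn, dur, _) in enumerate(clips)
    ],
    "annotations": [
        {"id": i, "image_id": i, "category_id": cat_ids[label]}
        for i, (_, _, label) in enumerate(clips)
    ],
    "categories": [{"id": i, "name": name} for name, i in cat_ids.items()],
}
print(json.dumps(coco, indent=2)[:80])
```

Serialising the result with `json.dump` yields a file in the COCO-like shape the notebook builds before training.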

Projects using this module

  • PteroSet — Machine-learning pipeline for detecting and classifying tropical bird vocalisations from passive acoustic monitoring, with leave-one-project-out cross-validation.
  • CookInlet_Belugas — Passive acoustic monitoring for endangered Cook Inlet beluga whales. A two-stage pipeline covering cetacean signal detection and multi-species classification (beluga, humpback, killer whale), plus an active-learning loop for domain adaptation.

Install

pip install PytorchWildlife
pip install librosa soundfile pyyaml torchmetrics

See the PW_Bioacoustics README for full configuration options, training arguments, and output formats.