Bioacoustics
PyTorchWildlife's bioacoustics module provides training, inference, and dataset preparation for audio classification. The module lives at PW_Bioacoustics/ and builds on core APIs in PytorchWildlife.data.bioacoustics and PytorchWildlife.models.bioacoustics.
What's included
- CLI scripts for dataset preparation (prepare_dataset.py), training (train.py), and inference (inference.py)
- ResNetClassifier — a PyTorch Lightning module for spectrogram classification (binary and multiclass)
- Mel-spectrogram preprocessing with optional GPU acceleration
- Annotation readers (COCO-like JSON), including support for the PteroSet / Raven Pro format
- MD_AudioBirds_V1 — a pre-trained bird classifier distributed as ONNX for direct inference
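The mel-spectrogram front end can be sketched in plain NumPy. Note this is an illustrative sketch only: the window size, hop length, and mel-band count below are assumed values, not the module's actual defaults, and the real preprocessing (with optional GPU acceleration) presumably operates on PyTorch tensors.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            if center > left:
                fb[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):
            if right > center:
                fb[i - 1, k] = (right - k) / (right - center)
    return fb

def log_mel_spectrogram(audio, sr=32000, n_fft=1024, hop=512, n_mels=64):
    # Frame the signal, window, FFT, take power, project onto mel
    # bands, then log-compress.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(audio) - n_fft) // hop
    frames = np.stack([audio[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel + 1e-8).T  # shape: (n_mels, n_frames)

# One second of a 2 kHz test tone in place of a field recording.
sr = 32000
t = np.arange(sr) / sr
spec = log_mel_spectrogram(np.sin(2 * np.pi * 2000.0 * t), sr=sr)
print(spec.shape)
```

The resulting (mel bands × frames) array is the kind of 2-D input a spectrogram classifier such as ResNetClassifier consumes.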
See the Bioacoustics model zoo for the released models.
Demo
The end-to-end notebook at PW_Bioacoustics/demo/bioacoustics_demo.ipynb walks through:
- Data exploration — annotation counts, species distribution
- Inference — run MD_AudioBirds_V1 on real recordings, visualise predictions vs. ground truth
- Training — build COCO-style annotations, binary classification (target vs. noise), multiclass classification
It uses recordings from the PteroSet dataset.
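The annotation-building step above can be sketched as follows. The clip file names and the exact schema fields are assumptions for illustration; the demo notebook derives its annotations from real PteroSet recordings, and the module's COCO-like reader may expect additional fields.

```python
import json

# Hypothetical audio clips with binary labels (target vocalisation
# vs. background noise). In the demo these come from PteroSet.
clips = [
    ("rec_001_000.wav", "target"),
    ("rec_001_001.wav", "noise"),
    ("rec_002_000.wav", "target"),
]

categories = [{"id": 0, "name": "noise"}, {"id": 1, "name": "target"}]
cat_id = {c["name"]: c["id"] for c in categories}

# COCO-style layout: top-level images / annotations / categories lists,
# with annotations pointing at images via image_id.
coco = {
    "images": [{"id": i, "file_name": name}
               for i, (name, _) in enumerate(clips)],
    "annotations": [{"id": i, "image_id": i, "category_id": cat_id[label]}
                    for i, (_, label) in enumerate(clips)],
    "categories": categories,
}

with open("annotations_binary.json", "w") as f:
    json.dump(coco, f, indent=2)
```

The same structure extends to the multiclass task by adding one category per species.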
Projects using this module
- PteroSet — Machine-learning pipeline for detecting and classifying tropical bird vocalisations from passive acoustic monitoring, with leave-one-project-out cross-validation.
- CookInlet_Belugas — Passive acoustic monitoring for endangered Cook Inlet beluga whales. A two-stage pipeline covering cetacean signal detection and multi-species classification (beluga, humpback, killer whale), plus an active-learning loop for domain adaptation.
Install
See the PW_Bioacoustics README for full configuration options, training arguments, and output formats.
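PytorchWildlife is distributed on PyPI; a minimal install sketch, assuming the bioacoustics module ships with the main package (check the PW_Bioacoustics README to confirm):

```shell
# Install the PytorchWildlife package from PyPI. Whether the
# bioacoustics extras require a repository checkout instead is
# an assumption to verify against the PW_Bioacoustics README.
pip install PytorchWildlife
```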