Installation

Requirements

  • Python 3.9+
  • PyTorch 2.0+
  • CUDA (optional, but recommended for spectrogram generation and training)
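
The version requirements above can be checked before installing anything else. The following is a minimal sketch, not part of MegaDetector-Acoustic: the helper names `parse_major_minor` and `check_requirements` are hypothetical, and the minimum versions are taken directly from the list above. If PyTorch is not installed yet, its entry is reported as `None` rather than raising an error.

```python
import importlib.util
import re
import sys

# Minimum versions copied from the Requirements list.
MIN_PYTHON = (3, 9)
MIN_TORCH = (2, 0)


def parse_major_minor(version: str) -> tuple:
    """Turn a version string such as '2.1.0+cu118' into (2, 1)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_requirements() -> dict:
    """Report whether the interpreter and PyTorch meet the minimum versions."""
    report = {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "torch_ok": None,  # stays None when PyTorch is not installed
    }
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch_ok"] = parse_major_minor(str(torch.__version__)) >= MIN_TORCH
    return report


print(check_requirements())
```

A `False` value for either entry means the corresponding component should be upgraded before running `pip install -r requirements.txt`.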

Install

git clone https://github.com/microsoft/MegaDetector-Acoustic
cd MegaDetector-Acoustic
pip install -r requirements.txt

This installs the following dependencies:

Package              Purpose
PytorchWildlife      Core models, datasets, and spectrogram utilities
librosa              Audio loading and feature extraction
soundfile            Audio file I/O
torchaudio           GPU-accelerated audio processing
pyyaml               YAML configuration loading
torchmetrics         Training metrics
pytorch-lightning    Training loop and checkpointing
pandas / numpy       Data manipulation
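
To confirm that each dependency was actually installed, the standard-library `importlib.metadata` module can report installed versions. This is a sketch under one assumption: the distribution names in `PACKAGES` are taken from the table above and may differ from the exact names pinned in requirements.txt. The helper `installed_versions` is hypothetical and not part of the project.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names, copied from the dependency table;
# adjust them if requirements.txt uses different spellings.
PACKAGES = [
    "PytorchWildlife",
    "librosa",
    "soundfile",
    "torchaudio",
    "pyyaml",
    "torchmetrics",
    "pytorch-lightning",
    "pandas",
    "numpy",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name:<20} {ver or 'MISSING'}")
```

Any package listed as MISSING can be installed by re-running the `pip install -r requirements.txt` step.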

Verify

from PytorchWildlife.models.bioacoustics import ResNetClassifier
print("MegaDetector-Acoustic is ready.")

GPU Setup

MegaDetector-Acoustic uses GPU acceleration for mel spectrogram generation. To check whether CUDA is available:

import torch
print(torch.cuda.is_available())  # should print True on a CUDA-enabled machine

If no CUDA device is available, spectrogram generation falls back to CPU automatically (slower for large datasets).
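
The same fallback can be mirrored in your own scripts. The sketch below shows the common PyTorch device-selection pattern. It is an illustration only, not MegaDetector-Acoustic's internal logic, and the helper name `pick_device` is hypothetical. It returns "cpu" when PyTorch is missing so that it can run in a bare environment.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so no GPU can be used
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

The returned string can be passed to `torch.device(...)` or to a tensor's or module's `.to(...)` method.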

Next Steps

  • Run the demo notebook for an end-to-end walkthrough
  • Copy template.yaml as a starting point for your domain configuration
  • See the README for full CLI usage