# Brainsmith

## Compile Neural Networks to FPGA Accelerators
Brainsmith is an end-to-end compiler that transforms ONNX models into dataflow accelerators for FPGAs. Through design space exploration, it evaluates candidate hardware configurations to find the one best suited to your use case.
Automated RTL generation from ONNX models. Design space exploration to identify optimal configurations.
## Key Features

- **Automatic Design Space Exploration**: Navigate parallelization factors, resource allocation, and architectural choices. Explore multiple configurations to identify promising designs.
- **Schema-Driven Kernel Development**: Define hardware semantics declaratively. Validation, design space construction, and interface generation are derived from schema definitions.
- **Synthesizable RTL Generation**: Generate Verilog/VHDL with standard AXI-Stream interfaces. Compatible with Vivado IP Integrator workflows.
- **Growing Kernel Library**: Built-in support for MatMul, LayerNorm, Softmax, and other common operations. Extensible architecture for adding custom kernels.
- **Performance Estimation**: Resource estimation, cycle-accurate simulation support, and throughput analysis. Evaluate design tradeoffs before synthesis.
- **Multi-Layer Offload**: Scale to large models with constant FPGA resources. Stream weights from external memory to process arbitrarily deep networks without increasing the hardware footprint.
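The performance-estimation idea can be illustrated with a toy first-order model. This is an illustrative sketch with made-up numbers, not Brainsmith's actual estimator; the function names and the 64-MACs-per-cycle figure are assumptions for the example only:

```python
# Toy first-order performance model (illustrative, not the Brainsmith estimator).
def estimate_cycles(macs: int, parallelism: int) -> int:
    """Cycles to drain one layer, assuming one fully pipelined pass
    with `parallelism` multiply-accumulates per clock cycle."""
    return -(-macs // parallelism)  # ceiling division

def throughput_fps(bottleneck_cycles: int, clock_ns: float) -> float:
    """Inferences per second when the slowest layer dominates the pipeline."""
    return 1e9 / (bottleneck_cycles * clock_ns)

# Example: a 768x768 matrix-vector product at 64 MACs/cycle, 5 ns clock (200 MHz).
cycles = estimate_cycles(768 * 768, 64)
print(cycles, "cycles,", round(throughput_fps(cycles, 5.0)), "inferences/s")
```

Doubling the parallelism halves the cycle count but roughly doubles the compute resources, which is exactly the kind of tradeoff the design space exploration weighs automatically.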
## Basic Usage

Generate an accelerator with a single command:

```shell
# Design space exploration and RTL generation
smith model.onnx blueprint.yaml

# Output: RTL + performance estimates + resource reports
```
## Example: BERT Accelerator

```yaml
# blueprint.yaml - Define your design space
name: "BERT Accelerator"
clock_ns: 5.0  # 200 MHz target

design_space:
  kernels:
    - MVAU       # Matrix-vector operations
    - LayerNorm  # Layer normalization
    - Softmax    # Attention softmax
  steps:
    - "streamline"          # Graph optimization
    - "infer_kernels"       # Hardware kernel mapping
    - "specialize_layers"   # Backend selection
    - "dataflow_partition"  # Multi-layer offload
```
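Since the blueprint is plain YAML, its structure can be sanity-checked with ordinary tooling before launching a long run. The sketch below mirrors the blueprint as a Python dict and applies some basic checks; the validation rules are illustrative assumptions, not part of the `smith` CLI:

```python
# Minimal structural check for a blueprint, mirrored as a Python dict.
# The rules below are illustrative; they are not part of the smith CLI.
def validate_blueprint(bp: dict) -> list:
    """Return a list of problems; an empty list means the blueprint looks sane."""
    problems = []
    if bp.get("clock_ns", 0) <= 0:
        problems.append("clock_ns must be a positive period in nanoseconds")
    space = bp.get("design_space", {})
    if not space.get("kernels"):
        problems.append("design_space.kernels must list at least one kernel")
    if not space.get("steps"):
        problems.append("design_space.steps must list the compile steps")
    return problems

bp = {
    "name": "BERT Accelerator",
    "clock_ns": 5.0,  # 200 MHz target
    "design_space": {
        "kernels": ["MVAU", "LayerNorm", "Softmax"],
        "steps": ["streamline", "infer_kernels",
                  "specialize_layers", "dataflow_partition"],
    },
}
assert validate_blueprint(bp) == []
print(f"{1e3 / bp['clock_ns']:.0f} MHz target")
```

Catching a missing kernel list or an impossible clock period here is much cheaper than discovering it partway through design space exploration.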
Run design space exploration with the `smith` command shown above. Results include:

- Synthesizable RTL in `results/stitched_ip/`
- Performance estimates in `results/report/estimate_reports.json`
- Detailed build logs for debugging
The example targets the V80 platform using Vivado 2024.2 and is compatible with Xilinx Zynq/UltraScale+ platforms.
See `examples/bert` for the full implementation.
## Open Source & Collaborative
Brainsmith is MIT-licensed and builds upon a foundation of proven open-source tools:
- FINN - Dataflow compiler for quantized neural networks
- QONNX - Quantized ONNX representation
- Brevitas - PyTorch quantization library
Brainsmith extends FINN with automated design space exploration, blueprint inheritance, and a schema-driven kernel system. FINN provides the low-level RTL generation and QONNX transformations.
Developed through collaboration between Microsoft and AMD.
License: MIT - see LICENSE
## Community & Support
- Feature Roadmap - See what's planned and in progress
- GitHub Issues - Report bugs or request features
- GitHub Discussions - Ask questions and share experiences

