
Brainsmith

Compile Neural Networks to FPGA Accelerators

Brainsmith is an end-to-end compiler that transforms ONNX models into dataflow accelerators for FPGAs. Through design space exploration, it evaluates candidate hardware configurations to find the one best suited to your use case.

[Diagram: ONNX model → ONNX graph structure → dataflow core → generated dataflow accelerator]

Automated RTL generation from ONNX models. Design space exploration to identify optimal configurations.


Key Features

  • Automatic Design Space Exploration


    Navigate parallelization factors, resource allocation, and architectural choices. Explore multiple configurations to identify promising designs.

  • Schema-Driven Kernel Development


    Define hardware semantics declaratively. Validation, design space construction, and interface generation are derived from schema definitions.

  • Synthesizable RTL Generation


    Generate Verilog/VHDL with standard AXI-Stream interfaces. Compatible with Vivado IP Integrator workflows.

  • Growing Kernel Library


    Built-in support for MatMul, LayerNorm, Softmax, and other common operations. Extensible architecture for adding custom kernels.

  • Performance Estimation


    Resource estimation, cycle-accurate simulation support, and throughput analysis. Evaluate design tradeoffs before synthesis.

  • Multi-Layer Offload


    Scale to large models with constant FPGA resources. Stream weights from external memory to process arbitrarily deep networks without increasing hardware footprint.
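The performance-estimation feature above boils down to simple design-point arithmetic: a clock period and a cycle count per inference determine latency and throughput. The sketch below is purely illustrative back-of-envelope math, not Brainsmith's actual API; the function name and numbers are made up for the example.

```python
# Illustrative design-point arithmetic (NOT Brainsmith's API): given a
# clock period and a cycle count per inference, derive the latency and
# throughput that a performance-estimation step would report.

def throughput_estimate(clock_ns: float, cycles_per_inference: int) -> dict:
    """Return latency (us) and throughput (inferences/s) for one design point."""
    latency_us = clock_ns * cycles_per_inference / 1000.0
    inferences_per_s = 1e9 / (clock_ns * cycles_per_inference)
    return {"latency_us": latency_us, "inferences_per_s": inferences_per_s}

# A 200 MHz design (5 ns clock) needing 100_000 cycles per inference:
est = throughput_estimate(5.0, 100_000)
print(est)  # 500 us latency, 2000 inferences/s
```

Comparing this number across candidate configurations is what lets the explorer rank designs before committing to synthesis.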

Basic Usage

Generate an accelerator with a single command:

# Design space exploration and RTL generation
smith model.onnx blueprint.yaml

# Output: RTL + performance estimates + resource reports

Example: BERT Accelerator

# blueprint.yaml - Define your design space
name: "BERT Accelerator"
clock_ns: 5.0  # 200MHz target

design_space:
  kernels:
    - MVAU           # Matrix-vector operations
    - LayerNorm      # Layer normalization
    - Softmax        # Attention softmax

  steps:
    - "streamline"           # Graph optimization
    - "infer_kernels"        # Hardware kernel mapping
    - "specialize_layers"    # Backend selection
    - "dataflow_partition"   # Multi-layer offload

Run design space exploration:

smith bert.onnx blueprint.yaml --output-dir ./results

Results include:

  • Synthesizable RTL in results/stitched_ip/
  • Performance estimates in results/report/estimate_reports.json
  • Detailed build logs for debugging
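The estimate report is plain JSON, so it can be post-processed with a few lines of Python. The path below comes from the output layout above, but the JSON structure (a mapping of layer names to metrics) is an assumption for illustration; adjust the keys to whatever your build actually emits.

```python
# Hypothetical post-processing of the exported estimates. The report path
# matches the docs above; the layer -> metrics layout is an assumption.
import json

def load_estimates(report_path: str) -> dict:
    """Load the estimate report JSON produced by a build."""
    with open(report_path) as f:
        return json.load(f)

if __name__ == "__main__":
    reports = load_estimates("results/report/estimate_reports.json")
    for layer, metrics in sorted(reports.items()):
        print(f"{layer}: {metrics}")
```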

The example targets the V80 platform using Vivado 2024.2 and is also compatible with Xilinx Zynq and UltraScale+ platforms.

See examples/bert for the full implementation.

Open Source & Collaborative

Brainsmith is MIT-licensed and builds upon a foundation of proven open-source tools:

  • FINN - Dataflow compiler for quantized neural networks
  • QONNX - Quantized ONNX representation
  • Brevitas - PyTorch quantization library

Brainsmith extends FINN with automated design space exploration, blueprint inheritance, and a schema-driven kernel system. FINN provides the low-level RTL generation and QONNX transformations.

Developed through collaboration between Microsoft and AMD.

License: MIT - see LICENSE

Community & Support