# Brainsmith

## Compile Neural Networks to FPGA Accelerators
Brainsmith is an end-to-end compiler that transforms ONNX models into dataflow accelerators for FPGAs. Through design space exploration, it evaluates candidate hardware configurations to find the one best suited to your use case.
Automated RTL generation from ONNX models. Design space exploration to identify optimal configurations.
## Key Features

- **Automatic Design Space Exploration**: Navigate parallelization factors, resource allocation, and architectural choices. Explore multiple configurations to identify promising designs.
- **Schema-Driven Kernel Development**: Define hardware semantics declaratively. Validation, design space construction, and interface generation are derived from schema definitions.
- **Synthesizable RTL Generation**: Generate Verilog/VHDL with standard AXI-Stream interfaces. Compatible with Vivado IP Integrator workflows.
- **Growing Kernel Library**: Built-in support for MatMul, LayerNorm, Softmax, and other common operations. Extensible architecture for adding custom kernels.
- **Performance Estimation**: Resource estimation, cycle-accurate simulation support, and throughput analysis. Evaluate design tradeoffs before synthesis.
- **Multi-Layer Offload**: Scale to large models with constant FPGA resources. Stream weights from external memory to process arbitrarily deep networks without increasing the hardware footprint.
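The performance-estimation idea can be illustrated with a toy first-order model. This is an illustrative sketch with made-up numbers, not Brainsmith's actual estimator; the function names and the 64-MACs-per-cycle figure are assumptions for the example only:

```python
# Toy first-order performance model (illustrative, not the Brainsmith estimator).
def estimate_cycles(macs: int, parallelism: int) -> int:
    """Cycles to drain one layer, assuming one fully pipelined pass
    with `parallelism` multiply-accumulates per clock cycle."""
    return -(-macs // parallelism)  # ceiling division

def throughput_fps(bottleneck_cycles: int, clock_ns: float) -> float:
    """Inferences per second when the slowest layer dominates the pipeline."""
    return 1e9 / (bottleneck_cycles * clock_ns)

# Example: a 768x768 matrix-vector product at 64 MACs/cycle, 5 ns clock (200 MHz).
cycles = estimate_cycles(768 * 768, 64)
print(cycles, "cycles,", round(throughput_fps(cycles, 5.0)), "inferences/s")
```

Doubling the parallelism halves the cycle count but roughly doubles the compute resources, which is exactly the kind of tradeoff the design space exploration weighs automatically.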
## Basic Usage

Generate an accelerator with a single command:

```shell
# Design space exploration and RTL generation
smith model.onnx blueprint.yaml

# Output: RTL + performance estimates + resource reports
```
## Example: BERT Accelerator

```yaml
# blueprint.yaml - Define your design space
name: "BERT Accelerator"
clock_ns: 5.0  # 200 MHz target

design_space:
  kernels:
    - MVAU       # Matrix-vector operations
    - LayerNorm  # Layer normalization
    - Softmax    # Attention softmax
  steps:
    - "streamline"          # Graph optimization
    - "infer_kernels"       # Hardware kernel mapping
    - "specialize_layers"   # Backend selection
    - "dataflow_partition"  # Multi-layer offload
```
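Since the blueprint is plain YAML, its structure can be sanity-checked with ordinary tooling before launching a long run. The sketch below mirrors the blueprint as a Python dict and applies some basic checks; the validation rules are illustrative assumptions, not part of the `smith` CLI:

```python
# Minimal structural check for a blueprint, mirrored as a Python dict.
# The rules below are illustrative; they are not part of the smith CLI.
def validate_blueprint(bp: dict) -> list:
    """Return a list of problems; an empty list means the blueprint looks sane."""
    problems = []
    if bp.get("clock_ns", 0) <= 0:
        problems.append("clock_ns must be a positive period in nanoseconds")
    space = bp.get("design_space", {})
    if not space.get("kernels"):
        problems.append("design_space.kernels must list at least one kernel")
    if not space.get("steps"):
        problems.append("design_space.steps must list the compile steps")
    return problems

bp = {
    "name": "BERT Accelerator",
    "clock_ns": 5.0,  # 200 MHz target
    "design_space": {
        "kernels": ["MVAU", "LayerNorm", "Softmax"],
        "steps": ["streamline", "infer_kernels",
                  "specialize_layers", "dataflow_partition"],
    },
}
assert validate_blueprint(bp) == []
print(f"{1e3 / bp['clock_ns']:.0f} MHz target")
```

Catching a missing kernel list or an impossible clock period here is much cheaper than discovering it partway through design space exploration.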
Run design space exploration with the `smith` command shown above. Results include:

- Synthesizable RTL in `results/stitched_ip/`
- Performance estimates in `results/report/estimate_reports.json`
- Detailed build logs for debugging
The example targets the V80 platform using Vivado 2024.2 and is compatible with Xilinx Zynq/UltraScale+ platforms.
See `examples/bert` for the full implementation.
## Open Source & Collaborative
Brainsmith is MIT-licensed and builds upon a foundation of proven open-source tools:
- FINN - Dataflow compiler for quantized neural networks
- QONNX - Quantized ONNX representation
- Brevitas - PyTorch quantization library
Brainsmith extends FINN with automated design space exploration, blueprint inheritance, and a schema-driven kernel system. FINN provides the low-level RTL generation and QONNX transformations.
Developed through collaboration between Microsoft and AMD.
License: MIT - see LICENSE
## Community & Support
- Feature Roadmap - See what's planned and in progress
- GitHub Issues - Report bugs or request features
- GitHub Discussions - Ask questions and share experiences

