FPGA-Whale: FPGA-Friendly NanoWhale Architecture

This repository contains an FPGA-friendly reimplementation of the HuggingFaceTB/nanowhale-100m-base model, designed for deployment on Xilinx FPGAs.

Architecture Changes

The changes below follow prior work on FPGA deployment (QFX, arXiv:2401.17544; BitNet b1.58 Reloaded, arXiv:2407.09527):

- Removed MoE → Dense SwiGLU FFN (no sparse routing, deterministic memory access)
- Removed Hyper-Connections → Standard residual connections (simpler dataflow)
- BitLinear layers → Ternary weights {-1, 0, +1} with INT8 activations
  - Multiplication-free inference on FPGA (ternary × int8 = add/sub/nop)
  - Quantization-aware training with straight-through estimator
- LayerNorm instead of RMSNorm → More FPGA-friendly normalization
- Kept MLA → Low-rank Q projection for KV-cache efficiency
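
The BitLinear idea above can be illustrated with a small dependency-free sketch. It assumes absmean scaling for the weights and absmax scaling for the activations, as commonly described for BitNet b1.58; the quantizer actually used in this repository may differ, and the function names (`quantize_ternary`, `quantize_int8`, `ternary_matvec`) are hypothetical. The straight-through estimator is not shown, since it only matters during training.

```python
def quantize_ternary(weights, eps=1e-8):
    """Absmean ternary quantization (assumed scheme): returns (ternary, scale)."""
    flat = [abs(w) for row in weights for w in row]
    scale = sum(flat) / len(flat) + eps
    # Round to the nearest integer (Python rounds ties to even), then clip to {-1, 0, +1}.
    ternary = [[max(-1, min(1, round(w / scale))) for w in row] for row in weights]
    return ternary, scale


def quantize_int8(xs, eps=1e-8):
    """Absmax INT8 quantization (assumed scheme): returns (int8 values, scale)."""
    scale = max(abs(x) for x in xs) / 127.0 + eps
    q = [max(-128, min(127, round(x / scale))) for x in xs]
    return q, scale


def ternary_matvec(ternary, x_q):
    """Integer matrix-vector product using only add, subtract, and skip."""
    out = []
    for row in ternary:
        acc = 0
        for w, x in zip(row, x_q):
            if w == 1:
                acc += x      # +1 weight: accumulate the activation
            elif w == -1:
                acc -= x      # -1 weight: subtract the activation
            # 0 weight: no operation
        out.append(acc)
    return out


if __name__ == "__main__":
    w_t, w_scale = quantize_ternary([[0.9, -0.05, -1.2], [0.0, 0.4, 0.6]])
    x_q, x_scale = quantize_int8([1.0, -0.5, 0.25])
    acc = ternary_matvec(w_t, x_q)
    # Rescale the integer accumulators back to floating point.
    print([a * w_scale * x_scale for a in acc])
```

The inner loop never multiplies, which is what makes the layer attractive for FPGA fabric: each weight selects between adding, subtracting, or skipping an INT8 activation, and the two scales are applied once per output.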

Training

The model is trained by knowledge distillation from the original NanoWhale teacher model, combined with quantization-aware training (QAT), on the FineWeb-Edu dataset.

python train.py \
  --teacher_model HuggingFaceTB/nanowhale-100m-base \
  --hub_model_id hakatu/fpga-whale-100m \
  --distill_alpha 0.7 \
  --temperature 2.0 \
  --num_train_samples 50000 \
  --max_seq_length 512 \
  --bf16
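
To make the roles of `--distill_alpha` and `--temperature` concrete, here is a minimal sketch of a standard distillation objective for a single token position. It assumes the common formulation in which alpha weights the temperature-softened KL term and (1 - alpha) weights the hard-label cross-entropy; how `train.py` actually combines the terms is not documented here, so treat the weighting and the helper names as assumptions.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_z for s in scaled]


def distillation_loss(student_logits, teacher_logits, target,
                      alpha=0.7, temperature=2.0):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE(student, target)."""
    s_logp = log_softmax(student_logits, temperature)
    t_logp = log_softmax(teacher_logits, temperature)
    # Soft term: KL divergence between the softened distributions.
    kl = sum(math.exp(t) * (t - s) for t, s in zip(t_logp, s_logp))
    # Hard term: cross-entropy against the ground-truth token at temperature 1.
    ce = -log_softmax(student_logits)[target]
    # The T^2 factor keeps soft-term gradients comparable across temperatures.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


if __name__ == "__main__":
    print(distillation_loss([2.0, 0.5, -1.0], [1.5, 1.0, -0.5], target=0))
```

When the student matches the teacher exactly, the KL term vanishes and only the cross-entropy term, scaled by (1 - alpha), remains.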

FPGA Deployment Path

1. Train this model with QAT to convergence
2. Extract ternary weights and INT8 activation scales
3. Export to ONNX / hls4ml format
4. Synthesize with Xilinx Vitis HLS / Vivado
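
As an illustration of the weight-extraction step, the sketch below packs ternary weights into 2-bit codes, four per byte, which is one way to store them compactly before handing them to a hardware flow. The encoding (2-bit two's complement: 00 for 0, 01 for +1, 11 for -1) and the function names are hypothetical; they are not a format required by ONNX, hls4ml, or Vitis HLS.

```python
def pack_ternary(weights):
    """Pack ternary weights into bytes, four 2-bit codes per byte, lowest bits first."""
    packed = bytearray()
    for i in range(0, len(weights), 4):
        byte = 0
        for j, w in enumerate(weights[i:i + 4]):
            if w not in (-1, 0, 1):
                raise ValueError(f"not a ternary weight: {w}")
            # 2-bit two's complement: 0 -> 0b00, +1 -> 0b01, -1 -> 0b11
            byte |= (w & 0b11) << (2 * j)
        packed.append(byte)
    return bytes(packed)


def unpack_ternary(packed, count):
    """Recover `count` ternary weights from the packed byte string."""
    out = []
    for i in range(count):
        code = (packed[i // 4] >> (2 * (i % 4))) & 0b11
        # Sign-extend the 2-bit code back to a Python integer.
        out.append(code - 4 if code & 0b10 else code)
    return out


if __name__ == "__main__":
    weights = [1, 0, -1, -1, 1]
    blob = pack_ternary(weights)
    print(blob.hex(), unpack_ternary(blob, len(weights)))
```

The weight count has to be stored alongside the packed bytes, because the last byte is zero-padded when the number of weights is not a multiple of four.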

References

Dai et al., "Trainable Fixed-Point Quantization for Deep Learning Acceleration on FPGAs", arXiv:2401.17544
Nielsen & Schneider-Kamp, "BitNet b1.58 Reloaded: State-of-the-art Performance Also on Smaller Networks", arXiv:2407.09527
NanoWhale: https://huggingface.co/HuggingFaceTB/nanowhale-100m-base
