This is our model for the 100M-word track of the 2024 BabyLM Challenge.

To download and use this model, the fla (flash-linear-attention) package must be installed:

pip install -U git+https://github.com/sustcsonglin/flash-linear-attention
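
Once the package is installed, loading the model might look like the minimal sketch below. It assumes that importing `fla` registers the HGRN2 architecture with `transformers`, so that the standard `AutoTokenizer` and `AutoModelForCausalLM` classes can resolve the checkpoint. The helper names `load_model` and `generate` are hypothetical and used only for illustration. The fla kernels may also require a CUDA-capable GPU, which this sketch does not handle.

```python
MODEL_ID = "PatrickHaller/hgrn2_pile_100m_distill_babylm"


def load_model(model_id: str = MODEL_ID):
    """Download the tokenizer and model from the Hugging Face Hub."""
    # Imports are deferred so the heavy dependencies load only when needed.
    # Assumption: importing fla registers its model classes with transformers.
    import fla  # noqa: F401
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 20) -> str:
    """Generate a short continuation of `prompt` (hypothetical helper)."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(inputs.input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("The child saw a"))
```

If the model fails to load on CPU, moving the model and the input tensors to a GPU with `.to("cuda")` is a reasonable first thing to try.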

Dataset used to train PatrickHaller/hgrn2_pile_100m_distill_babylm

Collection including PatrickHaller/hgrn2_pile_100m_distill_babylm