---
license: mit
language:
- en
tags:
- babylm
---
# Lil-Bevo
Lil-Bevo is UT Austin's submission to the BabyLM challenge, specifically the *strict-small* track.
[Link to GitHub Repo](https://github.com/venkatasg/Lil-Bevo)
## TL;DR
- A Unigram tokenizer is trained on the 10M-token BabyLM corpus plus the MAESTRO dataset, with a vocabulary size of 16k.
- `deberta-small-v3` is pretrained on a mixture of MAESTRO and the 10M BabyLM tokens for 5 epochs.
- The model then continues training on the 10M BabyLM tokens for 50 epochs with a sequence length of 128.
- Finally, the model is trained for 2 epochs with targeted linguistic masking at a sequence length of 512.
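As a rough sketch of the first step above, here is how a 16k-vocab Unigram tokenizer can be trained with the Hugging Face `tokenizers` library. The corpus iterator and special-token list are illustrative assumptions, not the exact configuration used for Lil-Bevo.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Placeholder corpus; in practice this would iterate over the 10M-token
# BabyLM text plus the MAESTRO data.
corpus = ["the cat sat on the mat", "birds sing in the morning"] * 100

tokenizer = Tokenizer(models.Unigram())
# Metaspace pre-tokenization is a common choice for Unigram models.
tokenizer.pre_tokenizer = pre_tokenizers.Metaspace()

trainer = trainers.UnigramTrainer(
    vocab_size=16_000,  # target vocab size from the TL;DR
    unk_token="[UNK]",
    special_tokens=["[CLS]", "[SEP]", "[UNK]", "[PAD]", "[MASK]"],
)
tokenizer.train_from_iterator(corpus, trainer=trainer)

encoding = tokenizer.encode("the cat sat")
print(encoding.tokens)
```

On a tiny toy corpus like this the learned vocabulary stays far below the 16k target; the cap only matters on the full training data.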
This README will be updated with more details soon.