---
license: mit
language:
- en
tags:
- babylm
---
# Lil-Bevo-X
Lil-Bevo-X is UT Austin's submission to the BabyLM challenge, specifically the *strict* track.
[Link to GitHub Repo](https://github.com/venkatasg/Lil-Bevo)
## TLDR:
- Unigram tokenizer trained on the 10M BabyLM tokens plus the MAESTRO dataset, with a vocabulary size of 32k.
- `deberta-base-v3` trained on a mixture of MAESTRO and the 100M BabyLM tokens for 3 epochs.
- Training continues for 100,000 steps at a sequence length of 128.
- Training continues for 65,000 steps at a sequence length of 512.
- The model is then trained with targeted linguistic masking for 1 epoch.
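The targeted-masking step above differs from standard MLM in that only specific, linguistically chosen positions are candidates for masking. A minimal sketch of that idea is below; the `MASK_ID` value, the masking probability, and the `target_positions` input (which in Lil-Bevo-X would come from a linguistic analysis step) are all illustrative assumptions, not the exact training procedure.

```python
import random

MASK_ID = 4          # hypothetical [MASK] token id (tokenizer-dependent)
IGNORE_INDEX = -100  # label value ignored by the MLM cross-entropy loss


def targeted_mask(token_ids, target_positions, mask_prob=0.5, seed=0):
    """Mask only the targeted positions; all other labels are ignored.

    Illustrative sketch: each targeted position is masked with
    probability `mask_prob`, and its original token id becomes the
    label so the model is trained to recover it.
    """
    rng = random.Random(seed)
    inputs = list(token_ids)
    labels = [IGNORE_INDEX] * len(token_ids)
    for pos in target_positions:
        if rng.random() < mask_prob:
            labels[pos] = inputs[pos]  # predict the original token here
            inputs[pos] = MASK_ID      # replace with [MASK] in the input
    return inputs, labels
```

Non-targeted positions keep a label of `IGNORE_INDEX`, so the loss is computed only over the linguistically targeted tokens.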
This README will be updated with more details soon.