	---
	license: mit
	language:
	- en
	tags:
	- babylm
	---

	# Lil-Bevo-X

	Lil-Bevo-X is UT Austin's submission to the BabyLM challenge, specifically the strict track.

	[Link to GitHub Repo](https://github.com/venkatasg/Lil-Bevo)

	## TLDR:
	- Unigram tokenizer with a 32k vocabulary, trained on 10M BabyLM tokens plus the MAESTRO dataset.
	- `deberta-base-v3` trained on a mixture of MAESTRO and 100M tokens for 3 epochs.
	- Training continues for 100,000 steps at a sequence length of 128.
	- Training continues for 65,000 steps at a sequence length of 512.
	- Model is trained with targeted linguistic masking for 1 epoch.
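
The training stages listed above can be written down as a small schedule, which makes the step counts easy to check. This is only a sketch: the `Stage` class, the stage names, and the helper function are hypothetical and are not taken from the Lil-Bevo codebase. It also assumes the bullets are listed in the order the stages were run. Fields the list does not state (batch size, learning rate, and so on) are left out rather than guessed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    """One training stage; hypothetical container, not from the repo."""
    name: str
    epochs: Optional[int] = None   # set when the README gives epochs
    steps: Optional[int] = None    # set when the README gives steps
    seq_len: Optional[int] = None  # set when the README gives a sequence length


# Values copied from the TLDR list; anything the list omits stays None.
# Assumption: the bullets are in chronological order.
STAGES: List[Stage] = [
    Stage("initial training on MAESTRO + BabyLM mixture", epochs=3),
    Stage("continued training, short sequences", steps=100_000, seq_len=128),
    Stage("continued training, long sequences", steps=65_000, seq_len=512),
    Stage("targeted linguistic masking", epochs=1),
]


def total_continued_steps(stages: List[Stage]) -> int:
    """Sum the step counts of the stages that are specified in steps."""
    return sum(s.steps for s in stages if s.steps is not None)


if __name__ == "__main__":
    for stage in STAGES:
        print(stage)
    print("continued-training steps:", total_continued_steps(STAGES))
```

Keeping epochs and steps as separate optional fields avoids converting between them, which would need a batch size and dataset size that this README does not give.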


	This README will be updated with more details soon.