Lil-Bevo

Lil-Bevo is UT Austin's submission to the BabyLM challenge, specifically the strict-small track.

Link to GitHub Repo

TLDR:

  • A Unigram tokenizer with a 16k vocabulary is trained on the 10M BabyLM tokens plus the MAESTRO dataset.

  • deberta-small-v3 is trained on a mixture of MAESTRO and the 10M BabyLM tokens for 5 epochs.

  • The model then continues training for 50 epochs on the 10M BabyLM tokens with a sequence length of 128.

  • Finally, the model is trained for 2 epochs with targeted linguistic masking at a sequence length of 512.

This README will be updated with more details soon.
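
The three training stages in the TLDR can be written down as a small data structure, which may help when comparing them side by side. This is only an illustrative sketch, not the training code from the repo: the `Stage` class, the stage names, and the helper functions are hypothetical, and any value the TLDR does not state (for example the sequence length of the first stage or the data used in the last stage) is left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Stage:
    """One training stage as described in the TLDR (hypothetical container)."""
    name: str                    # illustrative label, not taken from the repo
    data: Optional[str]          # None = not stated in the TLDR
    epochs: int
    max_seq_len: Optional[int]   # None = not stated in the TLDR
    masking: Optional[str]       # None = not stated in the TLDR


# Values copied from the TLDR bullets; nothing else is assumed.
SCHEDULE = (
    Stage("mixed_data", "MAESTRO + 10M BabyLM tokens", 5, None, None),
    Stage("babylm_only", "10M BabyLM tokens", 50, 128, None),
    Stage("targeted_masking", None, 2, 512, "targeted linguistic masking"),
)


def total_epochs(schedule: Sequence[Stage]) -> int:
    """Sum the epoch counts over all stages."""
    return sum(stage.epochs for stage in schedule)


def describe(schedule: Sequence[Stage]) -> list:
    """Return one human-readable line per stage, marking unstated fields."""
    lines = []
    for i, s in enumerate(schedule, start=1):
        data = s.data if s.data is not None else "data not stated"
        seq = f"seq len {s.max_seq_len}" if s.max_seq_len else "seq len not stated"
        mask = f", {s.masking}" if s.masking else ""
        lines.append(f"Stage {i} ({s.name}): {s.epochs} epochs on {data}, {seq}{mask}")
    return lines


if __name__ == "__main__":
    for line in describe(SCHEDULE):
        print(line)
    print("Total epochs:", total_epochs(SCHEDULE))
```

Keeping unstated values as `None` makes it easy to see which details are still missing until the README is expanded.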
