SivilTaram committed: Update README.md
Commit: 8004929
Parent(s): 8fcd306

README.md CHANGED
@@ -5,7 +5,7 @@ language:
 - en
 ---
 
-The pre-trained 3B model with the vocabulary size 43K in the paper Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies. We investigate how vocabulary size
+The pre-trained 3B model with the vocabulary size 43K in the paper [Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies](https://huggingface.co/papers/2407.13623). We investigate how vocabulary size
 impacts language model scaling law in this paper.
 
 Based on our approach, we predict the optimal vocabulary size for 3B model is about 43K.