Model Card for Chinese-OpenELM-270M
Fine-tuned from apple/OpenELM-270M:
- Extended the vocabulary from 32,000 to 75,873 tokens with a SentencePiece BPE model trained on bigscience-data/roots_zh-tw_wikipedia; the new token embeddings were initialized with the average of the existing embeddings.
- Continually pre-trained on a mix of bigscience-data/roots_zh-tw_wikipedia and bigscience-data/roots_en_wikipedia.
- Evaluation perplexity: 1.66, measured on a held-out 3% split of the training data.
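The average-embedding initialization mentioned above can be sketched as follows. This is a minimal NumPy illustration, not code from this repository: the function name `extend_embeddings` and the toy matrix sizes are assumptions, and in practice the same idea is applied to the model's input (and tied output) embedding matrix after extending the tokenizer.

```python
import numpy as np

def extend_embeddings(old_weight: np.ndarray, new_vocab_size: int) -> np.ndarray:
    """Grow an embedding matrix to `new_vocab_size` rows.

    Existing rows are copied unchanged; every added row is initialized
    with the mean of the existing embeddings (average-embedding init).
    """
    old_vocab, dim = old_weight.shape
    assert new_vocab_size >= old_vocab, "vocabulary can only grow"
    new_weight = np.empty((new_vocab_size, dim), dtype=old_weight.dtype)
    new_weight[:old_vocab] = old_weight                  # keep original embeddings
    new_weight[old_vocab:] = old_weight.mean(axis=0)     # init new rows with the mean
    return new_weight

# Toy example (this model grows 32,000 -> 75,873; shown small here for speed).
old = np.random.randn(4, 8)
new = extend_embeddings(old, 6)
```

Initializing new rows with the mean embedding keeps the new tokens near the center of the existing embedding distribution, which tends to be a more stable starting point for continual pre-training than random initialization.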