p208p2002/llama-chinese-81M

Text Generation

text-generation-inference


LLaMA Chinese 81M

A small bilingual (Chinese and English) pretrained language model.
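As a rough illustration of how a checkpoint like this is typically used for text generation, the sketch below loads it through the Hugging Face `transformers` Auto classes. It assumes that the repository `p208p2002/llama-chinese-81M` loads with `AutoTokenizer` and `AutoModelForCausalLM`, and that `transformers` and PyTorch are installed; the card itself does not confirm either point. The helper name `generate` and its default `max_new_tokens` are illustrative choices, not part of the model card.

```python
MODEL_ID = "p208p2002/llama-chinese-81M"  # repository id taken from this model card


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Continue `prompt` with the model (hypothetical helper).

    Assumption: the checkpoint is compatible with the Auto classes below.
    The first call downloads the weights from the Hugging Face Hub.
    """
    # Imported lazily so that defining this helper needs no network access.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Encode the prompt as PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")

    # Pass only the tensors that generate() needs; greedy decoding by default.
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs.get("attention_mask"),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("台灣是")` would then return the prompt followed by the model's continuation. Because this is a small pretrained (not instruction-tuned) model, it is best treated as a text completer rather than a chat assistant.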

Training Dataset

Chinese Wikipedia (20230601)
English Wikipedia (20230601)

Tokenizer

Uses a BPE tokenizer retrained on Chinese and English corpora, which gives better tokenization quality and better encoding/decoding efficiency.

https://github.com/p208p2002/BPE-tokenizer-from-zh-wiki
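To make the idea of retraining BPE on a target corpus concrete, here is a toy character-level sketch of how BPE merge rules are learned and applied. It is not the code from the repository linked above, and real tokenizers usually work on bytes with many more merges; the function names `learn_bpe_merges`, `apply_merge`, and `encode` are hypothetical. It only illustrates why merges learned from Chinese text let frequent character sequences become single tokens.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Fuse every adjacent occurrence of `pair` in `symbols` into one symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words."""
    # Start with every word split into single characters.
    corpus = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        # The most frequent pair becomes the next merge rule.
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite the corpus with that pair fused.
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            new_corpus[apply_merge(symbols, best)] += freq
        corpus = new_corpus
    return merges


def encode(word, merges):
    """Tokenize `word` by applying the merge rules in the order learned."""
    symbols = tuple(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return list(symbols)
```

For example, learning one merge from `["模型", "模型", "模型", "語言模型"]` fuses the frequent pair `模` + `型`, so `encode("語言模型", merges)` yields three tokens instead of four characters. Fewer tokens per sentence is the kind of efficiency gain a corpus-specific tokenizer aims for.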


Safetensors

Model size: 81M params

Tensor type: F32
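The parameter count and tensor type together give a rough size for the weights. The sketch below does that arithmetic, assuming exactly 81,000,000 parameters (the card only states the rounded figure "81M") and 4 bytes per F32 value; file metadata and any non-F32 tensors are ignored. The helper name `weight_bytes` is illustrative.

```python
BYTES_PER_F32 = 4  # a 32-bit float occupies 4 bytes


def weight_bytes(num_params: int, bytes_per_param: int = BYTES_PER_F32) -> int:
    """Approximate storage needed for the raw weights, in bytes."""
    return num_params * bytes_per_param


# Assumption: "81M params" rounded to exactly 81 million.
approx_params = 81_000_000
size = weight_bytes(approx_params)
print(f"about {size / 1e6:.0f} MB of F32 weights")
```

Under these assumptions the weights take roughly 324 MB, which is small enough to load on a CPU-only machine.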



Collection including p208p2002/llama-chinese-81M: LLaMA-zhtw (6 items, updated Jun 11, 2024)