losyer8 committed
Commit 711c5f8
1 Parent(s): b70efd6

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -63,7 +63,7 @@ are models further trained by additional (potentially) high-quality 27B tokens d
 ## Tokenizer
 The tokenizer of this model is based on [huggingface/tokenizers](https://github.com/huggingface/tokenizers) Unigram byte-fallback model.
 The vocabulary entries were converted from [`llm-jp-tokenizer v2.1 (50k)`](https://github.com/llm-jp/llm-jp-tokenizer/releases/tag/v2.1).
-Please refer to the [README.md](https://github.com/llm-jp/llm-jp-tokenizer) of `llm-ja-tokenizer` for details on the vocabulary construction procedure.
+Please refer to [README.md](https://github.com/llm-jp/llm-jp-tokenizer) of `llm-ja-tokenizer` for details on the vocabulary construction procedure.
 - **Model:** Hugging Face Fast Tokenizer using Unigram byte-fallback model which requires `tokenizers>=0.14.0`
 - **Training algorithm:** SentencePiece Unigram byte-fallback
 - **Training data:** A subset of the datasets for model pre-training
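
For context, a minimal sketch of loading a tokenizer like the one described in this section through the Hugging Face `transformers` API. It assumes `tokenizers>=0.14.0` is installed, and the repository id `llm-jp/llm-jp-13b-v1.0` is a placeholder only, since this commit page does not name the model repository.

```python
# Minimal sketch: load the Unigram byte-fallback fast tokenizer described above.
# Assumptions: tokenizers>=0.14.0 is installed, and "llm-jp/llm-jp-13b-v1.0" is a
# placeholder repository id (the actual repository is not named on this page).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("llm-jp/llm-jp-13b-v1.0")

# Byte-fallback keeps characters outside the vocabulary recoverable as byte-level tokens.
print(tokenizer.tokenize("自然言語処理"))
print(tokenizer("自然言語処理")["input_ids"])
```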