JosephusCheung committed
Commit 81168c7 · 1 parent: f351271
Update README.md

README.md CHANGED
@@ -42,6 +42,8 @@ For inference, it is essential to use Transformers version 4.31.0 or later.
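
As a quick illustration of the version requirement above, a minimal loading sketch, assuming Transformers >= 4.31.0 is installed (`pip install "transformers>=4.31.0"`); the model ID is a hypothetical placeholder inferred from the tokenizer space linked below, not confirmed by this diff:

```python
# Minimal inference sketch; "JosephusCheung/LL7M" is an assumed
# placeholder ID -- substitute the actual model repository.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "JosephusCheung/LL7M"  # assumption, not confirmed by this diff

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```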
This model's tokenizer vocabulary has been expanded to 39,424 tokens, introducing some common CJK characters. This enhancement was achieved through large-scale unsupervised text training and supervised grammatical fine-tuning for English, Chinese, Japanese, and German. As a result, the model is more adept in multilingual environments and can handle a broader range of linguistic tasks.

+ You can now try the new tokenizer on this [JavaScript-based webpage](https://huggingface.co/spaces/JosephusCheung/LL7M-JS-Tokenizer).
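
To inspect the expanded vocabulary locally rather than through the webpage, a short sketch (again assuming the hypothetical placeholder ID):

```python
# Check the expanded vocabulary size and CJK coverage; the model ID
# "JosephusCheung/LL7M" is an assumed placeholder.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("JosephusCheung/LL7M")
print(len(tokenizer))  # expected: 39424 after the expansion

# Tokenize text in the four fine-tuned languages to see the coverage.
for text in ["Hello, world!", "你好，世界", "こんにちは、世界", "Hallo, Welt!"]:
    print(text, "->", tokenizer.tokenize(text))
```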
The model has undergone unsupervised training on a multimodal and multilingual image-text dataset, adopting the BLIP2 Q-Former trained on the larger foundational LLM Vicuna 13B. This approach aligns image features with the language model and significantly improves the model's performance on tasks involving both textual and visual inputs. (Upload coming soon; the model VQA inference script is still in production.)
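
For intuition, a Q-Former bridges a frozen vision encoder and the LLM with a small set of learnable query tokens that cross-attend to image features and are projected into the LLM's embedding space. A conceptual PyTorch sketch, not this model's actual architecture; all dimensions are illustrative assumptions (1408 and 257 match a BLIP-2-style ViT-g/14 at 224px):

```python
# Conceptual sketch of Q-Former-style alignment; dimensions are
# illustrative assumptions, not this model's actual configuration.
import torch
import torch.nn as nn

class TinyQFormer(nn.Module):
    def __init__(self, num_queries=32, vision_dim=1408, llm_dim=4096):
        super().__init__()
        # Learnable query tokens that will "read" the image features.
        self.queries = nn.Parameter(torch.randn(num_queries, vision_dim))
        self.cross_attn = nn.MultiheadAttention(vision_dim, num_heads=8,
                                                batch_first=True)
        # Project the resulting soft tokens into the LLM embedding space.
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, image_feats):  # (batch, num_patches, vision_dim)
        q = self.queries.expand(image_feats.size(0), -1, -1)
        out, _ = self.cross_attn(q, image_feats, image_feats)
        return self.proj(out)  # (batch, num_queries, llm_dim) soft prompts

image_feats = torch.randn(1, 257, 1408)  # e.g. frozen ViT patch features
soft_tokens = TinyQFormer()(image_feats)
print(soft_tokens.shape)  # torch.Size([1, 32, 4096])
```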
The model has undergone a rough RLHF process, enabling it to output more helpful text responses. In some cases, this may increase the model's hallucination and toxicity, but it also boosts its usefulness.