This model is expected to perform well but may require more testing and user feedback. Be aware that only models featured within the GPT4All GUI are curated and officially supported by Nomic. Use at your own risk.

About

Static quants of https://huggingface.co/01-ai/Yi-1.5-9B-Chat-16K
Quantized by ThiloteE with llama.cpp commit c3776ca

These quants were created with a customized configuration that has been shown not to produce visible end-of-sequence (EOS) tokens during inference with GPT4All. The config.json, generation_config.json and tokenizer_config.json differ from the versions found in the original model's repository at the time these quants were created.

Prompt Template (for GPT4All)

System Prompt:

<|im_start|>system
Below is an instruction that describes a task. Write a response that appropriately completes the request.<|im_end|>

Chat Template:

<|im_start|>user
%1<|im_end|>
<|im_start|>assistant
%2<|im_end|>

Do not miss the newlines at the end!
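
To make the placeholders concrete, here is a minimal Python sketch of how the templates above might be filled in. It assumes that %1 stands for the user message and %2 for the model's reply, and that each template ends with a newline, as the note above suggests. This is an illustration of the text layout only, not GPT4All's actual implementation; the function names render_turn and render_prompt are hypothetical.

```python
# Templates copied from this model card. The trailing "\n" on each one
# is an assumption based on the note about newlines at the end.
SYSTEM_PROMPT = (
    "<|im_start|>system\n"
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.<|im_end|>\n"
)
CHAT_TEMPLATE = (
    "<|im_start|>user\n"
    "%1<|im_end|>\n"
    "<|im_start|>assistant\n"
    "%2<|im_end|>\n"
)


def render_turn(user_message: str, assistant_reply: str) -> str:
    """Fill one complete turn (assumes %1 = user text, %2 = reply)."""
    # Split on the placeholders instead of chaining str.replace, so a
    # user message that happens to contain "%2" is left untouched.
    before, rest = CHAT_TEMPLATE.split("%1", 1)
    middle, after = rest.split("%2", 1)
    return before + user_message + middle + assistant_reply + after


def render_prompt(user_message: str) -> str:
    """Text the model would see for a new turn: everything before %2."""
    before, rest = CHAT_TEMPLATE.split("%1", 1)
    middle, _ = rest.split("%2", 1)
    return SYSTEM_PROMPT + before + user_message + middle


if __name__ == "__main__":
    print(render_prompt("What is GGUF?"))
```

The rendered prompt ends with the line `<|im_start|>assistant` followed by a newline, which is where the model starts generating. Dropping that newline changes the text the model sees, which is presumably why the note above insists on it.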

Context Length

16384

Provided Quants

Link	Type	Size/GB	Notes
GGUF	Q4_0	4.9	fast, recommended
GGUF	f16	17.2	16 bpw, overkill
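
As a rough sanity check on the table, the two file sizes can be turned into an approximate bits-per-weight (bpw) figure for Q4_0. The sketch below assumes that file size scales linearly with bpw and uses the f16 file (16 bpw) as the reference. GGUF files also carry metadata, and some tensors may be stored at a different precision, so treat the result as an estimate. The function name implied_bpw is hypothetical.

```python
# Sizes in GB, taken from the table above.
F16_SIZE_GB = 17.2
Q4_0_SIZE_GB = 4.9
F16_BPW = 16.0  # the table lists f16 as 16 bpw


def implied_bpw(quant_size_gb: float, ref_size_gb: float, ref_bpw: float) -> float:
    """Estimate bits per weight, assuming size is proportional to bpw."""
    return ref_bpw * quant_size_gb / ref_size_gb


if __name__ == "__main__":
    bpw = implied_bpw(Q4_0_SIZE_GB, F16_SIZE_GB, F16_BPW)
    ratio = F16_SIZE_GB / Q4_0_SIZE_GB
    print(f"Q4_0 is roughly {bpw:.2f} bpw")        # Q4_0 is roughly 4.56 bpw
    print(f"f16 is about {ratio:.2f}x larger")     # f16 is about 3.51x larger
```

The estimate is close to the nominal 4.5 bpw of Q4_0, which stores each block of 32 four-bit weights together with one 16-bit scale (18 bytes per 32 weights).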

About GGUF

If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including how to concatenate multi-part files.

ikawrakow has published a handy graph comparing some quant types (lower is better); the image is not reproduced here.

And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

Thanks

I thank Mradermacher and TheBloke for the inspiration for this model card and for their contributions to open source. I thank 3Simplex for everything. Shoutout to the GPT4All and llama.cpp communities :-)

Original Model card:

license: apache-2.0

Intro

Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples.

Compared with Yi, Yi-1.5 delivers stronger performance in coding, math, reasoning, and instruction-following capability, while still maintaining excellent capabilities in language understanding, commonsense reasoning, and reading comprehension.

Model	Context Length	Pre-trained Tokens
Yi-1.5	4K, 16K, 32K	3.6T

Models

Chat models

Name	Download
Yi-1.5-34B-Chat	• 🤗 Hugging Face • 🤖 ModelScope • 🟣 wisemodel
Yi-1.5-34B-Chat-16K	• 🤗 Hugging Face • 🤖 ModelScope • 🟣 wisemodel
Yi-1.5-9B-Chat	• 🤗 Hugging Face • 🤖 ModelScope • 🟣 wisemodel
Yi-1.5-9B-Chat-16K	• 🤗 Hugging Face • 🤖 ModelScope • 🟣 wisemodel
Yi-1.5-6B-Chat	• 🤗 Hugging Face • 🤖 ModelScope • 🟣 wisemodel

Base models

Name	Download
Yi-1.5-34B	• 🤗 Hugging Face • 🤖 ModelScope • 🟣 wisemodel
Yi-1.5-34B-32K	• 🤗 Hugging Face • 🤖 ModelScope • 🟣 wisemodel
Yi-1.5-9B	• 🤗 Hugging Face • 🤖 ModelScope • 🟣 wisemodel
Yi-1.5-9B-32K	• 🤗 Hugging Face • 🤖 ModelScope • 🟣 wisemodel
Yi-1.5-6B	• 🤗 Hugging Face • 🤖 ModelScope • 🟣 wisemodel

Benchmarks

Chat models

Yi-1.5-34B-Chat is on par with or excels beyond larger models in most benchmarks.

Yi-1.5-9B-Chat is the top performer among similarly sized open-source models.

Base models

Yi-1.5-34B is on par with or excels beyond larger models in some benchmarks.

Yi-1.5-9B is the top performer among similarly sized open-source models.