Qwen/Qwen-14B-Chat

Despite the repo name, it's the chat version.

After the release of Mistral, I realized that Chinese models were underappreciated.

This monster needed 60 GB of peak memory for quantization.
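As a rough sanity check of that figure, here is a back-of-the-envelope estimate, assuming the converter holds a full fp32 copy of the weights in memory at its peak (an illustrative assumption, not the converter's actual memory model):

```python
# Approximate peak memory if all ~14B weights sit in RAM as fp32.
PARAMS = 14e9          # parameter count of Qwen-14B (approximate)
BYTES_PER_FP32 = 4

fp32_gb = PARAMS * BYTES_PER_FP32 / 1e9
print(f"{fp32_gb:.0f} GB")  # ~56 GB, in the ballpark of the observed 60 GB
```

The small remainder over 56 GB is plausibly tokenizer data, buffers, and the quantized output tensors.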

Credits

Alibaba Cloud (阿里云) Qwen-14B-Chat

Usage

Start an interactive chat session with the following Linux command:

./main -m ./Qwen-14b-Q8_0.bin --tiktoken ./qwen.tiktoken -i 

Evaluation Results

Model          MMLU  GSM8K  HumanEval  MBPP
Qwen-14B-Chat  64%   61%    32%        41%
Llama 2 13B    56%   34%    19%        35%
Phi-1.5        37%   40%    34%        38%
Code Llama 7B  37%   21%    31%        53%
Mistral 7B     60%   52%    31%        48%

Benchmarks: MMLU (English knowledge), GSM8K (mathematics), HumanEval (coding), MBPP (basic Python programming).

Architecture

Layers 40
Heads 40
Embedding 5120
Vocabulary 151851
Sequence length 2048
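The table above is enough to recover the "14B" in the name. A quick sketch, using the standard transformer estimate of roughly 12·L·d² parameters for the attention and MLP blocks plus the embedding matrices (an approximation; it ignores biases and layer norms):

```python
# Back-of-the-envelope parameter count from the architecture table.
layers = 40        # transformer layers
d_model = 5120     # embedding dimension
vocab = 151851     # vocabulary size

block_params = 12 * layers * d_model ** 2  # attention + MLP weights
embed_params = 2 * vocab * d_model         # input embedding + output projection
total = block_params + embed_params
print(f"{total / 1e9:.1f}B")  # ≈ 14.1B
```

Close enough to 14B that the listed numbers hang together.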

Find me on

Sh-it-just-works and Patreon
