---
language:
- zh
- en
pipeline_tag: text-generation
---
# Qwen/Qwen-14B-Chat
Despite the repo name, it's the chat version.
After the release of Mistral, I realized that Chinese models were underappreciated.
This monster needed 60 GB of peak memory for quantization.
## Credits
[Alibaba Cloud Qwen-14B-Chat](https://huggingface.co/Qwen/Qwen-14B-Chat)
## Usage
Start an interactive chat by running the following Linux command.
```
./main -m ./Qwen-14b-Q8_0.bin --tiktoken ./qwen.tiktoken -i
```
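If you'd rather drive the chat from a script, here is a minimal Python sketch that launches the same command via `subprocess`. It assumes the `main` binary is the one built from QwenLM's qwen.cpp (which is where the `--tiktoken` flag comes from) and reuses the paths from the command above, so adjust them to your layout.
```
# Minimal sketch: run the same interactive chat command from Python.
# stdin/stdout are inherited, so this behaves like typing the command
# into a terminal; adjust the paths to wherever your files live.
import subprocess

cmd = [
    "./main",                          # binary (assumed built from qwen.cpp)
    "-m", "./Qwen-14b-Q8_0.bin",       # the quantized model from this repo
    "--tiktoken", "./qwen.tiktoken",   # Qwen's tiktoken vocabulary file
    "-i",                              # interactive chat mode
]
subprocess.run(cmd, check=True)
```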
## Evaluation Results
| Model | MMLU | GSM8K | HumanEval | MBPP |
|--|--|--|--|--|
| Qwen 14B Chat | 64% | 61% | 32% | 41% |
| Llama 2 13B | 56% | 34% | 19% | 35% |
| Phi 1.5 | 37% | 40% | 34% | 38% |
| Code Llama 7B | 37% | 21% | 31% | 53% |
| Mistral 7B | 60% | 52% | 31% | 48% |

**Benchmarks:** MMLU (English knowledge), GSM8K (mathematics), HumanEval (coding), MBPP (basic Python programming)
## Architecture
| Parameter | Value |
|--|--|
| Layers | 40 |
| Heads | 40 |
| Embedding | 5120 |
| Vocabulary | 151851 |
| Sequence length | 2048 |
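As a quick sanity check, a few quantities follow directly from the numbers in the table; the sketch below works them out in Python. The feed-forward width isn't listed here, so no total parameter count is attempted.
```
# Back-of-envelope figures derived from the architecture table above.
layers, heads, d_model, vocab = 40, 40, 5120, 151_851

head_dim = d_model // heads            # 5120 / 40 = 128 per attention head
attn_per_layer = 4 * d_model ** 2      # Q, K, V, O projection weights ~= 104.9M
embedding = vocab * d_model            # token embedding table ~= 777.5M

print(head_dim)                        # 128
print(attn_per_layer * layers / 1e9)   # ~4.19B attention weights across 40 layers
print(embedding / 1e6)                 # ~777.5M embedding weights
```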
## Find me on
[Sh-it-just-works](https://sh.itjust.works/c/localllama) and Patreon