---
pipeline_tag: text-generation
---

# Model Card for Breeze-7B-Base-v0.1

Breeze-7B-Base-v0.1 is a 7-billion-parameter language model built on Mistral-7B and tailored for Traditional Chinese (TC).
It extends the original Mistral-7B vocabulary with 30k additional TC tokens to better adapt to TC,
which doubles inference speed compared with the original tokenizer.
To the best of our knowledge, this is the first work on vocabulary expansion for TC.
The model was further pre-trained on 250GB of TC data.
Breeze-7B-Base-v0.1 performs well on both English (EN) and TC benchmarks:
it outperforms Taiwan-LLM-7B-v2.1-base, Taiwan-LLM-13B-v2.0-base, and Yi-6B-Base on all TC benchmarks
and is comparable with Mistral-7B-v0.1 on MMLU and MT-Bench in English.

*A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Chang-Le Liu 劉昶樂, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.*

## Features

- Vocabulary expanded from 32k to 62k tokens for Traditional Chinese
- 8k context length

## Model Details
- **Finetuned from:** [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- **Model type:** Causal decoder-only transformer language model
- **Language:** English and Traditional Chinese (zh-tw)

## Base Model Performance

| Models                                              | TMMLU+ (ACC) | DRCD (EM) | MMLU (ACC) |
|-----------------------------------------------------|--------------|-----------|------------|
|                                                     | 5 shot       | 3 shot    | 5 shot     |
| MediaTek-Research/Breeze-7B-Base-v0.1               |              |           |            |
| mistralai/Mistral-7B-v0.1                           |              |           |            |
| yentinglin/Taiwan-LLM-7B-v2.1-base                  |              |           |            |
| yentinglin/Taiwan-LLM-13B-v2.0-base                 |              |           |            |
| 01-ai/Yi-6B                                         |              |           |            |
| 01-ai/Yi-34B                                        |              |           |            |
| Qwen/Qwen-7B                                        |              |           |            |
| Qwen/Qwen-14B                                       |              |           |            |

## Inference Performance

| Models                                                             | Speed (char/sec)  | Compression Ratio | Max Character Size |
|--------------------------------------------------------------------|-------------------|-------------------|--------------------|
| MediaTek-Research/Breeze-7B-Base-v0.1                              |                   |                   |                    |
| mistralai/Mistral-7B-v0.1                                          |                   |                   |                    |
| yentinglin/Taiwan-LLM-7B-v2.1-base                                 |                   |                   |                    |
| yentinglin/Taiwan-LLM-13B-v2.0-base                                |                   |                   |                    |
| 01-ai/Yi-6B                                                        |                   |                   |                    |
| 01-ai/Yi-34B                                                       |                   |                   |                    |
| Qwen/Qwen-7B                                                       |                   |                   |                    |
| Qwen/Qwen-14B                                                      |                   |                   |                    |
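
This card does not define the columns of the table above. As a rough sketch only, the snippet below assumes that the compression ratio means characters per token and that speed means generated characters per second of wall-clock time. The helper names `compression_ratio` and `chars_per_second` are hypothetical and are not part of any library; they accept plain callables so that any tokenizer or generation function can be plugged in.

```python
import time
from typing import Callable, Sequence


def compression_ratio(text: str, tokenize: Callable[[str], Sequence]) -> float:
    """Characters per token (assumed definition; the model card does not give one)."""
    tokens = tokenize(text)
    if len(tokens) == 0:
        raise ValueError("tokenizer returned no tokens")
    return len(text) / len(tokens)


def chars_per_second(generate: Callable[[str], str], prompt: str) -> float:
    """Generated characters divided by wall-clock seconds (assumed definition)."""
    start = time.perf_counter()
    completion = generate(prompt)
    elapsed = time.perf_counter() - start
    # Guard against a timer too coarse to measure a very fast call.
    return len(completion) / elapsed if elapsed > 0 else float("inf")


# Toy tokenizers stand in for real ones so the sketch runs offline.
sample = "繁體中文測試"
char_level = list  # one token per character
pair_level = lambda s: [s[i:i + 2] for i in range(0, len(s), 2)]  # two characters per token

print(compression_ratio(sample, char_level))
print(compression_ratio(sample, pair_level))
```

With a real tokenizer, `tokenize` could be something like `tokenizer.tokenize` from Transformers. A tokenizer that packs more characters into each token needs fewer decoding steps for the same text, which is why vocabulary expansion can raise the character-level speed.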

## Chat Model Performance

| Models                                              | TMMLU+ (ACC) | DRCD (EM) | MT-Bench-tw (Score) | MMLU (ACC) | MT-Bench (Score) |
|-----------------------------------------------------|--------------|-----------|---------------------|------------|------------------|
|                                                     | 5 shot       | 3 shot    | 0 shot              | 5 shot     | 0 shot           |
| MediaTek-Research/Breeze-7B-Instruct-v0.1           |              |           |                     |            |                  |
| mistralai/Mistral-7B-Instruct-v0.1                  |              |           |                     |            |                  |
| yentinglin/Taiwan-LLM-7B-v2.1-chat                  |              |           |                     |            |                  |
| yentinglin/Taiwan-LLM-13B-v2.0-chat                 |              |           |                     |            |                  |
| 01-ai/Yi-6B-Chat                                    |              |           |                     |            |                  |
| 01-ai/Yi-34B-Chat                                   |              |           |                     |            |                  |
| Qwen/Qwen-7B-Chat                                   |              |           |                     |            |                  |
| Qwen/Qwen-14B-Chat                                  |              |           |                     |            |                  |
| gpt-3.5-turbo-0613                                  |              | 76.30     |                     |            |                  |



## Use in Transformers

First, install the direct dependencies:
```bash
pip install transformers torch accelerate
```
For faster inference with FlashAttention-2, also install these dependencies:
```bash
pip install packaging ninja
pip install flash-attn
```
Then load the model in transformers:
```python
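from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "MediaTek-Research/Breeze-7B-Base-v0.1"

# Load the tokenizer that carries the expanded Traditional Chinese vocabulary.
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The repository id is passed positionally; from_pretrained has no `model=` keyword.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # optional; needs flash-attn and transformers>=4.36
)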
```
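
Once the model is loaded, text can be generated with the standard `generate` API. The following is a minimal sketch rather than an official recipe: the helper names `generation_kwargs` and `complete`, the default of 64 new tokens, and the greedy-decoding default are all assumptions made for illustration. The heavy imports live inside `complete` so the helper can be imported without loading the model.

```python
from typing import Any, Dict

MODEL_ID = "MediaTek-Research/Breeze-7B-Base-v0.1"


def generation_kwargs(max_new_tokens: int = 64, temperature: float = 0.0) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for `model.generate`."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    if temperature > 0:
        # Sample from the distribution when a temperature is given.
        return {"max_new_tokens": max_new_tokens, "do_sample": True, "temperature": temperature}
    # Otherwise fall back to greedy decoding.
    return {"max_new_tokens": max_new_tokens, "do_sample": False}


def complete(prompt: str, **overrides: Any) -> str:
    """Hypothetical helper: load the base model and continue `prompt`."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype=torch.bfloat16
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, **generation_kwargs(**overrides))
    # Drop the prompt tokens so only the continuation is returned.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Because this is a base model rather than an instruction-tuned one, prompts work best as text to be continued, for example `complete("台灣最高的山是", max_new_tokens=32)`.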