---
language:
- ja
- en
- zh
license: apache-2.0
model-index:
- name: laser-polyglot-4x7b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 64.16
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 84.98
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 63.88
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 55.47
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 77.82
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 48.45
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
---
# Polyglot-4x7b-24b

![polyglot](polyglot.png)

Polyglot-4x7b is a Mixture of Experts (MoE) approach to building a multilingual model.

This project is an experiment to see whether each expert can specialize in a different language. The answer is yes.

The model is a merge of four Mistral-7B-based models, including experts capable of Chinese and Japanese output:

+ teknium/OpenHermes-2.5-Mistral-7B
+ oshizo/japanese-e5-mistral-7b_slerp
+ cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
+ s3nh/Mistral-7B-Evol-Instruct-Chinese
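
The exact merge recipe isn't reproduced in this card. For readers curious how a merge like this can be put together, the sketch below shows a hypothetical mergekit-moe style config; the gate mode and positive prompts are illustrative assumptions, not the settings actually used for this model.

```yaml
# Hypothetical mergekit-moe config for a 4x7b multilingual MoE merge.
# gate_mode and positive_prompts are illustrative, not the actual recipe.
base_model: mistralai/Mistral-7B-v0.1
gate_mode: hidden        # route tokens by hidden-state similarity to the prompts below
dtype: bfloat16
experts:
  - source_model: teknium/OpenHermes-2.5-Mistral-7B
    positive_prompts:
      - "Answer the following question in English."
  - source_model: oshizo/japanese-e5-mistral-7b_slerp
    positive_prompts:
      - "次の質問に日本語で答えてください。"
  - source_model: cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
    positive_prompts:
      - "Write a Python program that solves the following problem."
  - source_model: s3nh/Mistral-7B-Evol-Instruct-Chinese
    positive_prompts:
      - "请用中文回答下面的问题。"
```

A config along these lines would typically be built into a single MoE checkpoint with `mergekit-moe config.yaml ./output-model`.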

TODO:
1. [ ] polyglot tokenizer

## Other polyglot models

+ [macadeliccc/Polyglot-8x7b-v0.1](https://huggingface.co/macadeliccc/Polyglot-8x7b-v0.1) (adds 3 more languages)

# Code Example

+ Inference notebook: [Colab](https://colab.research.google.com/drive/1tYSb63IKZDsiQ5BIJU8Oc92phxugAmB3?usp=sharing)
+ Live demo: [Spaces](https://huggingface.co/spaces/macadeliccc/polyglot-4x7b-chat?logs=build)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate_response(prompt):
    """
    Generate a response from the model based on the input prompt.

    Args:
    prompt (str): Prompt for the model.

    Returns:
    str: The generated response from the model.
    """
    # Tokenize the input prompt and move it onto the model's device
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Generate output tokens
    outputs = model.generate(**inputs, max_new_tokens=256, eos_token_id=tokenizer.eos_token_id, pad_token_id=tokenizer.pad_token_id)

    # Decode the generated tokens to a string
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    return response

# Load the model and tokenizer (4-bit loading requires the bitsandbytes library and a CUDA GPU)
model_id = "macadeliccc/laser-polyglot-4x7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)

# Example prompts in different languages
english_prompt = "Write a quicksort algorithm in python"
chinese_prompt = "用Python写一个快速排序算法"
japanese_prompt = "Pythonでクイックソートアルゴリズムを書いてください"

# Generate and print responses for each language
print("English Response:")
print(generate_response(english_prompt), "\n")

print("Chinese Response:")
print(generate_response(chinese_prompt), "\n")

print("Japanese Response:")
print(generate_response(japanese_prompt), "\n")

```
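
If bitsandbytes isn't available, or you would rather not quantize, the model can also be loaded in half precision. This is a minimal sketch, assuming the `accelerate` package is installed (for `device_map`) and enough memory for the full ~24B parameters (roughly 48 GB in bfloat16):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "macadeliccc/laser-polyglot-4x7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Half-precision load; device_map="auto" spreads layers across available GPUs (and CPU if needed).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```

The `generate_response` helper above works unchanged with this variant, since it already moves its inputs to `model.device`.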

## Example Output
English Response:

  Write a quicksort algorithm in python.
  
  ```python
  def quicksort(arr):
      if len(arr) <= 1:
          return arr
      else:
          pivot = arr[0]
          less = [i for i in arr[1:] if i <= pivot]
          greater = [i for i in arr[1:] if i > pivot]
          return quicksort(less) + [pivot] + quicksort(greater)
  
  arr = [5, 2, 9, 1, 5, 7, 4, 8, 6, 3]
  print(quicksort(arr))
  ```
  
  This is a simple implementation of the quicksort algorithm in python. The function `quicksort` takes an array as input and returns a sorted array. The algorithm works by selecting a pivot element from the array and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The process is then repeated recursively on the sub-arrays until the entire array is sorted. 

Chinese Response:

  用Python写一个快速排序算法
  
  ```python
  def quick_sort(arr):
      if len(arr) <= 1:
          return arr
      else:
          pivot = arr[0]
          less = [i for i in arr[1:] if i <= pivot]
          greater = [i for i in arr[1:] if i > pivot]
          return quick_sort(less) + [pivot] + quick_sort(greater)
  
  arr = [3, 5, 2, 1, 4, 6, 8, 7]
  print(quick_sort(arr))
  ```
  这个程序的时间复杂度为O(nlogn),空间复杂度为O(n)。 

Japanese Response: 

  Pythonでクイックソートアルゴリズムを書いてください。

  ```python
  def quicksort(arr):
      if len(arr) <= 1:
          return arr
      pivot = arr[0]
      left = [x for x in arr[1:] if x < pivot]
      right = [x for x in arr[1:] if x >= pivot]
      return quicksort(left) + [pivot] + quicksort(right)
  
  print(quicksort([3,6,8,10,1,5,9,2,4,7]))
  ```
  
  このコードはクイックソートアルゴリズムを実装しています。クイックソートは一種の分割と conquers アルゴリズムで、配列を分割し、それぞれの部分配列を再帰的にソートします。
  
  この実装では、配列の最初の要素をピボットとして使用します。そして、配列を2つの 



# Evaluations

|    Tasks    |Version|Filter|n-shot| Metric |Value |   |Stderr|
|-------------|-------|------|-----:|--------|-----:|---|-----:|
|arc_challenge|Yaml   |none  |     0|acc     |0.5495|±  |0.0145|
|             |       |none  |     0|acc_norm|0.5794|±  |0.0144|
|arc_easy     |Yaml   |none  |     0|acc     |0.8304|±  |0.0077|
|             |       |none  |     0|acc_norm|0.8068|±  |0.0081|
|boolq        |Yaml   |none  |     0|acc     |0.8749|±  |0.0058|
|hellaswag    |Yaml   |none  |     0|acc     |0.6276|±  |0.0048|
|             |       |none  |     0|acc_norm|0.8157|±  |0.0039|
|openbookqa   |Yaml   |none  |     0|acc     |0.3180|±  |0.0208|
|             |       |none  |     0|acc_norm|0.4460|±  |0.0223|
|piqa         |Yaml   |none  |     0|acc     |0.8139|±  |0.0091|
|             |       |none  |     0|acc_norm|0.8237|±  |0.0089|
|winogrande   |Yaml   |none  |     0|acc     |0.7419|±  |0.0123|

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_macadeliccc__laser-polyglot-4x7b)

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |65.79|
|AI2 Reasoning Challenge (25-Shot)|64.16|
|HellaSwag (10-Shot)              |84.98|
|MMLU (5-Shot)                    |63.88|
|TruthfulQA (0-shot)              |55.47|
|Winogrande (5-shot)              |77.82|
|GSM8k (5-shot)                   |48.45|