---
language:
- ja
- en
- zh
license: apache-2.0
model-index:
- name: laser-polyglot-4x7b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 64.16
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 84.98
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 63.88
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 55.47
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 77.82
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 48.45
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
---

# Polyglot-4x7b-24b

![polyglot](polyglot.png)

Polyglot-4x7b is a multilingual model built with a Mixture of Experts (MoE) approach.

This project is an experiment to test whether each expert can handle a different language. The answer is yes.

The model is a merge of the following models, which between them are capable of Chinese and Japanese output:

+ teknium/OpenHermes-2.5-Mistral-7B
+ oshizo/japanese-e5-mistral-7b_slerp
+ cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
+ s3nh/Mistral-7B-Evol-Instruct-Chinese
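
The `4x7b-24b` in the title can be sanity-checked with a little arithmetic. The sketch below assumes a Mixtral-style layout in which the four experts replace only the feed-forward block of each layer while attention and embeddings are shared, and it assumes the published Mistral-7B dimensions. Neither assumption is stated in this card, and `parameter_count` is a hypothetical helper written for illustration.

```python
# Assumed Mistral-7B dimensions (not stated in this model card).
HIDDEN = 4096
INTERMEDIATE = 14336
LAYERS = 32
VOCAB = 32000
KV_DIM = 1024  # 8 key/value heads x 128 head dim (grouped-query attention)


def parameter_count(num_experts: int) -> int:
    """Count parameters for a dense (1 expert) or MoE (>1 expert) stack."""
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM    # q, o, k, v projections
    mlp = 3 * HIDDEN * INTERMEDIATE                           # gate, up, down projections
    router = HIDDEN * num_experts if num_experts > 1 else 0   # per-layer gating matrix
    norms = 2 * HIDDEN                                        # two RMSNorm weights per layer
    per_layer = attention + num_experts * mlp + router + norms
    embeddings = 2 * VOCAB * HIDDEN                           # input embeddings + output head
    final_norm = HIDDEN
    return LAYERS * per_layer + embeddings + final_norm


dense = parameter_count(1)
moe = parameter_count(4)
print(f"dense: {dense / 1e9:.2f}B, 4 experts: {moe / 1e9:.2f}B")
# prints "dense: 7.24B, 4 experts: 24.15B"
```

Under these assumptions the total comes to roughly 24B parameters rather than 4 x 7B = 28B, because only the feed-forward weights are replicated per expert.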

TODO:

- [ ] polyglot tokenizer

## Other polyglot models

+ [macadeliccc/Polyglot-8x7b-v0.1](https://huggingface.co/macadeliccc/Polyglot-8x7b-v0.1) (adds 3 more languages)

# Code Example

An inference notebook is available on [Colab](https://colab.research.google.com/drive/1tYSb63IKZDsiQ5BIJU8Oc92phxugAmB3?usp=sharing).

A live demo is available on [Spaces](https://huggingface.co/spaces/macadeliccc/polyglot-4x7b-chat?logs=build).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer


def generate_response(prompt):
    """
    Generate a response from the model based on the input prompt.

    Args:
        prompt (str): Prompt for the model.

    Returns:
        str: The generated response from the model.
    """
    # Tokenize the input prompt and move the tensors to the model's device
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Generate output tokens
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )

    # Decode the full sequence (prompt plus generated tokens) to a string
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    return response


# Load the model and tokenizer (4-bit loading requires the bitsandbytes package)
model_id = "macadeliccc/laser-polyglot-4x7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)

# Example prompts in different languages
english_prompt = "Write a quicksort algorithm in python"
chinese_prompt = "用Python写一个快速排序算法"
japanese_prompt = "Pythonでクイックソートアルゴリズムを書いてください"

# Generate and print responses for each language
print("English Response:")
print(generate_response(english_prompt), "\n")

print("Chinese Response:")
print(generate_response(chinese_prompt), "\n")

print("Japanese Response:")
print(generate_response(japanese_prompt), "\n")
```
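
For decoder-only models, `model.generate` returns the prompt tokens followed by the continuation, so decoding `outputs[0]` in full repeats the prompt at the start of the response. A minimal sketch of trimming it, using toy token IDs in place of real model output; `strip_prompt` is a hypothetical helper, not part of `transformers`.

```python
def strip_prompt(output_ids, prompt_length):
    """Return only the tokens that were generated after the prompt."""
    return output_ids[prompt_length:]


# Toy token IDs standing in for `inputs["input_ids"][0]` and `outputs[0]`.
prompt_ids = [101, 7, 42]
output_ids = prompt_ids + [13, 99, 2]

new_ids = strip_prompt(output_ids, len(prompt_ids))
print(new_ids)  # [13, 99, 2]
```

Inside `generate_response`, the same slice would read `tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)`.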

## Example Output

English Response:

Write a quicksort algorithm in python.

```python
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[0]
        less = [i for i in arr[1:] if i <= pivot]
        greater = [i for i in arr[1:] if i > pivot]
        return quicksort(less) + [pivot] + quicksort(greater)

arr = [5, 2, 9, 1, 5, 7, 4, 8, 6, 3]
print(quicksort(arr))
```

This is a simple implementation of the quicksort algorithm in python. The function `quicksort` takes an array as input and returns a sorted array. The algorithm works by selecting a pivot element from the array and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The process is then repeated recursively on the sub-arrays until the entire array is sorted.

Chinese Response:

用Python写一个快速排序算法

```python
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[0]
        less = [i for i in arr[1:] if i <= pivot]
        greater = [i for i in arr[1:] if i > pivot]
        return quick_sort(less) + [pivot] + quick_sort(greater)

arr = [3, 5, 2, 1, 4, 6, 8, 7]
print(quick_sort(arr))
```

这个程序的时间复杂度为O(nlogn),空间复杂度为O(n)。

Japanese Response:

Pythonでクイックソートアルゴリズムを書いてください。

```python
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    left = [x for x in arr[1:] if x < pivot]
    right = [x for x in arr[1:] if x >= pivot]
    return quicksort(left) + [pivot] + quicksort(right)

print(quicksort([3,6,8,10,1,5,9,2,4,7]))
```

このコードはクイックソートアルゴリズムを実装しています。クイックソートは一種の分割と conquers アルゴリズムで、配列を分割し、それぞれの部分配列を再帰的にソートします。

この実装では、配列の最初の要素をピボットとして使用します。そして、配列を2つの

# Evaluations

| Tasks       |Version|Filter|n-shot| Metric |Value |   |Stderr|
|-------------|-------|------|-----:|--------|-----:|---|-----:|
|arc_challenge|Yaml   |none  |     0|acc     |0.5495|±  |0.0145|
|             |       |none  |     0|acc_norm|0.5794|±  |0.0144|
|arc_easy     |Yaml   |none  |     0|acc     |0.8304|±  |0.0077|
|             |       |none  |     0|acc_norm|0.8068|±  |0.0081|
|boolq        |Yaml   |none  |     0|acc     |0.8749|±  |0.0058|
|hellaswag    |Yaml   |none  |     0|acc     |0.6276|±  |0.0048|
|             |       |none  |     0|acc_norm|0.8157|±  |0.0039|
|openbookqa   |Yaml   |none  |     0|acc     |0.3180|±  |0.0208|
|             |       |none  |     0|acc_norm|0.4460|±  |0.0223|
|piqa         |Yaml   |none  |     0|acc     |0.8139|±  |0.0091|
|             |       |none  |     0|acc_norm|0.8237|±  |0.0089|
|winogrande   |Yaml   |none  |     0|acc     |0.7419|±  |0.0123|
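
The `Stderr` column can be turned into an approximate confidence interval. A minimal sketch, assuming a normal approximation with z = 1.96 for a 95% interval, which the card itself does not state; `confidence_interval` is a hypothetical helper, and the numbers are taken from the `arc_challenge` `acc` row above.

```python
def confidence_interval(value, stderr, z=1.96):
    """Return (low, high) bounds for value +/- z * stderr."""
    margin = z * stderr
    return value - margin, value + margin


# arc_challenge, acc: 0.5495 +/- 0.0145
low, high = confidence_interval(0.5495, 0.0145)
print(f"{low:.4f} to {high:.4f}")  # 0.5211 to 0.5779
```

Read this way, two scores whose intervals overlap heavily should not be treated as meaningfully different.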

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_macadeliccc__laser-polyglot-4x7b)

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |65.79|
|AI2 Reasoning Challenge (25-Shot)|64.16|
|HellaSwag (10-Shot)              |84.98|
|MMLU (5-Shot)                    |63.88|
|TruthfulQA (0-shot)              |55.47|
|Winogrande (5-shot)              |77.82|
|GSM8k (5-shot)                   |48.45|
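
The `Avg.` row is consistent with an unweighted mean of the six benchmark scores. A short check, assuming simple averaging (the variable names are illustrative):

```python
# Scores copied from the leaderboard table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 64.16,
    "HellaSwag (10-Shot)": 84.98,
    "MMLU (5-Shot)": 63.88,
    "TruthfulQA (0-shot)": 55.47,
    "Winogrande (5-shot)": 77.82,
    "GSM8k (5-shot)": 48.45,
}

average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 65.79
```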