---
language:
- ja
- en
- zh
license: apache-2.0
model-index:
- name: laser-polyglot-4x7b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 64.16
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 84.98
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 63.88
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 55.47
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 77.82
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 48.45
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=macadeliccc/laser-polyglot-4x7b
      name: Open LLM Leaderboard
---
# Polyglot-4x7b-24b
![polyglot](polyglot.png)
Polyglot-4x7b is a Mixture of Experts (MoE) approach to a multilingual model. This project is an experiment to see whether each expert can be of a different language, and the answer is yes. The model is a merge of four Mistral-7B-based models, including experts capable of Chinese and Japanese output (a sketch of how such a merge can be configured follows the list):
+ teknium/OpenHermes-2.5-Mistral-7B
+ oshizo/japanese-e5-mistral-7b_slerp
+ cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
+ s3nh/Mistral-7B-Evol-Instruct-Chinese
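
The exact merge recipe is not published in this card. As a rough illustration, a MoE merge like this is typically built with mergekit's `mergekit-moe` tool; the config below is a hypothetical sketch, and the base model and `positive_prompts` routing hints are placeholders rather than the ones actually used.

```yaml
# Hypothetical mergekit-moe config -- the real recipe for laser-polyglot-4x7b
# is not published; base model, gate mode, and prompts below are assumptions.
base_model: mistralai/Mistral-7B-v0.1
gate_mode: hidden            # route tokens by hidden-state affinity to the prompts
dtype: bfloat16
experts:
  - source_model: teknium/OpenHermes-2.5-Mistral-7B
    positive_prompts: ["You are a helpful English assistant"]
  - source_model: oshizo/japanese-e5-mistral-7b_slerp
    positive_prompts: ["日本語で答えてください"]
  - source_model: cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
    positive_prompts: ["Write code to solve the problem"]
  - source_model: s3nh/Mistral-7B-Evol-Instruct-Chinese
    positive_prompts: ["请用中文回答"]
```

With mergekit installed, running `mergekit-moe config.yml ./output` on a config like this produces a 4x7b Mixtral-style checkpoint.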
TODO:
- [ ] polyglot tokenizer
## Other polyglot models
+ [macadeliccc/Polyglot-8x7b-v0.1](https://huggingface.co/macadeliccc/Polyglot-8x7b-v0.1) (adds 3 more languages)
# Code Example
An inference notebook is available on [Colab](https://colab.research.google.com/drive/1tYSb63IKZDsiQ5BIJU8Oc92phxugAmB3?usp=sharing), and a live demo is available on [Spaces](https://huggingface.co/spaces/macadeliccc/polyglot-4x7b-chat?logs=build).
```python
from transformers import AutoModelForCausalLM, AutoTokenizer


def generate_response(prompt):
    """
    Generate a response from the model based on the input prompt.

    Args:
        prompt (str): Prompt for the model.

    Returns:
        str: The generated response from the model.
    """
    # Tokenize the input prompt
    inputs = tokenizer(prompt, return_tensors="pt")

    # Generate output tokens
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )

    # Decode the generated tokens to a string
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response


# Load the model and tokenizer (load_in_4bit requires the bitsandbytes package)
model_id = "macadeliccc/laser-polyglot-4x7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)

# Example prompts in different languages
english_prompt = "Write a quicksort algorithm in python"
chinese_prompt = "用Python写一个快速排序算法"
japanese_prompt = "Pythonでクイックソートアルゴリズムを書いてください"

# Generate and print responses for each language
print("English Response:")
print(generate_response(english_prompt), "\n")

print("Chinese Response:")
print(generate_response(chinese_prompt), "\n")

print("Japanese Response:")
print(generate_response(japanese_prompt), "\n")
```
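
The example above feeds raw prompt strings straight into the model. If the repo's tokenizer defines a chat template (plausible given the OpenHermes-2.5 expert uses ChatML, but not confirmed by this card), prompts can instead be formatted with `transformers`' `apply_chat_template`; a minimal sketch reusing the objects defined above:

```python
# Hypothetical: this only works if the repo's tokenizer_config ships a chat template.
messages = [{"role": "user", "content": "用Python写一个快速排序算法"}]
chat_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(generate_response(chat_prompt))
```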
## Example Output
English Response:
Write a quicksort algorithm in python.
```python
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[0]
        less = [i for i in arr[1:] if i <= pivot]
        greater = [i for i in arr[1:] if i > pivot]
        return quicksort(less) + [pivot] + quicksort(greater)

arr = [5, 2, 9, 1, 5, 7, 4, 8, 6, 3]
print(quicksort(arr))
```
This is a simple implementation of the quicksort algorithm in python. The function `quicksort` takes an array as input and returns a sorted array. The algorithm works by selecting a pivot element from the array and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The process is then repeated recursively on the sub-arrays until the entire array is sorted.
Chinese Response:
用Python写一个快速排序算法
```python
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[0]
        less = [i for i in arr[1:] if i <= pivot]
        greater = [i for i in arr[1:] if i > pivot]
        return quick_sort(less) + [pivot] + quick_sort(greater)

arr = [3, 5, 2, 1, 4, 6, 8, 7]
print(quick_sort(arr))
```
这个程序的时间复杂度为O(nlogn),空间复杂度为O(n)。
*(Translation: this program has O(n log n) time complexity and O(n) space complexity.)*
Japanese Response:
Pythonでクイックソートアルゴリズムを書いてください。
```python
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    left = [x for x in arr[1:] if x < pivot]
    right = [x for x in arr[1:] if x >= pivot]
    return quicksort(left) + [pivot] + quicksort(right)

print(quicksort([3,6,8,10,1,5,9,2,4,7]))
```
このコードはクイックソートアルゴリズムを実装しています。クイックソートは一種の分割と conquers アルゴリズムで、配列を分割し、それぞれの部分配列を再帰的にソートします。
この実装では、配列の最初の要素をピボットとして使用します。そして、配列を2つの
*(Translation: this code implements the quicksort algorithm, a divide-and-conquer algorithm that partitions the array and recursively sorts each sub-array. This implementation uses the first element of the array as the pivot, and then splits the array into two... The sample output ends mid-sentence here.)*
# Evaluations
| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|-------------|-------|------|-----:|--------|-----:|---|-----:|
|arc_challenge|Yaml |none | 0|acc |0.5495|± |0.0145|
| | |none | 0|acc_norm|0.5794|± |0.0144|
|arc_easy |Yaml |none | 0|acc |0.8304|± |0.0077|
| | |none | 0|acc_norm|0.8068|± |0.0081|
|boolq |Yaml |none | 0|acc |0.8749|± |0.0058|
|hellaswag |Yaml |none | 0|acc |0.6276|± |0.0048|
| | |none | 0|acc_norm|0.8157|± |0.0039|
|openbookqa |Yaml |none | 0|acc |0.3180|± |0.0208|
| | |none | 0|acc_norm|0.4460|± |0.0223|
|piqa |Yaml |none | 0|acc |0.8139|± |0.0091|
| | |none | 0|acc_norm|0.8237|± |0.0089|
|winogrande |Yaml |none | 0|acc |0.7419|± |0.0123|
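
The table above is raw output from EleutherAI's lm-evaluation-harness. As a rough guide, numbers in this format could be reproduced with the harness's Python API; the sketch below assumes lm-eval v0.4+ and an illustrative batch size.

```python
# Sketch: re-running the 0-shot suite with EleutherAI's lm-evaluation-harness.
# pip install lm-eval   (v0.4+ exposes the simple_evaluate entry point)
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=macadeliccc/laser-polyglot-4x7b",
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag",
           "openbookqa", "piqa", "winogrande"],
    num_fewshot=0,
    batch_size=8,  # illustrative; tune to available VRAM
)
for task, metrics in results["results"].items():
    print(task, metrics)
```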
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_macadeliccc__laser-polyglot-4x7b).
| Metric |Value|
|---------------------------------|----:|
|Avg. |65.79|
|AI2 Reasoning Challenge (25-Shot)|64.16|
|HellaSwag (10-Shot) |84.98|
|MMLU (5-Shot) |63.88|
|TruthfulQA (0-shot) |55.47|
|Winogrande (5-shot) |77.82|
|GSM8k (5-shot) |48.45|