metadata

library_name: transformers
license: other
language:
  - ja

EvoLLM-v1-JP-7B

EvoLLM-v1-JP-7B is an evolved Japanese Math LLM.

Model Details

Model Description

EvoLLM-v1-JP-7B is a Japanese Math LLM created by merging the following source models in the Parameter Space (PS) using an evolutionary approach.

Developed by: Sakana AI
Model type: Autoregressive Language Model
Language(s): Japanese
License: MICROSOFT RESEARCH LICENSE TERMS
Source models: Shisa Gamma 7B v1, WizardMath 7B V1.1, Abel 7B 002 (models 1, 2, and 3 in the Evaluation table)
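
To illustrate what merging in parameter space means, the sketch below takes a weighted average of same-shaped parameters from several models. It is only a minimal illustration under stated assumptions: the helper name `merge_parameters`, the toy parameter dictionaries, and the fixed weights are hypothetical, and plain Python lists stand in for tensors so the example runs without dependencies. The actual merging recipe used for this model is defined in the repository and paper, not here.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def merge_parameters(models: List[Params], weights: List[float]) -> Params:
    """Weighted average of matching parameters (hypothetical helper)."""
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    merged: Params = {}
    for name in models[0]:
        size = len(models[0][name])
        # Average each element across models, normalised by the weight sum.
        merged[name] = [
            sum(w * m[name][i] for m, w in zip(models, weights)) / total
            for i in range(size)
        ]
    return merged


# Toy values for illustration only, not real model weights.
model_a = {"layer.weight": [1.0, 2.0]}
model_b = {"layer.weight": [3.0, 6.0]}
merged = merge_parameters([model_a, model_b], weights=[0.5, 0.5])
print(merged)
```

In an evolutionary search, mixing weights of this kind would be candidate values that are scored on a task and refined over generations, rather than fixed by hand as they are in this toy example.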

Model Sources

Repository: SakanaAI/evolving-merged-models
Paper: TODO
Blog: TODO

Usage

Use the code below to get started with the model.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


# 1. load model
device = "cuda" if torch.cuda.is_available() else "cpu"
repo_id = "SakanaAI/EvoLLM-v1-JP-7B"
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model.to(device)

# 2. prepare inputs
template = """以下に、あるタスクを説明する指示があります。リクエストを適切に完了するための回答を日本語で記述してください。一歩一歩考えましょう。

### 指示:
{input}

### 応答:"""
text = "ミシュカは半ズボンを3本、長ズボンを3本、靴を3足買いました。半ズボンは1本$16.50でした。長ズボンは1本$22.50で、靴は1足$42でした。すべての衣類にいくら使いましたか？"
inputs = tokenizer(template.format(input=text), return_tensors="pt")

# 3. generate
output_ids = model.generate(**inputs.to(device))
output_ids = output_ids[:, inputs.input_ids.shape[1] :]
generated_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
print(generated_text)
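
The sample question above (three pairs of shorts at $16.50, three pairs of pants at $22.50, and three pairs of shoes at $42) has a closed-form answer, which is handy for sanity-checking the generated text. The short check below computes it with `Decimal` to avoid floating-point rounding; the variable names are illustrative and not part of the model's API.

```python
from decimal import Decimal

# Unit prices taken from the sample question in the usage example.
prices = {
    "shorts": Decimal("16.50"),
    "pants": Decimal("22.50"),
    "shoes": Decimal("42"),
}
quantity = 3  # three of each item

# 3 * 16.50 + 3 * 22.50 + 3 * 42
total = sum(price * quantity for price in prices.values())
print(total)
```

The total comes to $243, so a correct response from the model should end with that amount.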

Evaluation

We present results on the MGSM-JA test set, comparing the performance of our evolved LLMs with that of the source LLMs. To reproduce the results, please use our GitHub repository.

Id	Model	Type	Params	MGSM-JA (acc ↑)
1	Shisa Gamma 7B v1	JA general	7B	9.6
2	WizardMath 7B V1.1	EN math	7B	18.4
3	Abel 7B 002	EN math	7B	30.0
4	Arithmo2 Mistral 7B	EN math	7B	24.0
5	(Ours) EvoLLM-v1-JP-7B	1+2+3	7B	52.0
6	(Ours) EvoLLM-v1-JP-7B-A	1+3+4	7B	52.4
7	(Ours) EvoLLM-v1-JP-10B	1+5	10B	55.6
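
The accuracy column can be read as the percentage of test questions whose final numeric answer matches the reference. The sketch below shows one simple way to score generations in that style. It is an assumption-laden illustration, not the evaluation code from the repository: the helpers `extract_answer` and `accuracy`, the "last number in the text" rule, and the toy predictions are all hypothetical, and the official evaluation may apply additional checks.

```python
import re
from typing import List, Optional

# Matches integers and decimals, optionally negative.
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def extract_answer(text: str) -> Optional[float]:
    """Return the last number in the text, or None if there is none."""
    matches = NUMBER.findall(text.replace(",", ""))
    return float(matches[-1]) if matches else None


def accuracy(predictions: List[str], references: List[float]) -> float:
    """Percentage of predictions whose extracted answer equals the reference."""
    if not references:
        raise ValueError("references must not be empty")
    correct = 0
    for pred, ref in zip(predictions, references):
        answer = extract_answer(pred)
        if answer is not None and abs(answer - ref) < 1e-6:
            correct += 1
    return 100.0 * correct / len(references)


# Toy generations for illustration only, not real model outputs.
preds = ["合計は 49.5 + 67.5 + 126 = 243 ドルです。", "答えは 12 です。"]
refs = [243.0, 10.0]
print(accuracy(preds, refs))
```

Taking the last number is a common heuristic for step-by-step answers, since the final result usually appears at the end of the reasoning, but it will miscount responses that mention another number after the answer.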

Citation

@misc{sakana2024evofactory,
      title         = {Evolutionary Optimization of Model Merging Recipes},
      author        = {Takuya Akiba and Makoto Shing and Yujin Tang and Qi Sun and David Ha},
      year          = {2024},
      eprint        = {TODO},
      archivePrefix = {arXiv},
      primaryClass  = {cs.CV}
}