---
library_name: transformers
license: other
language:
- ja
---

# EvoLLM-v1-JP-7B

EvoLLM-v1-JP-7B is an evolved Japanese Math LLM.

## Model Details

### Model Description

EvoLLM-v1-JP-7B is a Japanese Math LLM created by merging the following source models in the Parameter Space (PS) with an evolutionary approach.

- **Developed by:** [Sakana AI](https://sakana.ai/)
- **Model type:** Autoregressive Language Model
- **Language(s):** Japanese
- **License:** [MICROSOFT RESEARCH LICENSE TERMS](./LICENSE)
- **Source models:**
  - [augmxnt/shisa-gamma-7b-v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1)
  - [WizardLM/WizardMath-7B-V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1)
  - [GAIR/Abel-7B-002](https://huggingface.co/GAIR/Abel-7B-002)

### Model Sources

- **Repository:** [SakanaAI/evolving-merged-models](https://github.com/SakanaAI/evolving-merged-models)
- **Paper:** TODO
- **Blog:** TODO

## Usage

Use the code below to get started with the model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# 1. load model
device = "cuda" if torch.cuda.is_available() else "cpu"
repo_id = "SakanaAI/EvoLLM-v1-JP-7B"
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model.to(device)

# 2. prepare inputs
# Alpaca-style Japanese prompt. The instruction roughly translates to:
# "Below is an instruction that describes a task. Write a response in Japanese
#  that appropriately completes the request. Let's think step by step."
template = """以下に、あるタスクを説明する指示があります。リクエストを適切に完了するための回答を日本語で記述してください。一歩一歩考えましょう。

### 指示:
{input}

### 応答:"""
# Example question (roughly): "Mishka bought 3 pairs of shorts, 3 pairs of pants, and
# 3 pairs of shoes. Shorts were $16.50 each, pants $22.50 each, and shoes $42 a pair.
# How much did she spend on all the clothing?"
text = "ミシュカは半ズボンを3本、長ズボンを3本、靴を3足買いました。半ズボンは1本$16.50でした。長ズボンは1本$22.50で、靴は1足$42でした。すべての衣類にいくら使いましたか?"
inputs = tokenizer(template.format(input=text), return_tensors="pt")

# 3. generate
# max_new_tokens caps the length of the generated answer so the step-by-step solution is not cut off
output_ids = model.generate(**inputs.to(device), max_new_tokens=1024)
# keep only the newly generated tokens (drop the prompt)
output_ids = output_ids[:, inputs.input_ids.shape[1] :]
generated_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
print(generated_text)
```

## Evaluation

We report results on the [MGSM-JA](https://huggingface.co/datasets/juletxara/mgsm) test set, comparing the performance of our evolved LLMs with that of the source LLMs.
To reproduce the results, please use [our GitHub repository](https://github.com/SakanaAI/evolving-merged-models); an illustrative scoring sketch is also included at the end of this card.

| Id. | Model | Type | Params | MGSM-JA (acc ↑) |
| :--: | :-- | :-- | --: | --: |
| 1 | [Shisa Gamma 7B v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1) | JA general | 7B | 9.6 |
| 2 | [WizardMath 7B V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1) | EN math | 7B | 18.4 |
| 3 | [Abel 7B 002](https://huggingface.co/GAIR/Abel-7B-002) | EN math | 7B | 30.0 |
| 4 | [Arithmo2 Mistral 7B](https://huggingface.co/upaya07/Arithmo2-Mistral-7B) | EN math | 7B | 24.0 |
| 5 | [(Ours) EvoLLM-v1-JP-7B](https://huggingface.co/SakanaAI/EvoLLM-v1-JP-7B) | 1+2+3 | 7B | **52.0** |
| 6 | [(Ours) EvoLLM-v1-JP-7B-A](https://huggingface.co/SakanaAI/EvoLLM-v1-JP-7B-A) | 1+3+4 | 7B | **52.4** |
| 7 | [(Ours) EvoLLM-v1-JP-10B](https://huggingface.co/SakanaAI/EvoLLM-v1-JP-10B) | 1+5 | 10B | **55.6** |

## Citation

```bibtex
@misc{sakana2024evofactory,
      title = {Evolutionary Optimization of Model Merging Recipes},
      author = {Takuya Akiba and Makoto Shing and Yujin Tang and Qi Sun and David Ha},
      year = {2024},
      eprint = {TODO},
      archivePrefix = {arXiv},
      primaryClass = {cs.NE}
}
```
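
For illustration, below is a minimal sketch of the kind of zero-shot scoring loop implied by the Evaluation section. It is not the official evaluation harness from our GitHub repository: the dataset config name (`"ja"`), the field names (`question`, `answer_number`), greedy decoding, and the last-number answer-extraction heuristic are assumptions, and the official setup may differ.

```python
import re

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# NOTE: illustrative sketch only, not the official evaluation script.
# Dataset field names and the answer-extraction rule are assumptions.
repo_id = "SakanaAI/EvoLLM-v1-JP-7B"
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto").to(device)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# Same prompt template as in the Usage section above
template = """以下に、あるタスクを説明する指示があります。リクエストを適切に完了するための回答を日本語で記述してください。一歩一歩考えましょう。

### 指示:
{input}

### 応答:"""

# Japanese test split of MGSM
dataset = load_dataset("juletxara/mgsm", "ja", split="test")

correct = 0
for example in dataset:
    prompt = template.format(input=example["question"])
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
    answer_text = tokenizer.decode(
        output_ids[0, inputs.input_ids.shape[1]:], skip_special_tokens=True
    )
    # take the last number in the response as the predicted answer (a common heuristic)
    numbers = re.findall(r"-?\d+(?:\.\d+)?", answer_text.replace(",", ""))
    if numbers and float(numbers[-1]) == float(example["answer_number"]):
        correct += 1

print(f"MGSM-JA accuracy: {correct / len(dataset):.3f}")
```

The reported numbers in the table above come from our own evaluation pipeline; small differences in prompting, decoding, or answer extraction can shift accuracy by a few points, so treat this sketch as a starting point rather than a reference implementation.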