---
library_name: transformers
license: other
language:
- ja
---

# 🐟 EvoLLM-JP-v1-7B

🤗 [Models](https://huggingface.co/SakanaAI) | 📚 [Paper](TODO) | 📝 [Blog](TODO) | 🐦 [Twitter](https://twitter.com/SakanaAILabs)

**EvoLLM-JP-v1-7B** is a Japanese Math LLM built with Evolutionary Model Merge.

## Model Details

### Model Description

**EvoLLM-JP-v1-7B** is a Japanese Math LLM created by merging the following source models in the Parameter Space (PS) with Evolutionary Model Merge.

- **Developed by:** [Sakana AI](https://sakana.ai/)
- **Model type:** Autoregressive Language Model
- **Language(s):** Japanese
- **License:** [MICROSOFT RESEARCH LICENSE TERMS](./LICENSE)
- **Source models:**
  - [augmxnt/shisa-gamma-7b-v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1)
  - [WizardLM/WizardMath-7B-V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1)
  - [GAIR/Abel-7B-002](https://huggingface.co/GAIR/Abel-7B-002)

### Model Sources

- **Repository:** [SakanaAI/evolutionary-model-merge](https://github.com/SakanaAI/evolutionary-model-merge)
- **Paper:** TODO
- **Blog:** TODO

## Usage

Use the code below to get started with the model. A small convenience wrapper around this snippet is sketched at the end of this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# 1. load model
device = "cuda" if torch.cuda.is_available() else "cpu"
repo_id = "SakanaAI/EvoLLM-JP-v1-7B"
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model.to(device)

# 2. prepare inputs
# Japanese instruction template: "Below is an instruction that describes a task.
# Write a response in Japanese that appropriately completes the request. Let's think step by step."
template = """以下に、あるタスクを説明する指示があります。リクエストを適切に完了するための回答を日本語で記述してください。一歩一歩考えましょう。

### 指示:
{input}

### 応答:"""

# Example question (in Japanese): "Mishka bought 3 pairs of shorts, 3 pairs of long pants, and 3 pairs of shoes.
# The shorts were $16.50 each, the pants $22.50 each, and the shoes $42 a pair. How much did she spend on clothing in total?"
text = "ミシュカは半ズボンを3本、長ズボンを3本、靴を3足買いました。半ズボンは1本$16.50でした。長ズボンは1本$22.50で、靴は1足$42でした。すべての衣類にいくら使いましたか？"
inputs = tokenizer(template.format(input=text), return_tensors="pt")

# 3. generate
output_ids = model.generate(**inputs.to(device))
output_ids = output_ids[:, inputs.input_ids.shape[1] :]
generated_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
print(generated_text)
```

## Evaluation

For details on the evaluation, please refer to Section 4.1 of the paper. If you want to reproduce the results, please see [our GitHub repository](https://github.com/SakanaAI/evolutionary-model-merge).

| Id. | Model | Type | Params | MGSM-JA (acc ↑) |
| :--: | :-- | :-- | --: | --: |
| 1 | [Shisa Gamma 7B v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1) | JA general | 7B | 9.6 |
| 2 | [WizardMath 7B V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1) | EN math | 7B | 18.4 |
| 3 | [Abel 7B 002](https://huggingface.co/GAIR/Abel-7B-002) | EN math | 7B | 30.0 |
| 4 | [Arithmo2 Mistral 7B](https://huggingface.co/upaya07/Arithmo2-Mistral-7B) | EN math | 7B | 24.0 |
| 5 | [EvoLLM-JP-v1-7B](https://huggingface.co/SakanaAI/EvoLLM-JP-v1-7B) | 1 + 2 + 3 | 7B | **52.0** |
| 6 | [EvoLLM-JP-A-v1-7B](https://huggingface.co/SakanaAI/EvoLLM-JP-A-v1-7B) | 1 + 3 + 4 | 7B | **52.4** |
| 7 | [EvoLLM-JP-v1-10B](https://huggingface.co/SakanaAI/EvoLLM-JP-v1-10B) | 1 + 5 | 10B | **55.6** |

## Acknowledgement

We would like to thank the developers of the source models for their contributions and for making their work available.

## Citation

```bibtex
@misc{sakana2024evofactory,
      title = {Evolutionary Optimization of Model Merging Recipes},
      author = {Takuya Akiba and Makoto Shing and Yujin Tang and Qi Sun and David Ha},
      year = {2024},
      eprint = {TODO},
      archivePrefix = {arXiv},
      primaryClass = {cs.CV}
}
```
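As referenced in the Usage section, the snippet there can be wrapped in a small helper for repeated queries. This is a minimal sketch, not part of the official example: it assumes `model`, `tokenizer`, `template`, `device`, and `text` are already defined as in the Usage section, and the `max_new_tokens=512` budget is an illustrative choice rather than a value recommended by the model authors.

```python
def solve(question: str, max_new_tokens: int = 512) -> str:
    """Format `question` with the instruction template, generate, and return only the new text."""
    inputs = tokenizer(template.format(input=question), return_tensors="pt").to(device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens so only the model's answer remains.
    answer_ids = output_ids[:, inputs.input_ids.shape[1]:]
    return tokenizer.batch_decode(answer_ids, skip_special_tokens=True)[0]

# Reuse the example question from the Usage section.
print(solve(text))
```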