
	---
	library_name: transformers
	license: other
	language:
	- ja
	---

	# EvoLLM-JP-v1-7B

	<!-- Provide a quick summary of what the model is/does. -->
	EvoLLM-JP-v1-7B is an evolved Japanese math LLM.

	## Model Details

	### Model Description

	<!-- Provide a longer summary of what this model is. -->
	EvoLLM-JP-v1-7B is a Japanese math LLM created by merging the source models listed below in the Parameter Space (PS) using an evolutionary approach.

	- Developed by: [Sakana AI](https://sakana.ai/)
	- Model type: Autoregressive Language Model
	- Language(s): Japanese
	- License: [MICROSOFT RESEARCH LICENSE TERMS](./LICENSE)
	- Source models:
	  - [augmxnt/shisa-gamma-7b-v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1)
	  - [WizardLM/WizardMath-7B-V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1)
	  - [GAIR/Abel-7B-002](https://huggingface.co/GAIR/Abel-7B-002)
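
To give a rough intuition for what merging in parameter space means, here is a minimal sketch that combines three toy "models" by weighted averaging of their parameters. This is an illustration only: the card does not state which merge operator was used, so plain weighted averaging is an assumption, and `merge_parameters`, the toy parameter values, and the weights are all hypothetical. In an evolutionary setting, the weights stand in for whatever merge coefficients a search procedure would tune.

```python
from typing import Dict, List, Sequence

# A toy "state dict": parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def merge_parameters(state_dicts: Sequence[StateDict],
                     weights: Sequence[float]) -> StateDict:
    """Weighted average of same-shaped parameters (hypothetical helper)."""
    if len(state_dicts) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Normalize so the merge coefficients sum to 1.
    coeffs = [w / total for w in weights]

    merged: StateDict = {}
    for name in state_dicts[0]:
        tensors = [sd[name] for sd in state_dicts]
        merged[name] = [
            sum(c * t[i] for c, t in zip(coeffs, tensors))
            for i in range(len(tensors[0]))
        ]
    return merged


# Three toy models sharing one parameter; the values are made up.
model_a = {"layer.weight": [1.0, 2.0]}
model_b = {"layer.weight": [3.0, 4.0]}
model_c = {"layer.weight": [5.0, 6.0]}

merged = merge_parameters([model_a, model_b, model_c], weights=[1.0, 1.0, 2.0])
print(merged)
```

With real checkpoints the same idea would apply tensor by tensor, but only when the models share an architecture and parameter names.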

	### Model Sources

	<!-- Provide the basic links for the model. -->

	- Repository: [SakanaAI/evolving-merged-models](https://github.com/SakanaAI/evolving-merged-models)
	- Paper: TODO
	- Blog: TODO


	## Usage

	Use the code below to get started with the model.


	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer


	# 1. load model
	device = "cuda" if torch.cuda.is_available() else "cpu"  # device strings must be lowercase
	repo_id = "SakanaAI/EvoLLM-JP-v1-7B"
	model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
	tokenizer = AutoTokenizer.from_pretrained(repo_id)
	model.to(device)

	# 2. prepare inputs
	template = """以下に、あるタスクを説明する指示があります。リクエストを適切に完了するための回答を日本語で記述してください。一歩一歩考えましょう。

	### 指示:
	{input}

	### 応答:"""
	text = "ミシュカは半ズボンを3本、長ズボンを3本、靴を3足買いました。半ズボンは1本$16.50でした。長ズボンは1本$22.50で、靴は1足$42でした。すべての衣類にいくら使いましたか？"
	inputs = tokenizer(template.format(input=text), return_tensors="pt")

	# 3. generate
	output_ids = model.generate(**inputs.to(device))
	output_ids = output_ids[:, inputs.input_ids.shape[1] :]
	generated_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
	print(generated_text)
	```
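
For reference, the example prompt translates roughly to: "Mishka bought 3 pairs of shorts, 3 pairs of long pants, and 3 pairs of shoes. The shorts cost $16.50 each, the pants $22.50 each, and the shoes $42 each. How much did she spend on all the clothing?" The short sketch below computes the expected answer so you can sanity-check the model's output. It is not part of the model's pipeline, and the variable names are illustrative.

```python
from decimal import Decimal

# Unit prices taken from the example prompt above.
unit_prices = {
    "shorts": Decimal("16.50"),
    "long pants": Decimal("22.50"),
    "shoes": Decimal("42"),
}
quantity = 3  # three of each item

# Decimal avoids floating-point rounding on currency values.
total = sum(price * quantity for price in unit_prices.values())
print(total)  # 243.00
```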

	## Evaluation

	The figure below compares the performance of our evolved LLMs with that of the source LLMs. To reproduce the results, please use [our GitHub repository](https://github.com/SakanaAI/evolving-merged-models).

	![eval-results](./evollm-math-results.png)


	## Citation

	```bibtex
	@misc{sakana2024evofactory,
	  title = {Evolutionary Optimization of Model Merging Recipes},
	  author = {Takuya Akiba and Makoto Shing and Yujin Tang and Qi Sun and David Ha},
	  year = {2024},
	  eprint = {TODO},
	  archivePrefix = {arXiv},
	  primaryClass = {cs.CV}
	}
	```