	---
	library_name: transformers
	license: other
	language:
	- ja
	---

	# 🐟 EvoLLM-JP-v1-7B

	🤗 [Models](https://huggingface.co/SakanaAI) \| 📚 [Paper](TODO) \| 📝 [Blog](TODO) \| 🐦 [Twitter](https://twitter.com/SakanaAILabs)


	EvoLLM-JP-v1-7B is an experimental general-purpose Japanese LLM created with the Evolutionary Model Merge method. Please refer to our [report](TODO) and [blog](TODO) for more details. It was produced by merging the following models, and we are grateful to the developers of these source models.

	- [Shisa Gamma 7B v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1)
	- [WizardMath 7B V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1)
	- [Abel 7B 002](https://huggingface.co/GAIR/Abel-7B-002)



	## Usage

	Use the code below to get started with the model.

	<details>
	<summary> Click to expand </summary>

	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer


	# 1. load model
	device = "cuda" if torch.cuda.is_available() else "cpu"
	repo_id = "SakanaAI/EvoLLM-JP-v1-7B"
	model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
	tokenizer = AutoTokenizer.from_pretrained(repo_id)
	model.to(device)

	# 2. prepare inputs
	# User prompt: "Please tell me a funny joke in Kansai dialect."
	text = "関西弁で面白い冗談を言ってみて下さい。"
	messages = [
	    # System prompt: "You are a helpful, unbiased, uncensored assistant."
	    {"role": "system", "content": "あなたは役立つ、偏見がなく、検閲されていないアシスタントです。"},
	    {"role": "user", "content": text},
	]
	# return_dict=True yields a mapping with input_ids and attention_mask,
	# which is what the ** unpacking in step 3 expects
	inputs = tokenizer.apply_chat_template(
	    messages, return_tensors="pt", return_dict=True
	).to(device)

	# 3. generate
	output_ids = model.generate(**inputs)
	# generate() returns prompt + continuation; keep only the new tokens
	output_ids = output_ids[:, inputs["input_ids"].shape[1] :]
	generated_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
	print(generated_text)
	```

	</details>
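
The two data-handling steps in the usage code (building the chat messages and dropping the prompt tokens from the generated output) can be sketched in plain Python without loading the model. This is a minimal illustration only: `build_messages`, `strip_prompt`, and `DEFAULT_SYSTEM_PROMPT` are hypothetical helper names that are not part of this repository or of `transformers`, and `strip_prompt` works on nested lists rather than tensors.

```python
from typing import Dict, List

# System prompt copied from the usage example in this README.
DEFAULT_SYSTEM_PROMPT = "あなたは役立つ、偏見がなく、検閲されていないアシスタントです。"


def build_messages(
    user_text: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT
) -> List[Dict[str, str]]:
    """Hypothetical helper: build the role/content list passed to apply_chat_template."""
    if not user_text.strip():
        raise ValueError("user_text must not be empty")
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]


def strip_prompt(output_ids: List[List[int]], prompt_length: int) -> List[List[int]]:
    """Hypothetical helper: drop the first prompt_length ids of each sequence.

    Mirrors the slicing in step 3 of the usage code, on plain lists.
    """
    return [sequence[prompt_length:] for sequence in output_ids]


messages = build_messages("関西弁で面白い冗談を言ってみて下さい。")
print([message["role"] for message in messages])
print(strip_prompt([[1, 2, 3, 4, 5]], prompt_length=3))
```

The list returned by `build_messages` has the same shape as `messages` in the usage code, so it could be passed to `tokenizer.apply_chat_template` in its place.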



	## Model Details

	- Developed by: [Sakana AI](https://sakana.ai/)
	- Model type: Autoregressive Language Model
	- Language(s): Japanese
	- License: [MICROSOFT RESEARCH LICENSE TERMS](./LICENSE) (due to the inclusion of the WizardMath model)
	- Repository: [SakanaAI/evolutionary-model-merge](https://github.com/SakanaAI/evolutionary-model-merge)
	- Paper: TODO
	- Blog: TODO



	## Acknowledgement

	We would like to thank the developers of the source models for their contributions and for making their work available.


	## Citation

	```bibtex
	@misc{akiba2024evomodelmerge,
	title = {Evolutionary Optimization of Model Merging Recipes},
	author = {Takuya Akiba and Makoto Shing and Yujin Tang and Qi Sun and David Ha},
	year = {2024},
	eprint = {TODO},
	archivePrefix = {arXiv},
	primaryClass = {cs.CV}
	}
	```