---
base_model:
- aisingapore/gemma2-9b-cpt-sea-lionv3-instruct
- AXCXEPT/EZO-Humanities-9B-gemma-2-it
- princeton-nlp/gemma-2-9b-it-SimPO
- silma-ai/SILMA-9B-Instruct-v1.0
- VAGOsolutions/SauerkrautLM-gemma-2-9b-it
library_name: transformers
pipeline_tag: text-generation
tags:
- mergekit
- merge
license: gemma
---
|
# Gigantes-v2-gemma2-9b-it |
|
|
|
This repo contains a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
|
The Japanese, Singaporean, German, and Arabic instruct models were included in the hope that they would improve reasoning and increase the complexity of English text generation. The Arabic model was assigned a very low weight because it tended to produce terse English responses, and the German model was assigned a lower weight because its additional fine-tuning affects only a minority of the layers.
|
|
|
## Merge Details |
|
### Merge Method |
|
|
|
This model was merged with the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, using [princeton-nlp/gemma-2-9b-it-SimPO](https://huggingface.co/princeton-nlp/gemma-2-9b-it-SimPO) as the base.
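
Task arithmetic adds a weighted "task vector" (the difference between each fine-tuned model and the base) for every contributing model back onto the base weights. The sketch below illustrates the per-tensor arithmetic only; it is not mergekit's implementation, and the handling of `normalize: true` (rescaling the combined delta by the sum of the weights) is an assumption about mergekit's behaviour.

```python
# Illustrative per-tensor sketch of task arithmetic, not mergekit's actual code:
#   merged = base + sum_i(w_i * (model_i - base))
# with the combined delta optionally divided by sum(w_i) when normalization is on.
import torch

def task_arithmetic(base: torch.Tensor,
                    tuned: list[torch.Tensor],
                    weights: list[float],
                    normalize: bool = True) -> torch.Tensor:
    delta = torch.zeros_like(base)
    for tensor, w in zip(tuned, weights):
        delta += w * (tensor - base)  # weighted task vector for one model
    if normalize and sum(weights) != 0:
        delta /= sum(weights)         # rescale by the total weight (assumed normalize behaviour)
    return base + delta
```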
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* [aisingapore/gemma2-9b-cpt-sea-lionv3-instruct](https://huggingface.co/aisingapore/gemma2-9b-cpt-sea-lionv3-instruct) |
|
* [AXCXEPT/EZO-Humanities-9B-gemma-2-it](https://huggingface.co/AXCXEPT/EZO-Humanities-9B-gemma-2-it) |
|
* [silma-ai/SILMA-9B-Instruct-v1.0](https://huggingface.co/silma-ai/SILMA-9B-Instruct-v1.0) |
|
* [VAGOsolutions/SauerkrautLM-gemma-2-9b-it](https://huggingface.co/VAGOsolutions/SauerkrautLM-gemma-2-9b-it) |
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml
base_model: princeton-nlp/gemma-2-9b-it-SimPO
dtype: bfloat16
merge_method: task_arithmetic
parameters:
  normalize: true
models:
  - model: princeton-nlp/gemma-2-9b-it-SimPO
  - model: AXCXEPT/EZO-Humanities-9B-gemma-2-it
    parameters:
      weight: 0.3
  - model: VAGOsolutions/SauerkrautLM-gemma-2-9b-it
    parameters:
      weight: 0.1
  - model: aisingapore/gemma2-9b-cpt-sea-lionv3-instruct
    parameters:
      weight: 0.2
  - model: silma-ai/SILMA-9B-Instruct-v1.0
    parameters:
      weight: 0.001
```
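
The merged checkpoint loads like any other Gemma 2 instruct model via 🤗 Transformers (`library_name: transformers`, `pipeline_tag: text-generation`). Below is a minimal usage sketch assuming bfloat16 inference; the repository id is a placeholder for wherever this merge is hosted.

```python
# Minimal usage sketch; the model id below is a placeholder, not the actual repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/Gigantes-v2-gemma2-9b-it"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used in the merge config
    device_map="auto",
)

# Gemma 2 instruct models expect their chat template.
messages = [{"role": "user", "content": "Summarize the idea behind task arithmetic merging."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```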
|
|