	---
	license: apache-2.0
	tags:
	- mergekit
	- merge
	- Etheria
	base_model:
	- brucethemoose/Yi-34B-200K-DARE-megamerge-v8
	- one-man-army/UNA-34Beagles-32K-bf16-v1
	model-index:
	- name: VerB-Etheria-55b
	  results:
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: AI2 Reasoning Challenge (25-Shot)
	      type: ai2_arc
	      config: ARC-Challenge
	      split: test
	      args:
	        num_few_shot: 25
	    metrics:
	    - type: acc_norm
	      value: 65.96
	      name: normalized accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/VerB-Etheria-55b
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: HellaSwag (10-Shot)
	      type: hellaswag
	      split: validation
	      args:
	        num_few_shot: 10
	    metrics:
	    - type: acc_norm
	      value: 81.48
	      name: normalized accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/VerB-Etheria-55b
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: MMLU (5-Shot)
	      type: cais/mmlu
	      config: all
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 73.78
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/VerB-Etheria-55b
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: TruthfulQA (0-shot)
	      type: truthful_qa
	      config: multiple_choice
	      split: validation
	      args:
	        num_few_shot: 0
	    metrics:
	    - type: mc2
	      value: 57.52
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/VerB-Etheria-55b
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: Winogrande (5-shot)
	      type: winogrande
	      config: winogrande_xl
	      split: validation
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 75.45
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/VerB-Etheria-55b
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: GSM8k (5-shot)
	      type: gsm8k
	      config: main
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 28.81
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/VerB-Etheria-55b
	      name: Open LLM Leaderboard
	---
	# VerB-Etheria-55b

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/sawfieuCbKgQHl4iQhDN7.png)

	An attempt at a functional Goliath-style merge that creates an [Etheria] 55b-200k model from two yi-34b-200k models. This is Version B, or VerB:
	a double-model passthrough merge with a 50/50 split between two high-performing models.


	# Roadmap:
	Depending on quality, I might make the other version private, then generate a sacrificial 55b and perform a 55b DARE-TIES or SLERP merge.

	1: If the dual-model merge performs well, I will make a direct inverse of the config and then merge.

	2: If the single model performs well, I will generate a 55b of the most performant model, then perform either a SLERP or a DARE-TIES merge.

	3: If both models perform well, I will complete both 1 and 2, then change the naming scheme to match each of the new models.

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	dtype: bfloat16
	slices:
	- sources:
	  - model: brucethemoose/Yi-34B-200K-DARE-megamerge-v8
	    layer_range: [0, 14]
	- sources:
	  - model: one-man-army/UNA-34Beagles-32K-bf16-v1
	    layer_range: [7, 21]
	- sources:
	  - model: brucethemoose/Yi-34B-200K-DARE-megamerge-v8
	    layer_range: [15, 29]
	- sources:
	  - model: one-man-army/UNA-34Beagles-32K-bf16-v1
	    layer_range: [22, 36]
	- sources:
	  - model: brucethemoose/Yi-34B-200K-DARE-megamerge-v8
	    layer_range: [30, 44]
	- sources:
	  - model: one-man-army/UNA-34Beagles-32K-bf16-v1
	    layer_range: [37, 51]
	- sources:
	  - model: brucethemoose/Yi-34B-200K-DARE-megamerge-v8
	    layer_range: [45, 59]
	merge_method: passthrough
	```
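
As a rough sanity check on the configuration above, the sketch below tallies how many layers the stacked slices contribute in total and per source model. It assumes each `layer_range` is half-open (start inclusive, end exclusive); that reading is an assumption here, not something this README states. The names `SLICES`, `slice_len`, and `summarize` are illustrative only and are not part of mergekit.

```python
# Slice boundaries copied from the YAML configuration above: (model, start, end).
MODEL_A = "brucethemoose/Yi-34B-200K-DARE-megamerge-v8"
MODEL_B = "one-man-army/UNA-34Beagles-32K-bf16-v1"

SLICES = [
    (MODEL_A, 0, 14),
    (MODEL_B, 7, 21),
    (MODEL_A, 15, 29),
    (MODEL_B, 22, 36),
    (MODEL_A, 30, 44),
    (MODEL_B, 37, 51),
    (MODEL_A, 45, 59),
]


def slice_len(start, end):
    # Assumption: layer_range is half-open, i.e. it covers [start, end).
    return end - start


def summarize(slices):
    """Return (total layer count, layer count per source model)."""
    per_model = {}
    for model, start, end in slices:
        per_model[model] = per_model.get(model, 0) + slice_len(start, end)
    return sum(per_model.values()), per_model


total_layers, layers_per_model = summarize(SLICES)
print(total_layers)
for model, count in layers_per_model.items():
    print(f"{model}: {count}")
```

Under that half-open assumption, the seven 14-layer slices stack to 98 layers: 56 from the first model (four slices) and 42 from the second (three slices).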
	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Steelskull__VerB-Etheria-55b)

	| Metric                          |Value|
	|---------------------------------|----:|
	|Avg.                             |63.83|
	|AI2 Reasoning Challenge (25-Shot)|65.96|
	|HellaSwag (10-Shot)              |81.48|
	|MMLU (5-Shot)                    |73.78|
	|TruthfulQA (0-shot)              |57.52|
	|Winogrande (5-shot)              |75.45|
	|GSM8k (5-shot)                   |28.81|
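
The `Avg.` row can be reproduced from the six benchmark scores. This is a minimal sketch that assumes the average is a plain unweighted mean of the values listed in the table; `SCORES` and `leaderboard_average` are illustrative names, not part of any leaderboard tooling.

```python
# Scores copied from the results table above.
SCORES = {
    "AI2 Reasoning Challenge (25-Shot)": 65.96,
    "HellaSwag (10-Shot)": 81.48,
    "MMLU (5-Shot)": 73.78,
    "TruthfulQA (0-shot)": 57.52,
    "Winogrande (5-shot)": 75.45,
    "GSM8k (5-shot)": 28.81,
}


def leaderboard_average(scores):
    # Assumption: the reported average is an unweighted arithmetic mean.
    return sum(scores.values()) / len(scores)


average = leaderboard_average(SCORES)
print(f"{average:.2f}")  # prints 63.83
```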