RWKV EagleX 7B v2 Model
!Important!: This is not meant to be used with the Hugging Face transformers library.
Use the Hugging Face variant instead, found here (v5-EagleX-v2-7B-HF).

The following is the raw representation of the EagleX 7B v2 model, for use with our own set of trainers.
This is not an instruct-tuned model! (soon...)
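Because this repository carries the raw checkpoint rather than the transformers-compatible weights, one way to run it directly is via the community `rwkv` pip package. The sketch below is illustrative only: the package, the checkpoint filename, and the sampling settings are assumptions not stated on this card, so adjust them to whatever you actually download.

```python
# Hedged sketch: loading the raw checkpoint with the community `rwkv` pip
# package instead of transformers. The .pth filename below is hypothetical --
# substitute the file you download from this repository.
from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Load the raw weights; `strategy` controls device and precision placement.
model = RWKV(model="v5-EagleX-v2-7B.pth", strategy="cuda fp16")

# RWKV v5 "World"-series models use the rwkv_vocab_v20230424 tokenizer.
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")

# Illustrative sampling settings, not recommendations from this card.
args = PIPELINE_ARGS(temperature=1.0, top_p=0.7)
print(pipeline.generate("The capital of France is", token_count=64, args=args))
```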
Quickstart with the Hugging Face transformers library
See the Hugging Face version here (v5-EagleX-v2-7B-HF)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("RWKV/v5-EagleX-v2-7B-HF", trust_remote_code=True).to(torch.float32)
tokenizer = AutoTokenizer.from_pretrained("RWKV/v5-EagleX-v2-7B-HF", trust_remote_code=True)
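To round out the quickstart, here is a minimal generation sketch using the standard transformers `generate()` API; the prompt and sampling parameters are illustrative assumptions, not values from this card.

```python
# Minimal generation sketch with the HF variant of this model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "RWKV/v5-EagleX-v2-7B-HF", trust_remote_code=True
).to(torch.float32)
tokenizer = AutoTokenizer.from_pretrained(
    "RWKV/v5-EagleX-v2-7B-HF", trust_remote_code=True
)

# Prompt and sampling settings are illustrative only.
prompt = "In a shocking finding, scientists discovered a herd of dragons"
inputs = tokenizer(prompt, return_tensors="pt")

output = model.generate(
    inputs["input_ids"],
    max_new_tokens=64,
    do_sample=True,
    temperature=1.0,
    top_p=0.7,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```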
Evaluation
The following shows the progression of the model from 1.1T tokens trained to 2.25T tokens trained.
Model | Eagle-7B-HF | EagleX-7B-HF-v1 | EagleX-7B-HF-v2 |
---|---|---|---|
Param Count | 7.52 B | 7.52 B | 7.52 B |
Tokens Trained | 1.1 T | 1.7 T | 2.25 T |
avg_acc | 0.4822 | 0.5391 | 0.5495 |
glue (acc) | 0.5752 | 0.7463 | 0.7439 |
anli (acc) | 0.3594 | 0.4847 | 0.5097 |
mnli (acc) | 0.3802 | 0.7928 | 0.7884 |
mnli_mismatch (acc) | 0.3687 | 0.7985 | 0.784 |
swag (acc) | 0.568 | 0.5814 | 0.5905 |
lambada_standard (acc) | 0.685 | 0.686 | 0.7004 |
lambada_openai (acc) | 0.7425 | 0.7522 | 0.7502 |
mmlu (acc) | 0.3321 | 0.4014 | 0.438 |
winogrande (acc) | 0.674 | 0.7206 | 0.7332 |
wnli (acc) | 0.4225 | 0.4648 | 0.493 |
truthfulqa (acc) | 0.3303 | 0.3268 | 0.3401 |
logiqa (acc) | 0.2458 | 0.2458 | 0.2458 |
logiqa2 (acc) | 0.2494 | 0.2595 | 0.2621 |
sciq (acc) | 0.955 | 0.96 | 0.93 |
piqa (acc) | 0.7704 | 0.7758 | 0.7764 |
arc_easy (acc) | 0.7382 | 0.7555 | 0.7445 |
arc_challenge (acc) | 0.3951 | 0.4087 | 0.4155 |
hellaswag (acc) | 0.5264 | 0.5411 | 0.56 |
openbookqa (acc) | 0.302 | 0.296 | 0.304 |
mathqa (acc) | 0.26 | 0.26 | 0.2593 |
arithmetic (acc) | 0.245 | 0.0634 | 0.1703 |
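These task names appear to correspond to tasks in EleutherAI's lm-evaluation-harness. As an illustration only, the sketch below shows how one might re-run a small subset of them against the HF-format checkpoint; the harness version, task list, batch size, and result keys are assumptions and may not match how the numbers above were produced.

```python
# Hedged sketch: evaluating a subset of benchmarks with EleutherAI's
# lm-evaluation-harness (pip install lm_eval). Task names and result keys
# can vary between harness versions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=RWKV/v5-EagleX-v2-7B-HF,trust_remote_code=True",
    tasks=["lambada_openai", "arc_challenge", "hellaswag"],
    batch_size=8,
)

# Print the per-task metric dictionaries reported by the harness.
for task, metrics in results["results"].items():
    print(task, metrics)
```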
Compared against other top-performing models in the same weight class:
Model | OLMo-7B | falcon-7b | Llama-2-7b-hf | EagleX-7B-HF-v2 | Mistral-7B-v0.1 |
---|---|---|---|---|---|
Param Count | 6.89 B | 6.92 B | 6.74 B | 7.52 B | 7.24 B |
Tokens Trained | 2.5 T | 1.5 T | 2 T | 2.25 T | 2 - 7 T? |
avg_acc | 0.4578 | 0.4775 | 0.5045 | 0.5495 | 0.5676 |
glue (acc) | 0.474 | 0.4578 | 0.4289 | 0.7439 | 0.515 |
anli (acc) | 0.3478 | 0.3541 | 0.3697 | 0.5097 | 0.3803 |
mnli (acc) | 0.3294 | 0.3893 | 0.4269 | 0.7884 | 0.4542 |
mnli_mismatch (acc) | 0.3348 | 0.404 | 0.4395 | 0.784 | 0.4632 |
swag (acc) | 0.5512 | 0.5685 | 0.5658 | 0.5905 | 0.5756 |
lambada_standard (acc) | 0.6396 | 0.6868 | 0.6808 | 0.7004 | 0.6944 |
lambada_openai (acc) | 0.6872 | 0.746 | 0.7353 | 0.7502 | 0.7553 |
mmlu (acc) | 0.2812 | 0.2512 | 0.4077 | 0.438 | 0.5964 |
winogrande (acc) | 0.6725 | 0.6709 | 0.6914 | 0.7332 | 0.7364 |
wnli (acc) | 0.5775 | 0.4789 | 0.4648 | 0.493 | 0.5775 |
truthfulqa (acc) | 0.3015 | 0.2826 | 0.3205 | 0.3401 | 0.3537 |
logiqa (acc) | 0.2335 | 0.2151 | 0.2535 | 0.2458 | 0.2427 |
logiqa2 (acc) | 0.2506 | 0.2252 | 0.2564 | 0.2621 | 0.3022 |
sciq (acc) | 0.927 | 0.944 | 0.939 | 0.93 | 0.959 |
piqa (acc) | 0.7878 | 0.7949 | 0.7807 | 0.7764 | 0.8052 |
arc_easy (acc) | 0.7353 | 0.7479 | 0.7643 | 0.7445 | 0.8081 |
arc_challenge (acc) | 0.3677 | 0.4027 | 0.4309 | 0.4155 | 0.5009 |
hellaswag (acc) | 0.5572 | 0.5772 | 0.5713 | 0.56 | 0.6131 |
openbookqa (acc) | 0.292 | 0.306 | 0.316 | 0.304 | 0.33 |
mathqa (acc) | 0.26 | 0.2884 | 0.2801 | 0.2593 | 0.3554 |
arithmetic (acc) | 0.0069 | 0.2367 | 0.4703 | 0.1703 | 0.9004 |
For full details on this model, see the following blog post: https://blog.rwkv.com/p/eaglex-v2-soaring-past-llama2-7b
Links
- Our wiki
- Full eval data
- Recursal.AI Cloud Platform
- HF Gradio Demo
- Blog article detailing our model launch
Acknowledgement
We are grateful for the help and support from the following key groups:
- Recursal.AI team for financing the GPU resources and managing the training of this foundation model (you can run the Eagle line of RWKV models on their cloud / on-premise platform today)
- EleutherAI for their support, especially in the v5/v6 Eagle/Finch paper
- Linux Foundation AI & Data group for supporting and hosting the RWKV project