---
license: other
license_name: qwen
license_link: LICENSE
---
# 🦙 Qwen-72B-Llama

This is the 🦙 llamafied version of [Qwen/Qwen-72B](https://huggingface.co/Qwen/Qwen-72B).
## 🛠️ Reproduction

I used the [llamafy_qwen.py](https://github.com/hiyouga/LLaMA-Factory/blob/main/tests/llamafy_qwen.py) script from LLaMA-Factory to convert the weights.
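For reference, a sketch of how the conversion script might be invoked. The argument names and the shard size are assumptions based on the script's command-line interface, not a verified recipe; check the script itself before running.

```shell
# Hypothetical invocation -- paths and argument names are assumptions.
# input_dir:  local checkout of the original Qwen/Qwen-72B checkpoint
# output_dir: destination for the LLaMA-format weights
python tests/llamafy_qwen.py \
    --input_dir ./Qwen-72B \
    --output_dir ./Qwen-72B-Llama \
    --shard_size 10GB
```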
## 🔠 Tokenizer

After converting the weights, I took the tokenizer from [KnutJaegersberg/Qwen-14B-Llamafied](https://huggingface.co/KnutJaegersberg/Qwen-14B-Llamafied) and uploaded it to this repository.
## 📊 Eval Scores Compared to Original Model

Here is a comparison of evaluation scores based on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
| Metric              | Qwen-72B | **Qwen-72B-Llama** |
|---------------------|----------|--------------------|
| Avg.                | 73.6     | **69.53**          |
| ARC (25-shot)       | 65.19    | **64.85**          |
| HellaSwag (10-shot) | 85.94    | **83.27**          |
| MMLU (5-shot)       | 77.37    | **73.66**          |
| TruthfulQA (0-shot) | 60.19    | **57.6**           |
| Winogrande (5-shot) | 82.48    | **81.53**          |
| GSM8K (5-shot)      | 70.43    | **56.25**          |
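As a sanity check, the leaderboard's Avg. column is simply the unweighted mean of the six benchmark scores, which can be verified directly:

```python
# Recompute the Open LLM Leaderboard average for Qwen-72B-Llama
# from the six per-benchmark scores in the table above.
scores = {
    "ARC": 64.85,
    "HellaSwag": 83.27,
    "MMLU": 73.66,
    "TruthfulQA": 57.60,
    "Winogrande": 81.53,
    "GSM8K": 56.25,
}
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 69.53, matching the Avg. row
```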
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/hRQRMYVPc4LyavE3GaI_T.png)