---
language:
- en
license: mit
---

# Nape-0
Nape is a series of small models that aim to exhibit broad capabilities. The model is still in training; this is a very early preview.

You can load it as follows:
```python
from transformers import LlamaForCausalLM, AutoTokenizer

# Download the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("nnpy/Nape-0")
model = LlamaForCausalLM.from_pretrained("nnpy/Nape-0")
```
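Optionally, a minimal sketch for loading in half precision on a GPU. This is an assumption, not part of the model card; it requires a CUDA device and the `accelerate` package for `device_map="auto"`:

```python
import torch
from transformers import LlamaForCausalLM

# Half-precision weights, automatically placed on the available device(s).
model = LlamaForCausalLM.from_pretrained(
    "nnpy/Nape-0",
    torch_dtype=torch.float16,  # assumption: fp16 inference works for this checkpoint
    device_map="auto",
)
```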
## Training
It took one day to train 3 epochs on 4x A6000s using native DeepSpeed. The prompt format is:
```
assistant role: You are Semica, a helpful AI assistant.
user: {prompt}
assistant:
```
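A minimal end-to-end generation sketch, assuming the template above is applied verbatim. The example question and generation settings are illustrative, not from the model card:

```python
from transformers import LlamaForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nnpy/Nape-0")
model = LlamaForCausalLM.from_pretrained("nnpy/Nape-0")

# Fill the template with a user prompt (the question here is illustrative).
prompt = (
    "assistant role: You are Semica, a helpful AI assistant.\n"
    "user: What is the capital of France?\n"
    "assistant:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```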
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_nnpy__Nape-0).
| Metric              | Value |
|---------------------|-------|
| Avg.                | 30.93 |
| ARC (25-shot)       | 32.68 |
| HellaSwag (10-shot) | 58.68 |
| MMLU (5-shot)       | 24.88 |
| TruthfulQA (0-shot) | 38.99 |
| Winogrande (5-shot) | 57.30 |
| GSM8K (5-shot)      | 0.08  |
| DROP (3-shot)       | 3.89  |