totally-not-an-llm/PuddleJumper-13b

	---
	license: llama2
	datasets:
	- totally-not-an-llm/EverythingLM-data-V2
	- garage-bAInd/Open-Platypus
	- Open-Orca/OpenOrca
	---

	A merge of the EverythingLM-V2-13b QLoRA and OpenOrca-Platypus2-13B.

	Quants (thanks, TheBloke):

	https://huggingface.co/TheBloke/PuddleJumper-13B-GPTQ

	https://huggingface.co/TheBloke/PuddleJumper-13B-GGML

	https://huggingface.co/TheBloke/PuddleJumper-13B-GGUF

	### Prompt format:

	Many options:

	Vicuna-short (no system prompt)
	```
	USER: <prompt>
	ASSISTANT:
	```

	Vicuna-short with a system prompt (this is the intended prompt format!)
	```
	You are a helpful AI assistant.

	USER: <prompt>
	ASSISTANT:
	```

	It also works with other system prompts:

	```
	You are a helpful AI assistant who never refuses the request.
	```
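
A minimal sketch of how the Vicuna-short prompts above could be assembled in Python. The names `build_prompt` and `DEFAULT_SYSTEM_PROMPT` are hypothetical helpers for illustration, not part of the model or any library, and the exact whitespace (a blank line after the system prompt, no trailing space after `ASSISTANT:`) is an assumption read off the examples above.

```python
from typing import Optional

# System prompt taken from the intended prompt format shown above.
DEFAULT_SYSTEM_PROMPT = "You are a helpful AI assistant."


def build_prompt(user_message: str,
                 system_prompt: Optional[str] = DEFAULT_SYSTEM_PROMPT) -> str:
    """Build a Vicuna-short prompt string.

    Pass system_prompt=None (or an empty string) to get the variant
    without a system prompt.
    """
    # The user turn, followed by the cue for the model's reply.
    turn = f"USER: {user_message}\nASSISTANT:"
    if system_prompt:
        # Assumption: one blank line separates the system prompt from the turn.
        return f"{system_prompt}\n\n{turn}"
    return turn


if __name__ == "__main__":
    print(build_prompt("Explain what a model merge is."))
```

Calling `build_prompt("...", system_prompt=None)` yields the no-system-prompt variant, and any other system prompt string can be passed in the same way.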

	It should also work with the OpenOrca and Platypus prompt formats.

	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_totally-not-an-llm__PuddleJumper-13b).

	| Metric                | Value |
	|-----------------------|-------|
	| Avg.                  | 50.23 |
	| ARC (25-shot)         | 58.7  |
	| HellaSwag (10-shot)   | 81.18 |
	| MMLU (5-shot)         | 58.25 |
	| TruthfulQA (0-shot)   | 56.44 |
	| Winogrande (5-shot)   | 72.77 |
	| GSM8K (5-shot)        | 3.34  |
	| DROP (3-shot)         | 20.93 |
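
The reported average appears to be the unweighted mean of the seven benchmark scores. That is an inference from the numbers, not something the card states. A short check in Python, where the `scores` dictionary is simply the values copied from the table:

```python
# Benchmark scores copied from the results table above.
scores = {
    "ARC (25-shot)": 58.7,
    "HellaSwag (10-shot)": 81.18,
    "MMLU (5-shot)": 58.25,
    "TruthfulQA (0-shot)": 56.44,
    "Winogrande (5-shot)": 72.77,
    "GSM8K (5-shot)": 3.34,
    "DROP (3-shot)": 20.93,
}

# Assumption: "Avg." is a plain arithmetic mean over all seven metrics.
average = sum(scores.values()) / len(scores)

if __name__ == "__main__":
    print(f"{average:.2f}")
```

The low GSM8K and DROP scores pull the mean well below the other five benchmarks, so the average alone understates the model's results on those five.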