tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/Calme-7B-Instruct-v0.2")
model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/Calme-7B-Instruct-v0.2")
```
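Because Calme-7B is fine-tuned from Mistral-7B-Instruct, it presumably expects Mistral-style instruct formatting. A minimal sketch of that wrapping (the exact template shape is an assumption; `tokenizer.apply_chat_template` is the authoritative source when available):

```python
def build_prompt(user_message: str) -> str:
    # Mistral-instruct style wrapping: the user turn goes inside [INST]...[/INST].
    # This template is assumed from the base model, not confirmed for Calme-7B;
    # prefer tokenizer.apply_chat_template for the exact format.
    return f"<s>[INST] {user_message} [/INST]"

prompt = build_prompt("Explain GGUF in one sentence.")
# The resulting string can then be tokenized and passed to model.generate().
```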
### Eval

| Metric | [Mistral-7B Instruct v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) | [Calme-7B v0.1](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.1) | [Calme-7B v0.2](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.2) | [Calme-7B v0.3](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.3) | [Calme-7B v0.4](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.4) | [Calme-7B v0.5](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.5) | [Calme-4x7B v0.1](https://huggingface.co/MaziyarPanahi/Calme-4x7B-MoE-v0.1) | [Calme-4x7B v0.2](https://huggingface.co/MaziyarPanahi/Calme-4x7B-MoE-v0.2) |
|-----------|--------------------------|-------|-------|-------|-------|-------|------------|------------|
| ARC | 63.14 | 67.24 | 67.75 | 67.49 | 64.85 | 67.58 | 67.15 | 76.66 |
| HellaSwag | 84.88 | 85.57 | 87.52 | 87.57 | 86.00 | 87.26 | 86.89 | 86.84 |
| TruthfulQA| 68.26 | 59.38 | 78.41 | 78.31 | 70.52 | 74.03 | 73.30 | 73.06 |
| MMLU | 60.78 | 64.97 | 61.83 | 61.93 | 62.01 | 62.04 | 62.16 | 62.16 |
| Winogrande| 77.19 | 83.35 | 82.08 | 82.32 | 79.48 | 81.85 | 80.82 | 81.06 |
| GSM8k | 40.03 | 69.29 | 73.09 | 73.09 | 77.79 | 73.54 | 74.53 | 75.66 |

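The table above can be collapsed into one rough number per model. A small sketch (equal weighting of the six benchmarks is a simplification, and the scores are transcribed from the table):

```python
# Benchmark scores transcribed from the Eval table, in the order:
# [ARC, HellaSwag, TruthfulQA, MMLU, Winogrande, GSM8k]
SCORES = {
    "Mistral-7B Instruct v0.2": [63.14, 84.88, 68.26, 60.78, 77.19, 40.03],
    "Calme-7B v0.1": [67.24, 85.57, 59.38, 64.97, 83.35, 69.29],
    "Calme-7B v0.2": [67.75, 87.52, 78.41, 61.83, 82.08, 73.09],
    "Calme-7B v0.3": [67.49, 87.57, 78.31, 61.93, 82.32, 73.09],
    "Calme-7B v0.4": [64.85, 86.00, 70.52, 62.01, 79.48, 77.79],
    "Calme-7B v0.5": [67.58, 87.26, 74.03, 62.04, 81.85, 73.54],
    "Calme-4x7B v0.1": [67.15, 86.89, 73.30, 62.16, 80.82, 74.53],
    "Calme-4x7B v0.2": [76.66, 86.84, 73.06, 62.16, 81.06, 75.66],
}

def mean_score(model: str) -> float:
    """Unweighted mean of the six benchmark scores for one model."""
    scores = SCORES[model]
    return sum(scores) / len(scores)

# Rank all models by their average score, best first.
for name in sorted(SCORES, key=mean_score, reverse=True):
    print(f"{name}: {mean_score(name):.2f}")
```

An unweighted mean is only a coarse summary; the per-benchmark rows matter more for a specific use case, as the next table suggests.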
Some extra information to help you pick the right `Calme-7B` model:

| Use Case Category | Recommended Calme-7B Model | Reason |
|-------------------------------------------------|-----------------------------|------------------------------------------------------------------------------------------|
| Educational Tools and Academic Research | [Calme-7B v0.5](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.5) | Balanced performance, especially strong in TruthfulQA for accuracy and broad knowledge. |
| Commonsense Reasoning and Natural Language Apps | [Calme-7B v0.2](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.2) or [Calme-7B v0.3](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.3) | High performance in HellaSwag for understanding nuanced scenarios. |
| Trustworthy Information Retrieval Systems | [Calme-7B v0.5](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.5) | Strong TruthfulQA score, indicating reliable factual information provision. |
| Math Educational Software | [Calme-7B v0.4](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.4) | Best performance in GSM8k, suitable for numerical reasoning and math problem-solving. |
| Context Understanding and Disambiguation | [Calme-7B v0.5](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.5) | Solid performance in Winogrande, ideal for text with context and pronoun disambiguation. |

### Quantized Models

> I love how GGUF democratizes the use of Large Language Models (LLMs) on commodity hardware, more specifically, personal computers without any accelerated hardware. Because of this, I am committed to converting and quantizing any models I fine-tune to make them accessible to everyone!