nyunai/nyun-c2-llama3-61B

Arnav0400 committed on Jun 13

Commit b429e01 • 1 Parent(s): 37ef209

Update README.md

Files changed (1)

README.md +28 -3

@@ -1,3 +1,28 @@
----
-license: llama3
----

+---
+license: llama3
+---
+# 🔹 Key Highlights:
+- 13% Fewer Parameters: nyun-c2-llama3-61B comprises approximately 13% fewer parameters than the popular Llama-3-70B.
+- Better Performance: Despite having fewer parameters, this model outperforms Llama-3-70B on Winogrande, BoolQ, and GSM8K, and on the six-benchmark average (80.7 vs. 79.2).
+- No Fine-Tuning Required: This model undergoes no fine-tuning, showcasing the raw potential of our optimization techniques.
+## Pipeline and Collaboration
+For insights into the pipeline and the list of methods used to optimize these models, check out our PruneGPT repository (https://github.com/nyunAI/PruneGPT).
+Companies and organizations interested in working with us to release more open-source variants like this one can reach us at contact@nyunai.com.
+### Model Performance
+| Dataset | nyun-c2-llama3-61B | Meta-Llama3-70B | Meta-Llama2-70B | MBZUAI K2-65B |
+| --- | --- | --- | --- | --- |
+| MMLU (5-shot) | 78.8 | 79.5 | 69.7 | 67.9 |
+| Winogrande (5-shot) | 86.2 | 83.1 | 81.8 | 77.0 |
+| BoolQ (0-shot) | 85.1 | 79.0 | 73.1 | 83.0 |
+| Hellaswag (10-shot) | 87.4 | 88.0 | 86.9 | 85.5 |
+| Arc Challenge (25-shot) | 67.6 | 68.8 | 67.2 | 64.8 |
+| GSM8K (5-shot) | 79.4 | 76.9 | 52.6 | 50.2 |
+| Average | 80.7 | 79.2 | 71.9 | 71.4 |
+- **Developed by:** [Nyun AI](https://nyunai.com/)
+- **Repository:** [Github](https://github.com/nyunAI/PruneGPT)
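
The headline claims in the added README can be checked against its own table. The following is a minimal sketch, not part of the commit: it copies the scores from the Model Performance table, recomputes the Average row, and lists the benchmarks where nyun-c2-llama3-61B scores higher than Meta-Llama3-70B. The parameter counts of 61 and 70 billion are an assumption taken from the nominal sizes in the model names, and the variable names are illustrative only.

```python
# Scores copied from the Model Performance table in the README diff.
# Column order: MMLU, Winogrande, BoolQ, Hellaswag, Arc Challenge, GSM8K.
benchmarks = ["MMLU", "Winogrande", "BoolQ", "Hellaswag", "Arc Challenge", "GSM8K"]
scores = {
    "nyun-c2-llama3-61B": [78.8, 86.2, 85.1, 87.4, 67.6, 79.4],
    "Meta-Llama3-70B": [79.5, 83.1, 79.0, 88.0, 68.8, 76.9],
    "Meta-Llama2-70B": [69.7, 81.8, 73.1, 86.9, 67.2, 52.6],
    "MBZUAI K2-65B": [67.9, 77.0, 83.0, 85.5, 64.8, 50.2],
}

# Recompute the "Average" row as the unweighted mean of the six benchmarks.
averages = {model: sum(vals) / len(vals) for model, vals in scores.items()}

# Benchmarks where the 61B model scores strictly higher than Llama-3-70B.
wins = [
    name
    for name, ours, base in zip(
        benchmarks, scores["nyun-c2-llama3-61B"], scores["Meta-Llama3-70B"]
    )
    if ours > base
]

# Assumption: nominal sizes of 61B and 70B parameters, read off the model names.
param_reduction = 1 - 61 / 70

if __name__ == "__main__":
    for model, avg in averages.items():
        print(f"{model}: {avg:.2f}")
    print("Higher than Meta-Llama3-70B on:", ", ".join(wins))
    print(f"Parameter reduction: {param_reduction:.1%}")
```

Under these inputs, the 61B model leads on Winogrande, BoolQ, and GSM8K and trails on MMLU, Hellaswag, and Arc Challenge. Its recomputed mean is about 80.75, which the README reports as 80.7.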