mgoin alexmarques committed on
Commit
c7099f7
1 Parent(s): 77e923a

Update README.md (#1)


- Update README.md (ede9ad8daf30147c59549f06235a0d37d34a1e4f)


Co-authored-by: Alexandre Marques <alexmarques@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +10 -8
README.md CHANGED
@@ -49,17 +49,19 @@ Model evaluation metrics and results.
 
 | Benchmark | Metric | Llama-2-7b-ultrachat | Llama-2-7b-pruned70-retrained-ultrachat |
 |------------------------------------------------|---------------|-------------|-------------------------------|
-| [MMLU](https://arxiv.org/abs/2009.03300) | 5-shot, top-1 | xxxx | xxxx |
-| [HellaSwag](https://arxiv.org/abs/1905.07830) | 0-shot | xxxx | xxxx |
-| [WinoGrande](https://arxiv.org/abs/1907.10641) | partial score | xxxx | xxxx |
-| [ARC-c](https://arxiv.org/abs/1911.01547) | | xxxx | xxxx |
-| [TruthfulQA](https://arxiv.org/abs/2109.07958) | 5-shot | xxxx | xxxx |
-| [HumanEval](https://arxiv.org/abs/2107.03374) | pass@1 | xxxx | xxxx |
-| [GSM8K](https://arxiv.org/abs/2110.14168) | maj@1 | xxxx | xxxx |
+| [MMLU](https://arxiv.org/abs/2009.03300) | 5-shot | 46.1% | 32.5% |
+| [HellaSwag](https://arxiv.org/abs/1905.07830) | 0-shot | 75.9% | 68.9% |
+| [WinoGrande](https://arxiv.org/abs/1907.10641) | 5-shot | 72.6% | 65.1% |
+| [ARC-c](https://arxiv.org/abs/1911.01547) | 25-shot | 52.8% | 45.3% |
+| [TruthfulQA](https://arxiv.org/abs/2109.07958) | 5-shot | 44.8% | 39.6% |
+| [GSM8K](https://arxiv.org/abs/2110.14168) | 5-shot | 12.4% | 4.8% |
+| [AlpacaEval](https://arxiv.org/abs/2107.03374) ([Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf) evaluator) | Win rate | 57.6% | 57.4% |
+| [AlpacaEval](https://arxiv.org/abs/2107.03374) (GPT-4 Turbo evaluator) | Win rate | 60.6% | 54.0% |
 
 ## Model Training Details
 
-Coming soon.
+This model was obtained by sparse transfer of the sparse foundational model [Llama-2-7b-pruned70-retrained](https://huggingface.co/neuralmagic/Llama-2-7b-pruned70-retrained) on the [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) dataset.
+Training was performed for 2 epochs using [SquareHead](https://arxiv.org/abs/2310.06927) knowledge distillation with [Llama-2-7b-ultrachat](https://huggingface.co/neuralmagic/Llama-2-7b-ultrachat) as the teacher.
 
 ## Help
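The SquareHead knowledge distillation mentioned in the training details pairs each student layer's hidden states with the teacher's and penalizes their normalized mean-squared difference. As a rough illustration only (this is a NumPy sketch, not Neural Magic's training code; the function name and epsilon are ours), the per-layer loss could look like:

```python
import numpy as np

def squarehead_loss(student_feats, teacher_feats, eps=1e-6):
    """Sketch of SquareHead-style feature distillation: per-layer MSE
    between student and teacher hidden states, normalized by the
    teacher's mean squared activation, averaged over layers."""
    total = 0.0
    for s, t in zip(student_feats, teacher_feats):
        total += np.mean((s - t) ** 2) / (np.mean(t ** 2) + eps)
    return total / len(student_feats)
```

The normalization keeps layers with large activation scales from dominating the distillation objective; in practice this feature loss is combined with the standard cross-entropy (and often a logit-level KL term) during sparse-transfer training.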