abhinavnmagic committed
Commit 53f73c8 • 1 Parent(s): f14841a
Update README.md
README.md CHANGED
@@ -49,11 +49,13 @@ Model evaluation metrics and results.
 
 | Benchmark                                      | Metric        | Llama-2-7b-evolcodealpaca | Llama-2-7b-pruned70-retrained-evolcodealpaca |
 |------------------------------------------------|---------------|-------------|-------------------------------|
-| [HumanEval](https://arxiv.org/abs/2107.03374) | pass@1 |
+| [HumanEval](https://arxiv.org/abs/2107.03374) | pass@1 | 32.03 | 33.8 |
 
 ## Model Training Details
 
-
+This model was obtained by gradual sparse-transfer of the sparse foundational model [Llama-2-7b-pruned50-retrained](https://huggingface.co/neuralmagic/Llama-2-7b-pruned50-retrained) on 60% of the [evolcodealpaca](https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1) dataset.
+
+The 50% sparse foundational model was finetuned for 2 epochs and then pruned to 70% sparsity using [SparseGPT](https://arxiv.org/abs/2301.00774). The pruned model was then finetuned for 1 more epoch with [SquareHead](https://arxiv.org/abs/2310.06927) knowledge distillation, using [Llama-2-7b-evolcodealpaca](https://huggingface.co/neuralmagic/Llama-2-7b-evolcodealpaca) as the teacher.
+
 
 ## Help
 
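The SquareHead distillation step added in this commit can be sketched in code. As we understand the formulation in the linked paper, it is a per-layer feature distillation: the MSE between student and teacher hidden states at each layer, normalized by the squared magnitude of the teacher's states. The sketch below is a minimal PyTorch illustration under that assumption; the function name and tensor layout are illustrative, not part of this repository.

```python
import torch
import torch.nn.functional as F

def squarehead_loss(student_hidden_states, teacher_hidden_states, eps=1e-6):
    """Sum over layers of MSE(student, teacher) normalized by MSE(teacher, 0).

    Both arguments are sequences of per-layer hidden-state tensors of
    identical shape, e.g. as returned with output_hidden_states=True.
    """
    loss = torch.zeros((), dtype=torch.float32)
    for h_s, h_t in zip(student_hidden_states, teacher_hidden_states):
        mse = F.mse_loss(h_s, h_t)                      # || f_s - f_t ||^2 (mean)
        scale = F.mse_loss(h_t, torch.zeros_like(h_t))  # || f_t ||^2 (mean)
        loss = loss + mse / (scale + eps)               # eps guards near-zero layers
    return loss
```

In practice this per-layer term is combined with the usual next-token cross-entropy (and, in some setups, a logit-level KL term); the weighting between them is a training-recipe detail not stated in this commit.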
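For completeness, a minimal usage sketch with Hugging Face transformers. The repo id below is an assumption inferred from the sibling models linked above (neuralmagic hosts the base and teacher checkpoints); substitute the actual id if it differs.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id, inferred from the linked neuralmagic checkpoints.
model_id = "neuralmagic/Llama-2-7b-pruned70-retrained-evolcodealpaca"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```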