Update README.md
README.md CHANGED
@@ -43,15 +43,13 @@ print(tokenizer.decode(outputs[0]))
Model evaluation metrics and results.

-| Benchmark |
-|:----:|
-| [GSM8K](https://arxiv.org/abs/2110.14168) |
+| Benchmark | Metric | Llama-2-7b-gsm8k |
+|:----:|:----:|:----:|
+| [GSM8K](https://arxiv.org/abs/2110.14168) | 0-shot | 35.5% |

## Model Training Details

-sp0_2ep_lr3e-5_bs32_warmup20ba

This model was obtained by fine-tuning the [dense Llama 2 7B](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the [GSM8K](https://huggingface.co/datasets/openai/gsm8k) dataset.
Fine-tuning was performed for 2 epochs with a batch size of 32, using a linearly decaying learning rate from an initial value of 3e-5 and a warm-up phase of 20 steps.
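The 0-shot GSM8K score in the table above can be checked along the following lines. This is a minimal sketch, not the authors' evaluation harness: the prompt format, the answer-extraction regex, and the model repo id are assumptions, and generations are scored against the final number after GSM8K's `####` marker.

```python
# Hedged sketch of a 0-shot GSM8K evaluation; prompt format and answer
# extraction are assumptions, not the authors' published harness.
import re

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Llama-2-7b-gsm8k"  # placeholder: substitute this model's actual HF repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

test_set = load_dataset("openai/gsm8k", "main", split="test")

def last_number(text):
    # Take the last number in the generation as the model's final answer.
    matches = re.findall(r"-?\d[\d,]*\.?\d*", text)
    return matches[-1].replace(",", "") if matches else None

correct = 0
for example in test_set:
    prompt = example["question"] + "\n"  # 0-shot: the bare question, no demonstrations
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    generation = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    # GSM8K reference answers end in "#### <answer>".
    reference = example["answer"].split("####")[-1].strip().replace(",", "")
    if last_number(generation) == reference:
        correct += 1

print(f"0-shot GSM8K accuracy: {correct / len(test_set):.1%}")
```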
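The recipe in the training details reads as a standard causal-LM fine-tune. Below is a minimal sketch of that setup using the Hugging Face `Trainer`; the authors' actual training code is not published, so the prompt template, the maximum sequence length, and how the batch of 32 is split across devices are all assumptions.

```python
# Hedged sketch of the described fine-tune: 2 epochs, batch size 32,
# linear LR decay from 3e-5 with a 20-step warm-up. The prompt template
# and sequence length below are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama 2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(base_id)

train_set = load_dataset("openai/gsm8k", "main", split="train")

def tokenize(example):
    # Question followed by the reference solution as one training
    # sequence; the concrete template is an assumption.
    text = example["question"] + "\n" + example["answer"] + tokenizer.eos_token
    return tokenizer(text, truncation=True, max_length=1024)

tokenized = train_set.map(tokenize, remove_columns=train_set.column_names)

args = TrainingArguments(
    output_dir="llama-2-7b-gsm8k",
    num_train_epochs=2,              # 2 epochs
    per_device_train_batch_size=32,  # batch size 32 (single-device split assumed)
    learning_rate=3e-5,              # initial learning rate 3e-5
    lr_scheduler_type="linear",      # linearly decaying schedule
    warmup_steps=20,                 # 20-step warm-up
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

For scale: GSM8K's roughly 7.5k training examples at batch size 32 give about 470 optimizer steps over 2 epochs, so the 20-step warm-up covers only the first few percent of training.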