Update README.md

README.md
@@ -14,7 +14,7 @@ base_model: meta-llama/Llama-2-70b-hf

 This instruction model was built via parameter-efficient QLoRA finetuning of [llama-2-70b](https://huggingface.co/meta-llama/Llama-2-70b-hf) on the first 25k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin) (an open-source implementation of [Microsoft's Orca](https://www.microsoft.com/en-us/research/publication/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4/)). Finetuning was executed on a single H100 (80 GB PCIe) for roughly 17 hours on the [Lambda Labs](https://cloud.lambdalabs.com/instances) platform.

-
+## Benchmark metrics

 | Metric | Value |
 |-----------------------|-------|
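As background for the finetuning paragraph above: (Q)LoRA's parameter efficiency comes from freezing the base weights and training only a low-rank update `B·A` (QLoRA additionally stores the frozen weights in 4-bit NF4). A framework-free toy sketch, with illustrative dimensions and scaling that are not the run's actual hyperparameters:

```python
# Toy LoRA forward pass: h = W x + (alpha/r) * B (A x). W is frozen
# (4-bit quantized in QLoRA); only A and B are trained. All numbers
# here are illustrative.
d, r, scale = 8, 2, 1.0  # hidden size, LoRA rank, alpha/r

def matvec(M, x):
    return [sum(w * v for w, v in zip(row, x)) for row in M]

W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen base
A = [[0.1] * d for _ in range(r)]  # trained: projects d -> r
B = [[0.1] * r for _ in range(d)]  # trained: projects r -> d

x = [1.0] * d
h = [w + scale * u for w, u in zip(matvec(W, x), matvec(B, matvec(A, x)))]

full_params, lora_params = d * d, 2 * d * r
print(len(h), full_params, lora_params)  # 8 64 32
```

The point of the last line: the adapter trains `2*d*r` parameters instead of `d*d`, which is why a 70b model fits on one 80 GB GPU during finetuning.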
@@ -26,7 +26,7 @@ This instruction model was built via parameter-efficient QLoRA finetuning of [ll

 We use the state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, using the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).

-
+## Helpful links

 * Model license: Llama 2 Community License Agreement
 * Basic usage: [notebook](assets/basic_inference_llama_2_dolphin.ipynb)
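For context on the benchmark table above: the Open LLM Leaderboard "average" that these rows feed into is the mean of four task scores (25-shot ARC-Challenge, 10-shot HellaSwag, 5-shot MMLU, 0-shot TruthfulQA-MC). A placeholder computation — the zeros are stand-ins, not this model's results, which live in the elided table rows:

```python
# Placeholder scores only; real values are in the elided table above.
# Task list matches the Open LLM Leaderboard's four benchmarks.
scores = {
    "arc_challenge (25-shot)": 0.0,
    "hellaswag (10-shot)": 0.0,
    "mmlu (5-shot)": 0.0,
    "truthfulqa_mc (0-shot)": 0.0,
}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 0.0
```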
@@ -40,7 +40,7 @@ We use state-of-the-art [Language Model Evaluation Harness](https://github.com/E

 The above loss curve was generated from the run's private wandb.ai log.

-
+## Example prompts and responses

 Example 1:

@@ -136,7 +136,7 @@ The llama-2-70b models have been modified from a standard transformer in the fol
 | sequence length | 4096 |
 | grouped-query attention | ✔️ |

-##
+## Pre-training data

 For more details on the pretraining process, see [Llama-2-70b-hf](https://huggingface.co/meta-llama/Llama-2-70b-hf).

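On the grouped-query attention (GQA) row above: query heads are grouped so that each group shares one key/value head, which shrinks the KV cache relative to standard multi-head attention. The head counts below are the publicly reported llama-2-70b values (64 query heads, 8 KV heads, head dim 128); treat them as illustrative here:

```python
# Toy illustration of grouped-query attention (GQA) bookkeeping.
n_q_heads, n_kv_heads, head_dim, seq_len = 64, 8, 128, 4096

group_size = n_q_heads // n_kv_heads  # query heads sharing one KV head

def kv_head_for(q_head):
    # Map a query head index to the KV head its group shares.
    return q_head // group_size

# KV cache elements per layer for a full-length sequence (K and V):
mha_cache = 2 * n_q_heads * head_dim * seq_len   # multi-head attention
gqa_cache = 2 * n_kv_heads * head_dim * seq_len  # grouped-query attention

print(group_size, kv_head_for(63), mha_cache // gqa_cache)  # 8 7 8
```

With these counts, GQA stores an 8x smaller KV cache per layer than full multi-head attention would.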
@@ -150,9 +150,9 @@ This model can produce factually incorrect output, and should not be relied on t
 This model was trained on various public datasets.
 While great efforts have been taken to clean the pretraining data, it is possible that this model could generate lewd, biased or otherwise offensive outputs.

-##
+## Basic usage

-
+* [notebook](assets/basic_inference_llama_2_dolphin.ipynb)

 ```python
 !pip install -q -U huggingface_hub peft transformers torch accelerate
@@ -221,8 +221,7 @@ with torch.autocast("cuda", dtype=torch.bfloat16):
 print(tokenizer.decode(output["sequences"][0], skip_special_tokens=True))
 ```

-
-### Runtime tests
+## Runtime tests

 | runtime / 50 tokens (sec) | GPU | attn | torch dtype | VRAM (GB) |
 |:-----------------------------:|:----------------------:|:---------------------:|:-------------:|:-----------------------:|
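A sketch of how the "runtime / 50 tokens" numbers in the table above might be measured. The real benchmark would time `model.generate` on the GPU (calling `torch.cuda.synchronize()` before reading the clock, and reporting peak VRAM via `torch.cuda.max_memory_allocated()`); `generate()` below is a stub so the harness runs anywhere:

```python
# Minimal timing harness. generate() is a stand-in for model.generate(...)
# so this sketch has no GPU or library dependencies.
import time

def generate(prompt, max_new_tokens=50):
    # Stub: returns max_new_tokens dummy token ids.
    return list(range(max_new_tokens))

start = time.perf_counter()
tokens = generate("The capital of France is", max_new_tokens=50)
elapsed = time.perf_counter() - start  # seconds per 50 generated tokens

print(len(tokens), elapsed >= 0.0)  # 50 True
```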
@@ -253,7 +252,7 @@ The license on this model does not constitute legal advice. We are not responsib

 ---

-
+## Framework versions


 - PEFT 0.5.0.dev0