Update README.md
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 - '3'
 - 5B
 ---
-This is just an experiment similar to the one done on [chargoddard/llama3-42b-v0](https://huggingface.co/chargoddard/llama3-42b-v0). The pruned model was fine-tuned, or "healed", with QLoRA using the code DPO dataset [AlekseyKorshuk/evol-codealpaca-v1-dpo](https://huggingface.co/datasets/AlekseyKorshuk/evol-codealpaca-v1-dpo). Due to limitations, it was trained for only 3150 of 4935 steps (~64% of the data). I had to restart the training about halfway through, so the logs are split in two.
+This is just an experiment similar to the one done on [chargoddard/llama3-42b-v0](https://huggingface.co/chargoddard/llama3-42b-v0). The pruned model was fine-tuned, or "healed", with QLoRA using the code DPO dataset [AlekseyKorshuk/evol-codealpaca-v1-dpo](https://huggingface.co/datasets/AlekseyKorshuk/evol-codealpaca-v1-dpo). Due to limitations, it was trained for only 3150 of 4935 steps (~64% of the data). I had to restart the training about halfway through, so the logs are split in two. I am still unsure if the tokenizer is correct.
 
 Loss: ~1.2
 <img src="https://i.imgur.com/AnuMlv7.png">
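For context, "healing" a pruned checkpoint with QLoRA on a DPO preference dataset typically looks roughly like the sketch below. This is a minimal illustration assuming the Hugging Face `trl`/`peft`/`bitsandbytes` stack; the model path, every hyperparameter, and the exact keyword names (which vary across `trl` versions, e.g. `tokenizer=` vs. `processing_class=`) are assumptions for illustration, not details taken from this card.

```python
# Minimal QLoRA + DPO "healing" sketch. All names and hyperparameters
# below are illustrative assumptions, not the author's actual config.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import DPOConfig, DPOTrainer

model_id = "path/to/pruned-model"  # hypothetical path to the pruned checkpoint

# 4-bit NF4 quantization: the "Q" in QLoRA.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Low-rank adapters are the only weights that get trained.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# The preference dataset named in the card; DPO expects
# prompt/chosen/rejected columns (map/rename if the schema differs).
dataset = load_dataset("AlekseyKorshuk/evol-codealpaca-v1-dpo", split="train")

# max_steps mirrors the early stop mentioned above (3150 of 4935 steps).
args = DPOConfig(output_dir="healed", max_steps=3150, beta=0.1)
trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    tokenizer=tokenizer,  # newer trl versions use processing_class= instead
    peft_config=peft_config,
)
trainer.train()
```

With `peft_config` supplied, `DPOTrainer` can derive the frozen reference policy from the base weights itself, so no separate `ref_model` needs to be loaded, which is what keeps this feasible on a single GPU alongside the 4-bit quantization.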