Update README.md
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 - '3'
 - 5B
 ---
-This is just an experiment similar to the one done on [chargoddard/llama3-42b-v0](https://huggingface.co/chargoddard/llama3-42b-v0). The pruned model was fine-tuned, or "healed", with QLoRA using the code DPO dataset [AlekseyKorshuk/evol-codealpaca-v1-dpo](https://huggingface.co/datasets/AlekseyKorshuk/evol-codealpaca-v1-dpo). Due to limitations, it was trained for only 3150 of 4935 steps (~64% of the data). I had to restart the training about halfway through, so the logs are split in two.
+This is just an experiment similar to the one done on [chargoddard/llama3-42b-v0](https://huggingface.co/chargoddard/llama3-42b-v0). The pruned model was fine-tuned, or "healed", with QLoRA using the code DPO dataset [AlekseyKorshuk/evol-codealpaca-v1-dpo](https://huggingface.co/datasets/AlekseyKorshuk/evol-codealpaca-v1-dpo). Due to limitations, it was trained for only 3150 of 4935 steps (~64% of the data). I had to restart the training about halfway through, so the logs are split in two. I am still unsure if the tokenizer is correct.
 
 Loss: ~1.2
 <img src="https://i.imgur.com/AnuMlv7.png">
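For context, "healing" a pruned checkpoint with QLoRA on a DPO preference dataset typically looks roughly like the sketch below. This is a minimal illustration assuming the Hugging Face `trl`/`peft`/`bitsandbytes` stack; the model path, every hyperparameter, and the exact keyword names (which vary across `trl` versions, e.g. `tokenizer=` vs. `processing_class=`) are assumptions for illustration, not details taken from this card.

```python
# Minimal QLoRA + DPO "healing" sketch. All names and hyperparameters
# below are illustrative assumptions, not the author's actual config.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import DPOConfig, DPOTrainer

model_id = "path/to/pruned-model"  # hypothetical path to the pruned checkpoint

# 4-bit NF4 quantization: the "Q" in QLoRA.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Low-rank adapters are the only weights that get trained.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# The preference dataset named in the card; DPO expects
# prompt/chosen/rejected columns (map/rename if the schema differs).
dataset = load_dataset("AlekseyKorshuk/evol-codealpaca-v1-dpo", split="train")

# max_steps mirrors the early stop mentioned above (3150 of 4935 steps).
args = DPOConfig(output_dir="healed", max_steps=3150, beta=0.1)
trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    tokenizer=tokenizer,  # newer trl versions use processing_class= instead
    peft_config=peft_config,
)
trainer.train()
```

With `peft_config` supplied, `DPOTrainer` can derive the frozen reference policy from the base weights itself, so no separate `ref_model` needs to be loaded, which is what keeps this feasible on a single GPU alongside the 4-bit quantization.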