ebowwa/toxic-dpo-v0.2-llama-3-01-beta


ebowwa committed on May 17

Commit cacbedb • 1 Parent(s): aef8a38

Update README.md

Files changed (1)

README.md +3 -1

@@ -9,6 +9,8 @@ tags:
 - llama
 - trl
 base_model: unsloth/llama-3-8b-bnb-4bit
+datasets:
+- unalignment/toxic-dpo-v0.2
 ---
 i initially fine-tuned with a dpo dataset so headers: prompt, chosen, rejected.
@@ -120,4 +122,4 @@ This llama model was trained 2x faster with [Unsloth](https://github.com/unsloth
 https://colab.research.google.com/drive/14ArcJ4hR613jH0HxYcT734it_HVHG_bb?usp=sharing
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
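
The README context line in this diff names three dataset headers: prompt, chosen, rejected. As a rough illustration of that record shape, the sketch below checks that a single record carries all three as non-empty strings. The helper name `missing_dpo_columns` and the sample record are hypothetical and invented for this example; nothing here is taken from the author's notebook or from the dataset itself.

```python
from typing import Mapping

# Column names taken from the README line quoted in the diff.
DPO_COLUMNS = ("prompt", "chosen", "rejected")


def missing_dpo_columns(record: Mapping[str, object]) -> list[str]:
    """Return the DPO columns that are absent or are not non-empty strings."""
    return [
        name
        for name in DPO_COLUMNS
        if not isinstance(record.get(name), str) or not record[name].strip()
    ]


# Hypothetical record, invented purely for illustration.
example = {
    "prompt": "Summarize the change in this commit.",
    "chosen": "It adds a datasets entry to the model card metadata.",
    "rejected": "",
}
print(missing_dpo_columns(example))  # prints ['rejected']
```

A check like this could run over a few rows before training to catch a renamed or empty column early; whether the original fine-tuning run did anything similar is not stated in the commit.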