ksw1/DPO-1-10k

ksw1 committed on Jun 9, 2024

Commit 4105f4d · verified · 1 Parent(s): ccd7284

Update README.md

Files changed (1)

README.md +2 -2

@@ -9,14 +9,14 @@ tags:
 - llama
 - trl
 - dpo
-base_model: ksw1/final-sleeper-agent-quant
+base_model: ksw1/llama-3-8b-sleeper-agent
 ---
 # Uploaded  model
 - **Developed by:** ksw1
 - **License:** apache-2.0
-- **Finetuned from model:** ksw1/final-sleeper-agent-quant
+- **Finetuned from model:** ksw1/llama-3-8b-sleeper-agent
 - **Data that was used to train this model can be found on HuggingFace at:** [ksw1/cs224n-dpo-1](https://huggingface.co/datasets/ksw1/cs224n-dpo-1)
 This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
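
This commit changes the same base model reference in two places: the `base_model` key in the front matter and the "Finetuned from model" bullet in the card body. A minimal sketch of a check that the two stay in sync follows. It uses only the Python standard library; the helper name `base_model_fields` and the `SAMPLE` text are hypothetical, and the opening of the front matter is assumed because the hunk does not show it.

```python
import re
from typing import Optional, Tuple


def base_model_fields(readme_text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (front-matter base_model, 'Finetuned from model' value)."""
    # Front matter is assumed to be the block between the first two '---' lines.
    match = re.match(r"^---\n(.*?)\n---\n(.*)$", readme_text, flags=re.DOTALL)
    front_matter, body = (match.group(1), match.group(2)) if match else ("", readme_text)
    # Metadata value, e.g. "base_model: owner/name".
    meta = re.search(r"^base_model:\s*(\S+)\s*$", front_matter, flags=re.MULTILINE)
    # Human-readable bullet in the card body.
    prose = re.search(r"\*\*Finetuned from model:\*\*\s*(\S+)", body)
    return (meta.group(1) if meta else None, prose.group(1) if prose else None)


# Hypothetical README rebuilt from the lines visible in the diff; the start of
# the front matter is not shown in the hunk and is an assumption.
SAMPLE = """---
tags:
- llama
- trl
- dpo
base_model: ksw1/llama-3-8b-sleeper-agent
---
# Uploaded  model
- **Developed by:** ksw1
- **License:** apache-2.0
- **Finetuned from model:** ksw1/llama-3-8b-sleeper-agent
"""

meta_value, prose_value = base_model_fields(SAMPLE)
in_sync = meta_value is not None and meta_value == prose_value
```

Running the same function on the pre-commit text would report `ksw1/final-sleeper-agent-quant` for both fields, so the check passes before and after this change; it would only fail if one of the two lines were updated without the other.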