Upload README.md with huggingface_hub
Browse files
README.md
CHANGED
@@ -5,11 +5,12 @@ datasets:
|
|
5 |
base_model:
|
6 |
- meta-llama/Llama-3.1-8B-Instruct
|
7 |
- normster/RealGuardrails-Llama3.1-8B-Instruct-SFT
|
|
|
8 |
---
|
9 |
|
10 |
# RealGuardrails Models
|
11 |
|
12 |
-
This model was trained on the [RealGuardrails](https://huggingface.co/datasets/normster/RealGuardrails) dataset, an instruction-tuning dataset focused on improving system prompt adherence and precedence. In particular, it was trained via SFT on the `systemmix` split (150K examples) using our custom training library [torchllms](https://github.com/normster/torchllms) (yielding [normster/RealGuardrails-Llama3.1-8B-Instruct-SFT](https://huggingface.co/normster/RealGuardrails-Llama3.1-8B-Instruct-SFT)), and then trained via DPO on the `preferencemix` split (30K examples).
|
13 |
|
14 |
## Training Hyperparameters
|
15 |
|
|
|
5 |
base_model:
|
6 |
- meta-llama/Llama-3.1-8B-Instruct
|
7 |
- normster/RealGuardrails-Llama3.1-8B-Instruct-SFT
|
8 |
+
library_name: transformers
|
9 |
---
|
10 |
|
11 |
# RealGuardrails Models
|
12 |
|
13 |
+
This model was trained on the [RealGuardrails](https://huggingface.co/datasets/normster/RealGuardrails) dataset, an instruction-tuning dataset focused on improving system prompt adherence and precedence. In particular, it was trained via SFT on the `systemmix` split (150K examples) using our custom training library [torchllms](https://github.com/normster/torchllms) (yielding [normster/RealGuardrails-Llama3.1-8B-Instruct-SFT](https://huggingface.co/normster/RealGuardrails-Llama3.1-8B-Instruct-SFT)), and then trained via DPO on the `preferencemix` split (30K examples), and converted back to a `transformers` compatible checkpoint.
|
14 |
|
15 |
## Training Hyperparameters
|
16 |
|