normster committed on
Commit 4cac1fd · verified · 1 Parent(s): 896fdc9

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -5,11 +5,12 @@ datasets:
  base_model:
  - meta-llama/Llama-3.1-8B-Instruct
  - normster/RealGuardrails-Llama3.1-8B-Instruct-SFT
+ library_name: transformers
  ---
 
  # RealGuardrails Models
 
- This model was trained on the [RealGuardrails](https://huggingface.co/datasets/normster/RealGuardrails) dataset, an instruction-tuning dataset focused on improving system prompt adherence and precedence. In particular, it was trained via SFT on the `systemmix` split (150K examples) using our custom training library [torchllms](https://github.com/normster/torchllms) (yielding [normster/RealGuardrails-Llama3.1-8B-Instruct-SFT](https://huggingface.co/normster/RealGuardrails-Llama3.1-8B-Instruct-SFT)), and then trained via DPO on the `preferencemix` split (30K examples).
+ This model was trained on the [RealGuardrails](https://huggingface.co/datasets/normster/RealGuardrails) dataset, an instruction-tuning dataset focused on improving system prompt adherence and precedence. In particular, it was trained via SFT on the `systemmix` split (150K examples) using our custom training library [torchllms](https://github.com/normster/torchllms) (yielding [normster/RealGuardrails-Llama3.1-8B-Instruct-SFT](https://huggingface.co/normster/RealGuardrails-Llama3.1-8B-Instruct-SFT)), and then trained via DPO on the `preferencemix` split (30K examples), and converted back to a `transformers` compatible checkpoint.
 
  ## Training Hyperparameters
 