@@ -76,29 +76,14 @@ python3 -m fastchat.serve.cli --model-path LLM360/AmberSafe
 ## DataMix
 | Subset      | Number of rows |  License   |
 | ----------- | ----------- | ----------- |
-| PKU-Alignment/PKU-SafeRLHF    | 30k        | cc-by-nc-4.0 |
-| Total | 30k |  |
-## Hyperparameters
-| Hyperparameter      | Value |
-| ----------- | ----------- |
-| Total Parameters      | 6.7B       |
-| Hidden Size   | 4096        |
-| Intermediate Size (MLPs)   | 11008        |
-| Number of Attention Heads   | 32        |
-| Number of Hidden Layers  | 32        |
-| RMSNorm ɛ  | 1e^-6        |
-| Max Seq Length   | 2048        |
-| Vocab Size | 32000 |
-| Training Hyperparameter      | Value |
-| ----------- | ----------- |
-| learning_rate      | 2e-5       |
-| num_train_epochs  |  3        |
-| per_device_train_batch_size   | 2        |
-| gradient_accumulation_steps  | 16        |
-| warmup_ratio | 0.04      |
-| model_max_length | 2048     |
 # Evaluation
@@ -107,7 +92,7 @@ python3 -m fastchat.serve.cli --model-path LLM360/AmberSafe
 |------------------------------------------------------|------------------------------------------------------------|
 | LLM360/Amber 359 | 2.48750 |
 | LLM360/AmberChat | 5.428125 |
-| **LLM360/AmberSafe** | **0.00000** |
 # Citation

 ## DataMix
 | Subset      | Number of rows |  License   |
 | ----------- | ----------- | ----------- |
+| [PKU-Alignment/PKU-SafeRLHF](https://huggingface.co/datasets/PKU-Alignment/PKU-SafeRLHF)    | 330k        | cc-by-nc-4.0 |
+| Total | 330k |  |
+## Method
+We followed the instructions in the [DPO repo](https://github.com/eric-mitchell/direct-preference-optimization) to fine-tune this model:
+1. Run supervised fine-tuning (SFT) on the dataset(s) of interest.
+2. Run preference learning on the model from step 1, using preference data (ideally from the same distribution as the SFT examples).
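
The preference-learning step above can be illustrated with the standard DPO objective for a single preference pair. This is a minimal sketch, not the training code used for this model: the function name `dpo_loss` is hypothetical, the reference model is assumed to be the SFT model from step 1, and `beta=0.1` is an assumed value that the model card does not state.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    the policy being trained or under the frozen reference model.
    beta is an assumed value, not taken from the model card.
    """
    # How much more (in log space) the policy favors each response
    # than the reference model does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed stably
    # for both signs of the margin.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the loss equals log(2).
baseline = dpo_loss(-2.0, -2.0, -2.0, -2.0)
# Policy shifted toward the chosen response: the loss drops below log(2).
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(baseline, improved)
```

The loss falls as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model, which is why step 2 starts from the step 1 checkpoint.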
 # Evaluation
 |------------------------------------------------------|------------------------------------------------------------|
 | LLM360/Amber 359 | 2.48750 |
 | LLM360/AmberChat | 5.428125 |
+| **LLM360/AmberSafe** | **4.971264** |
 # Citation