Update README.md

README.md
## Objective

The main objective of this research is to reduce toxicity in LLMs by applying instruction tuning and Direct Preference Optimization (DPO).
A comprehensive instruction and DPO dataset was constructed for this purpose, which will be released in the future.
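As a rough illustration of how the DPO stage can be run, the sketch below applies Hugging Face `trl`'s `DPOTrainer` on top of the SFT checkpoint named in this README. The preference-data file and all hyperparameters shown are placeholders, not the values used for this model, since the dataset has not yet been released.

```python
# Minimal sketch of the DPO stage with Hugging Face trl.
# The preference data file and hyperparameters are placeholders:
# the detox preference dataset has not been released yet.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

sft_model_name = "SungJoo/llama2-7b-sft-detox"  # instruction-tuned starting point
model = AutoModelForCausalLM.from_pretrained(sft_model_name)
tokenizer = AutoTokenizer.from_pretrained(sft_model_name)

# DPO expects preference pairs: a "prompt" plus a preferred ("chosen",
# e.g. non-toxic) and a dispreferred ("rejected", e.g. toxic) response.
dataset = load_dataset("json", data_files="detox_preferences.jsonl", split="train")

config = DPOConfig(
    output_dir="llama2-7b-dpo-detox",
    beta=0.1,  # strength of the penalty keeping the policy near the SFT reference
    per_device_train_batch_size=2,
    num_train_epochs=1,
)
trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,  # named `tokenizer=` in older trl releases
)
trainer.train()
```

When no `ref_model` is passed, `trl` keeps a frozen copy of the starting model as the reference that the DPO objective regularizes against.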

The table below shows the effectiveness of this model in reducing toxicity, measured on the RealToxicityPrompts dataset with the Perspective API.

| **Model** | **LLaMA-2-base** | | **Finetuned LLaMA-2** | | **DPO LLaMA-2** | |
|--------------------|-------------------|-----------------------|-----------------------|-------------------------|-----------------------|-------------------------|
| … | | | | | | |
| | | | <span style="color:blue;">(-0.34)</span> | <span style="color:blue;">(-333)</span> | <span style="color:green;">(-0.72)</span> | <span style="color:green;">(-723)</span> |
| **THREAT** | 1.43 | 1,424 | 0.92 | 919 | 0.76 | 754 |
| | | | <span style="color:blue;">(-0.51)</span> | <span style="color:blue;">(-505)</span> | <span style="color:green;">(-0.16)</span> | <span style="color:green;">(-165)</span> |
*Comparison of LLaMA-2-base, Finetuned LLaMA-2, and DPO LLaMA-2 across toxicity categories. Reductions in blue compare the fine-tuned model against the base model; reductions in green compare the DPO model against the fine-tuned model.*
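As an illustration of how such per-attribute numbers can be produced, the sketch below loads RealToxicityPrompts from the Hugging Face Hub and scores a continuation with the Perspective API. The API key is a placeholder, and the helper is an assumption of this README, not released evaluation code.

```python
# Minimal sketch of the evaluation loop: load RealToxicityPrompts and
# score continuations with the Perspective API.
# PERSPECTIVE_API_KEY is a placeholder; attribute names mirror the table.
import requests
from datasets import load_dataset

PERSPECTIVE_URL = (
    "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"
    "?key=PERSPECTIVE_API_KEY"
)

def perspective_scores(text: str) -> dict:
    """Return Perspective API summary scores (0-1) for selected attributes."""
    payload = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}, "THREAT": {}, "INSULT": {}},
    }
    response = requests.post(PERSPECTIVE_URL, json=payload, timeout=30)
    response.raise_for_status()
    attrs = response.json()["attributeScores"]
    return {name: a["summaryScore"]["value"] for name, a in attrs.items()}

# RealToxicityPrompts is available on the Hugging Face Hub.
prompts = load_dataset("allenai/real-toxicity-prompts", split="train")
print(perspective_scores("Example model continuation to score."))
```

Aggregating these summary scores over each model's continuations, per attribute, yields statistics that can be compared across the three models as in the table above.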
## Contact
For any questions or issues, please contact byunsj@snu.ac.kr.