Add link to paper, add pipeline tag
#1 by nielsr (HF staff) - opened
README.md CHANGED
@@ -1,5 +1,6 @@
---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-0.5B
---
@@ -12,6 +13,9 @@ A brief description of what this model does and how it’s unique or relevant:
For simplified binary moderation tasks, the model can be used to produce a single “safe”/“unsafe” label by taking the maximum of the 12 subcategory probabilities and comparing it to a given threshold (e.g., 0.5). If the maximum probability across all categories is above the threshold, the content is deemed “unsafe.” Otherwise, it is considered “safe.”
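The thresholding rule described above can be sketched as follows. This is a minimal illustration rather than the model's official inference code: it assumes the 12 subcategory probabilities have already been computed, and the function name `binary_label` and the sample values are hypothetical.

```python
from typing import Sequence

NUM_CATEGORIES = 12  # the model card describes 12 subcategory probabilities


def binary_label(probs: Sequence[float], threshold: float = 0.5) -> str:
    """Collapse per-category probabilities into one "safe"/"unsafe" label."""
    if len(probs) != NUM_CATEGORIES:
        raise ValueError(
            f"expected {NUM_CATEGORIES} probabilities, got {len(probs)}"
        )
    # Content is "unsafe" only when the highest category probability
    # is strictly above the threshold; otherwise it is "safe".
    return "unsafe" if max(probs) > threshold else "safe"


# Hypothetical probabilities, made up for illustration (not real model output).
example_probs = [0.02, 0.01, 0.73, 0.05, 0.01, 0.03,
                 0.02, 0.10, 0.04, 0.01, 0.02, 0.06]

print(binary_label(example_probs))       # maximum 0.73 is above 0.5
print(binary_label(example_probs, 0.9))  # maximum 0.73 is not above 0.9
```

Lowering the threshold flags more content as unsafe, while raising it flags less, so the value is best chosen against the false-positive rate a given application can tolerate.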
DuoGuard-0.5B is built upon Qwen 2.5 (0.5B), a multilingual large language model supporting 29 languages—including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic. DuoGuard-0.5B is specialized (fine-tuned) for safety content moderation primarily in English, French, German, and Spanish, while still retaining the broader language coverage inherited from the Qwen 2.5 base model. It is provided with open weights.
## How to Use
A quick code snippet or set of instructions on how to load and use the model in an application:
```python
---
license: apache-2.0
+pipeline_tag: text-classification
base_model:
- Qwen/Qwen2.5-0.5B
---
For simplified binary moderation tasks, the model can be used to produce a single “safe”/“unsafe” label by taking the maximum of the 12 subcategory probabilities and comparing it to a given threshold (e.g., 0.5). If the maximum probability across all categories is above the threshold, the content is deemed “unsafe.” Otherwise, it is considered “safe.”
DuoGuard-0.5B is built upon Qwen 2.5 (0.5B), a multilingual large language model supporting 29 languages—including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic. DuoGuard-0.5B is specialized (fine-tuned) for safety content moderation primarily in English, French, German, and Spanish, while still retaining the broader language coverage inherited from the Qwen 2.5 base model. It is provided with open weights.
+
+It is presented in the paper [DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails](https://huggingface.co/papers/arXiv:2502.05163)
+
## How to Use
A quick code snippet or set of instructions on how to load and use the model in an application:
```python