nonstopfor
committed on
Commit
•
109523f
1
Parent(s):
f300599
Create README.md
README.md
ADDED
@@ -0,0 +1,16 @@
---
license: mit
language:
- en
- zh
---
## Introduction
This is the ShieldLM model ([paper link](xxx)), initialized from [internlm2-chat-7b](https://huggingface.co/internlm/internlm2-chat-7b). ShieldLM is a bilingual (Chinese and English) safety detector designed to detect safety issues in LLM generations. It aligns with general human safety standards, supports fine-grained customizable detection rules, and provides explanations for its decisions.
Refer to our [GitHub repository](https://github.com/thu-coai/ShieldLM) for more details.
## Usage
Please refer to our [GitHub repository](https://github.com/thu-coai/ShieldLM) for detailed usage instructions.
## Performance
ShieldLM demonstrates impressive detection performance across four in-distribution (ID) and out-of-distribution (OOD) test sets compared with strong baselines such as GPT-4, Llama Guard, and Perspective API.
Refer to [our paper](xxx) for more detailed evaluation results.