CaasiHUANG committed on
Commit
552ff1d
1 Parent(s): 9788bb1

Update README.md

  1. README.md +53 -0
README.md CHANGED
@@ -1,3 +1,56 @@
  ---
  license: apache-2.0
+ language:
+ - zh
+ metrics:
+ - accuracy
+ - recall
+ - precision
+ library_name: transformers
+ pipeline_tag: text-classification
  ---
+ # Flames-scorer
+
+ This is the designated scorer for the Flames benchmark, a highly adversarial Chinese-language benchmark for evaluating the value alignment of LLMs.
+ For more details, please refer to our [paper](https://arxiv.org/abs/2311.06899) and [GitHub repo](https://github.com/AIFlames/Flames/tree/main).
+
+ ## Model Details
+ * Developed by: Shanghai AI Lab and Fudan NLP Group.
+ * Model type: We use InternLM-chat-7b as the backbone and build a separate classifier for each dimension on top of it. The scorer is then trained with a multi-task approach.
+ * Language(s): Chinese
+ * Paper: [FLAMES: Benchmarking Value Alignment of LLMs in Chinese](https://arxiv.org/abs/2311.06899)
+ * Contact: For questions and comments about the model, please email tengyan@pjlab.org.cn.
+
+ ## Usage
+
+ Set up the environment with:
+ ```shell
+ pip install -r requirements.txt
+ ```
+ Then use `infer.py` to evaluate your model:
+ ```shell
+ python infer.py --data_path YOUR_DATA_FILE.jsonl
+ ```
+ Please note that:
+ 1. Each entry in `YOUR_DATA_FILE.jsonl` must include the fields "dimension", "prompt", and "response".
+ 2. The predicted score is stored in the "predicted" field, and the output is saved in the same directory as `YOUR_DATA_FILE.jsonl`.
+ 3. The accuracy of the Flames-scorer on out-of-distribution prompts (i.e., prompts not included in the Flames-prompts) has not been evaluated, so its predictions for such data may not be reliable.
+
+ ## Citation
+ If you find this scorer helpful, please cite the paper.
+ ```bibtex
+ @misc{huang2023flames,
+ title={Flames: Benchmarking Value Alignment of Chinese Large Language Models},
+ author={Kexin Huang and Xiangyang Liu and Qianyu Guo and Tianxiang Sun and Jiawei Sun and Yaru Wang and Zeyang Zhou and Yixu Wang and Yan Teng and Xipeng Qiu and Yingchun Wang and Dahua Lin},
+ year={2023},
+ eprint={2311.06899},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```
+
+ ---
+
+
+ license: apache-2.0
+ ---
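
The Usage notes in the README added by this commit require each JSONL entry to carry the fields "dimension", "prompt", and "response". The following is a minimal sketch, not part of the commit, of how such a file might be prepared and read back in Python. The helper names `write_entries` and `read_entries` are hypothetical, and the code assumes only what the README states about the file format; the dimension strings that `infer.py` actually accepts are not listed here and should be checked against the GitHub repo.

```python
import json
from pathlib import Path

# Fields the README says every entry must include.
REQUIRED_FIELDS = ("dimension", "prompt", "response")


def write_entries(entries, path):
    """Check each entry for the required fields, then write one JSON object per line."""
    entries = list(entries)
    for i, entry in enumerate(entries):
        missing = [key for key in REQUIRED_FIELDS if key not in entry]
        if missing:
            raise ValueError(f"entry {i} is missing fields: {missing}")
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for entry in entries:
            # ensure_ascii=False keeps Chinese text human-readable in the file.
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return path


def read_entries(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

A file written with `write_entries(rows, "YOUR_DATA_FILE.jsonl")` could then be passed to `--data_path`, and `read_entries` could load the scorer's output file to inspect the "predicted" field. Validation runs before the file is opened, so a malformed entry does not overwrite an existing file.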