---
language:
- multilingual
- en
license: apache-2.0
pipeline_tag: text-classification
---

# LLM hallucination detector

This LLM hallucination detector, based on the hierarchical [XLM-RoBERTa-XL](https://huggingface.co/facebook/xlm-roberta-xl), was developed for participation in the [SemEval-2024 Task-6 - SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes](https://helsinki-nlp.github.io/shroom) (model-agnostic track).

## Model description

Text...

## Intended uses & limitations

This model is primarily intended as a reference-based detector of hallucinations in LLM output that needs no additional information about the LLM's type or architecture (i.e. it works in model-agnostic mode). Reference-based detection means that the detector considers not only the human question and the answer generated by the verified LLM, but also the reference answer to that question. Consequently, this hallucination detector is not applicable when the reference answer is unknown. However, in some cases the reference answers are known (for example, when we analyze an LLM's responses on an annotated test set and want to separate hallucinations from ordinary errors such as undergeneration or part-of-speech mistakes), and then the proposed detector is extremely useful.
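
For illustration, a minimal reference-based datapoint might look as follows. This is only a sketch of the assumed minimum required fields, with values taken from the MT example in the Usage section below; annotation fields such as `labels` and `p(Hallucination)` are needed only for evaluation, not for prediction.

```python
# A hypothetical minimal datapoint for reference-based detection:
# 'hyp' is the checked LLM output, 'src'/'tgt' provide the generation
# context, and 'tgt' plays the role of the reference answer.
sample = {
    'hyp': 'You can go with me perfectly.',    # answer generated by the verified LLM
    'src': 'Ты вполне можешь пойти со мной.',  # input that was given to the LLM
    'tgt': 'You may as well come with me.',    # reference (gold) answer
    'task': 'MT',                              # one of 'PG', 'MT' or 'DM'
}
```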

## Usage

You need to install the [pytorch-metric-learning](https://github.com/KevinMusgrave/pytorch-metric-learning) library to use this model.
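Assuming the library is published on PyPI under its repository name, a typical install command is:

```bash
pip install pytorch-metric-learning
```

After that, you can use this model directly with a pipeline for text classification: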
```python
from typing import Dict

from transformers import pipeline
import torch


def sample_to_str(sample: Dict[str, str]) -> str:
    """ It converts a datapoint to an input text for an encoder-based classifier (such as RoBERTa).
    :param sample: the datapoint
    :return: the input text for the classifier (i.e. the LLM hallucination detector).
    """
    possible_tasks = {
        'PG',  # paraphrase generation
        'MT',  # machine translation
        'DM',  # definition modeling
    }
    checked_llm_prediction = ' '.join(sample['hyp'].strip().split())
    llm_task = sample['task']
    if llm_task not in possible_tasks:
        raise ValueError(f'The task {llm_task} is not supported!')
    if llm_task == 'PG':
        # for paraphrase generation, the source text is the generation context
        context = ' '.join(sample['src'].strip().split())
        united_prompt = 'The verified system\'s task is a paraphrase generation.'
    else:
        # for machine translation and definition modeling, the reference target is the context
        context = ' '.join(sample['tgt'].strip().split())
        if llm_task == 'MT':
            united_prompt = 'The verified system\'s task is a machine translation.'
        else:
            united_prompt = 'The verified system\'s task is a definition modeling.'
    united_prompt += ' The sentence generated by the verified system: '
    united_prompt += checked_llm_prediction
    if united_prompt[-1].isalnum():
        united_prompt += '.'
    united_prompt += f' The generation context: {context}'
    if united_prompt[-1].isalnum():
        united_prompt += '.'
    return united_prompt


# The input data format is based on data for the model-agnostic track of SHROOM
# https://helsinki-nlp.github.io/shroom
input_data = [
    {
        "hyp": "Resembling or characteristic of a weasel.",
        "ref": "tgt",
        "src": "The writer had just entered into his eighteenth year , when he met at the table of a certain Anglo - Germanist an individual , apparently somewhat under thirty , of middle stature , a thin and <define> weaselly </define> figure , a sallow complexion , a certain obliquity of vision , and a large pair of spectacles .",
        "tgt": "Resembling a weasel (in appearance).",
        "model": "",
        "task": "DM",
        "labels": [
            "Hallucination",
            "Not Hallucination",
            "Not Hallucination",
            "Not Hallucination",
            "Not Hallucination"
        ],
        "label": "Not Hallucination",
        "p(Hallucination)": 0.2
    },
    {
        "hyp": "I thought you'd be surprised at me too.",
        "ref": "either",
        "src": "I thought so, too.",
        "tgt": "That was my general impression as well.",
        "model": "",
        "task": "PG",
        "labels": [
            "Hallucination",
            "Hallucination",
            "Hallucination",
            "Hallucination",
            "Hallucination"
        ],
        "label": "Hallucination",
        "p(Hallucination)": 1.0
    },
    {
        "hyp": "You can go with me perfectly.",
        "ref": "either",
        "src": "Ты вполне можешь пойти со мной.",
        "tgt": "You may as well come with me.",
        "model": "",
        "task": "MT",
        "labels": [
            "Not Hallucination",
            "Hallucination",
            "Hallucination",
            "Not Hallucination",
            "Hallucination"
        ],
        "label": "Hallucination",
        "p(Hallucination)": 0.6
    }
]

hallucination_detector = pipeline(
    task='text-classification',
    model='bond005/xlm-roberta-xl-hallucination-detector',
    framework='pt', trust_remote_code=True, device='cuda', torch_dtype=torch.float16
)

for sample in input_data:
    input_prompt = sample_to_str(sample)
    print('')
    print('==========')
    print(f' Task: {sample["task"]}')
    print(' Question for detector:')
    print(input_prompt)
    print('==========')
    print('TRUE')
    print(f' label: {sample["label"]}')
    print(f' p(Hallucination): {round(sample["p(Hallucination)"], 3)}')
    prediction = hallucination_detector(input_prompt)[0]
    predicted_label = prediction['label']
    if predicted_label == 'Hallucination':
        hallucination_probability = prediction['score']
    else:
        hallucination_probability = 1.0 - prediction['score']
    print('PREDICTED')
    print(f' label: {predicted_label}')
    print(f' p(Hallucination): {round(hallucination_probability, 3)}')
```

Running this script produces the following output:

```text

==========
 Task: DM
 Question for detector:
The verified system's task is a definition modeling. The sentence generated by the verified system: Resembling or characteristic of a weasel. The generation context: Resembling a weasel (in appearance).
==========
TRUE
 label: Not Hallucination
 p(Hallucination): 0.2
PREDICTED
 label: Not Hallucination
 p(Hallucination): 0.297

==========
 Task: PG
 Question for detector:
The verified system's task is a paraphrase generation. The sentence generated by the verified system: I thought you'd be surprised at me too. The generation context: I thought so, too.
==========
TRUE
 label: Hallucination
 p(Hallucination): 1.0
PREDICTED
 label: Hallucination
 p(Hallucination): 0.563

==========
 Task: MT
 Question for detector:
The verified system's task is a machine translation. The sentence generated by the verified system: You can go with me perfectly. The generation context: You may as well come with me.
==========
TRUE
 label: Hallucination
 p(Hallucination): 0.6
PREDICTED
 label: Not Hallucination
 p(Hallucination): 0.487
```

The Google Colaboratory version of [this script](https://colab.research.google.com/drive/1T5LOuYfLNI3bqz6W-Y6kEajk3SumxyqU?usp=sharing) is also available.
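
For scoring many samples at once, the `transformers` pipeline also accepts a list of inputs. The following is a minimal sketch that reuses the `hallucination_detector` and `sample_to_str` defined above; the `batch_size` value here is an arbitrary assumption, not a tuned setting.

```python
# Batched inference: convert every datapoint to a prompt and classify them together.
prompts = [sample_to_str(sample) for sample in input_data]
predictions = hallucination_detector(prompts, batch_size=4)
for sample, prediction in zip(input_data, predictions):
    # The pipeline returns the score of the predicted class, so the
    # hallucination probability must be recovered from the winning label.
    if prediction['label'] == 'Hallucination':
        p_hallucination = prediction['score']
    else:
        p_hallucination = 1.0 - prediction['score']
    print(f'{sample["task"]}: p(Hallucination) = {round(p_hallucination, 3)}')
```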