Vily1998
commited on
Commit
•
0314fc0
1
Parent(s):
2485ec4
init
Browse files- ._tokenizer.json +0 -0
- README.md +34 -0
- truthx_results.png +0 -0
._tokenizer.json
DELETED
Binary file (4.1 kB)
|
|
README.md
CHANGED
@@ -1,3 +1,37 @@
|
|
1 |
---
|
2 |
license: gpl-3.0
|
3 |
---
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
---
|
2 |
license: gpl-3.0
|
3 |
---
|
4 |
+
# TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space
|
5 |
+
|
6 |
+
> [Shaolei Zhang](https://zhangshaolei1998.github.io/), Tian Yu, [Yang Feng](https://people.ucas.edu.cn/~yangfeng?language=en)*
|
7 |
+
|
8 |
+
**TruthX** is an inference-time method to elicit the truthfulness of LLMs by editing their internal representations in truthful space, thereby mitigating the hallucinations of LLMs. On the [TruthfulQA benchmark](https://paperswithcode.com/sota/question-answering-on-truthfulqa), TruthX yields an average **enhancement of 20% in truthfulness** across 13 advanced LLMs.
|
9 |
+
|
10 |
+
<div align="center">
|
11 |
+
<img src="./truthx_results.png" alt="img" width="100%" />
|
12 |
+
</div>
|
13 |
+
<p align="center">
|
14 |
+
TruthfulQA MC1 accuracy of TruthX across 13 advanced LLMs
|
15 |
+
</p>
|
16 |
+
|
17 |
+
This repo provide **Llama-2-7B-Chat-TruthX**, a Llama-2-7B-Chat model with baked-in TruthX model. You can directly download this baked-in model and use it like standard Llama, no additional operations are required.
|
18 |
+
|
19 |
+
## Quick Starts
|
20 |
+
Inference with Llama-2-7B-Chat-TruthX:
|
21 |
+
|
22 |
+
```python
|
23 |
+
import torch
|
24 |
+
from transformers import AutoTokenizer, AutoModelForCausalLM
|
25 |
+
|
26 |
+
llama2chat_with_truthx = "ICTNLP/Llama-2-7b-chat-TruthX"
|
27 |
+
tokenizer = AutoTokenizer.from_pretrained(llama2chat_with_truthx, trust_remote_code=True)
|
28 |
+
model = AutoModelForCausalLM.from_pretrained(llama2chat_with_truthx, trust_remote_code=True,torch_dtype=torch.float16).cuda()
|
29 |
+
|
30 |
+
question = "What are the benefits of eating an apple a day?"
|
31 |
+
encoded_inputs = tokenizer(question, return_tensors="pt")["input_ids"]
|
32 |
+
outputs = model.generate(encoded_inputs.cuda())[0, encoded_inputs.shape[-1] :]
|
33 |
+
outputs_text = tokenizer.decode(outputs, skip_special_tokens=True).strip()
|
34 |
+
print(outputs_text)
|
35 |
+
```
|
36 |
+
|
37 |
+
Please refer to [GitHub repo](https://github.com/ictnlp/TruthX) and our paper for more details.
|
truthx_results.png
ADDED