---
title: OpenFactCheck Prerelease
emoji: 
colorFrom: green
colorTo: purple
sdk: streamlit
app_file: src/openfactcheck/app/app.py
pinned: false
---

<p align="center">
  <img alt="OpenFactCheck Logo" src="https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/assets/splash.png" height="120" />
  <p align="center">An Open-source Factuality Evaluation Demo for LLMs
    <br>
  </p>
</p>

---

<p align="center">
<a href="https://github.com/hasaniqbal777/OpenFactCheck/actions/workflows/release.yaml">
    <img src="https://img.shields.io/github/actions/workflow/status/hasaniqbal777/openfactcheck/release.yaml?logo=github&label=Release" alt="Release">
</a>
<a href="https://readthedocs.org/projects/openfactcheck/builds/">
    <img alt="Docs" src="https://img.shields.io/readthedocs/openfactcheck?logo=readthedocs&label=Docs">
</a>
<br>
<a href="https://gnu.org/licenses/gpl-3.0.html">
    <img src="https://img.shields.io/github/license/hasaniqbal777/openfactcheck" alt="License">
</a>
<a href="https://pypi.org/project/openfactcheck/">
    <img src="https://img.shields.io/pypi/pyversions/openfactcheck.svg" alt="Python Version">
</a>
<a href="https://pypi.org/project/openfactcheck/">
    <img src="https://img.shields.io/pypi/v/openfactcheck.svg" alt="PyPI Latest Release">
</a>
<a href="https://arxiv.org/abs/2405.05583"><img src="https://img.shields.io/badge/arXiv-2405.05583-B31B1B" alt="arXiv"></a>
<a href="https://zenodo.org/doi/10.5281/zenodo.13358664"><img src="https://img.shields.io/badge/DOI-10.5281/zenodo.13358664-blue" alt="DOI"></a>
</p>

---

<p align="center">
    <a href="#overview">Overview</a> •
    <a href="#installation">Installation</a> •
    <a href="#usage">Usage</a> •
    <a href="https://huggingface.co/spaces/hasaniqbal777/OpenFactCheck">HuggingFace Demo</a> •
    <a href="https://openfactcheck.readthedocs.io/">Documentation</a>
</p>

## Overview

OpenFactCheck is an open-source repository designed to facilitate the evaluation and enhancement of factuality in responses generated by large language models (LLMs). This project aims to integrate various fact-checking tools into a unified framework and provide comprehensive evaluation pipelines.

<img src="https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/assets/architecture.png" width="100%">

## Installation

You can install the package from PyPI using pip:

```bash
pip install openfactcheck
```
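
To confirm that the install succeeded, you can query the package metadata from Python. The snippet below is a minimal sketch that uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of OpenFactCheck.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("openfactcheck")
    print(found if found else "openfactcheck is not installed in this environment")
```

If the script reports that the package is missing, check that `pip` and `python` point to the same environment.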

## Usage

First, initialize an `OpenFactCheckConfig` object, then pass it to `OpenFactCheck`:

```python
from openfactcheck import OpenFactCheck, OpenFactCheckConfig

# Initialize the OpenFactCheck object
config = OpenFactCheckConfig()
ofc = OpenFactCheck(config)
```

### Response Evaluation

You can evaluate a response using the `ResponseEvaluator` class.

```python
# Evaluate a response (any text generated by an LLM)
response = "<LLM response text>"  # placeholder input
result = ofc.ResponseEvaluator.evaluate(response)
```

### LLM Evaluation

We provide [FactQA](https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/src/openfactcheck/templates/llm/questions.csv), a dataset of 6480 questions for evaluating LLMs. Once you have the responses from the LLM, you can evaluate them using the `LLMEvaluator` class.

```python
# Evaluate an LLM from its saved responses (both values are placeholders)
result = ofc.LLMEvaluator.evaluate(
    model_name="<model_name>",
    input_path="<path_to_llm_responses>",
)
```
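
Before the evaluation step, the model's answers have to be collected into a file. The sketch below shows one way to do that with the standard library. It is an illustration only: the column names `index`, `question`, and `response`, the CSV output format, and the helper names are all assumptions, so check the documentation for the layout that `LLMEvaluator` actually expects. The `ask` callable stands in for whatever client you use to query your model.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List, TextIO


def collect_responses(questions: Iterable[Dict[str, str]],
                      ask: Callable[[str], str]) -> List[Dict[str, str]]:
    """Query the model once per question row and keep the row's index.

    Assumes each row has "index" and "question" keys (hypothetical names).
    """
    return [{"index": row["index"], "response": ask(row["question"])}
            for row in questions]


def write_responses(rows: List[Dict[str, str]], out: TextIO) -> None:
    """Write the collected rows as CSV with a header line."""
    writer = csv.DictWriter(out, fieldnames=["index", "response"])
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Two made-up questions standing in for the real question file.
    sample = io.StringIO("index,question\n1,What is the capital of France?\n")

    def fake_llm(question: str) -> str:
        # Stub in place of a real model call.
        return "stub answer to: " + question

    rows = collect_responses(csv.DictReader(sample), fake_llm)
    buffer = io.StringIO()
    write_responses(rows, buffer)
    print(buffer.getvalue())
```

Keeping the question index next to each response makes it possible to match answers back to questions later, whatever the final file layout turns out to be.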

### Checker Evaluation

We provide [FactBench](https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/src/openfactcheck/templates/factchecker/claims.jsonl), a dataset of 4507 claims for evaluating fact-checkers. Once you have the responses from the fact-checker, you can evaluate them using the `CheckerEvaluator` class.

```python
# Evaluate a fact-checker from its saved outputs (both values are placeholders)
result = ofc.CheckerEvaluator.evaluate(
    checker_name="<checker_name>",
    input_path="<path_to_checker_responses>",
)
```
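
The claims file uses the JSON Lines format, with one JSON object per line. As a rough illustration of what scoring a fact-checker against labeled claims involves, the sketch below parses JSON Lines and computes plain accuracy. It is not how `CheckerEvaluator` computes its metrics: the field names `claim` and `label`, the boolean labels, and the toy data are all assumptions made for this example.

```python
import json
from typing import Dict, Iterable, List


def read_jsonl(lines: Iterable[str]) -> List[dict]:
    """Parse JSON Lines input: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def accuracy(gold: List[dict], predicted: Dict[str, bool]) -> float:
    """Fraction of gold claims whose predicted label equals the gold label.

    A claim missing from `predicted` counts as incorrect.
    """
    if not gold:
        raise ValueError("no claims to score")
    correct = sum(1 for item in gold
                  if predicted.get(item["claim"]) == item["label"])
    return correct / len(gold)


if __name__ == "__main__":
    # Made-up claims in an assumed schema, not taken from FactBench.
    toy_lines = [
        '{"claim": "Water boils at 100 degrees Celsius at sea level.", "label": true}',
        '{"claim": "The Moon is made of cheese.", "label": false}',
    ]
    gold = read_jsonl(toy_lines)
    # A hypothetical checker that marks every claim as true.
    predicted = {item["claim"]: True for item in gold}
    print(accuracy(gold, predicted))
```

Accuracy alone can be misleading when true and false claims are unevenly distributed, which is one reason to rely on the library's own evaluation output for reported results.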

## Cite

If you use OpenFactCheck in your research, please cite the following:

```bibtex
@article{wang2024openfactcheck,
  title        = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
  author       = {Wang, Yuxia and Wang, Minghan and Iqbal, Hasan and Georgiev, Georgi and Geng, Jiahui and Nakov, Preslav},
  journal      = {arXiv preprint arXiv:2405.05583},
  year         = {2024}
}

@article{iqbal2024openfactcheck,
  title        = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
  author       = {Iqbal, Hasan and Wang, Yuxia and Wang, Minghan and Georgiev, Georgi and Geng, Jiahui and Gurevych, Iryna and Nakov, Preslav},
  journal      = {arXiv preprint arXiv:2408.11832},
  year         = {2024}
}

@software{hasan_iqbal_2024_13358665,
  author       = {Hasan Iqbal},
  title        = {hasaniqbal777/OpenFactCheck: v0.3.0},
  month        = {aug},
  year         = {2024},
  publisher    = {Zenodo},
  version      = {v0.3.0},
  doi          = {10.5281/zenodo.13358665},
  url          = {https://doi.org/10.5281/zenodo.13358665}
}
```