<!--
title: OpenFactCheck
emoji: ✅
colorFrom: green
colorTo: purple
sdk: streamlit
app_file: src/openfactcheck/app/app.py
pinned: false
-->

<p align="center">
  <img alt="OpenFactCheck Logo" src="https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/assets/splash.png" height="120" />
  <p align="center">An Open-source Factuality Evaluation Demo for LLMs
    <br>
  </p>
</p>

---

<p align="center">
<a href="https://github.com/hasaniqbal777/OpenFactCheck/actions/workflows/release.yaml">
    <img src="https://img.shields.io/github/actions/workflow/status/hasaniqbal777/openfactcheck/release.yaml?logo=github&label=Release" alt="Release">
</a>
<a href="https://readthedocs.org/projects/openfactcheck/builds/">
    <img alt="Docs" src="https://img.shields.io/readthedocs/openfactcheck?logo=readthedocs&label=Docs">
</a>
<br>
<a href="https://opensource.org/licenses/Apache-2.0">
    <img src="https://img.shields.io/github/license/hasaniqbal777/openfactcheck" alt="License: Apache-2.0">
</a>
<a href="https://pypi.org/project/openfactcheck/">
    <img src="https://img.shields.io/pypi/pyversions/openfactcheck.svg" alt="Python Version">
</a>
<a href="https://pypi.org/project/openfactcheck/">
    <img src="https://img.shields.io/pypi/v/openfactcheck.svg" alt="PyPI Latest Release">
</a>
<a href="https://arxiv.org/abs/2405.05583"><img src="https://img.shields.io/badge/arXiv-2405.05583-B31B1B" alt="arXiv"></a>
<a href="https://zenodo.org/doi/10.5281/zenodo.13358664"><img src="https://img.shields.io/badge/DOI-10.5281/zenodo.13358664-blue" alt="DOI"></a>
</p>

---

<p align="center">
    <a href="#overview">Overview</a> |
    <a href="#installation">Installation</a> |
    <a href="#usage">Usage</a> |
    <a href="https://huggingface.co/spaces/hasaniqbal777/OpenFactCheck">HuggingFace Demo</a> |
    <a href="https://openfactcheck.readthedocs.io/">Documentation</a>
</p>

## Overview

OpenFactCheck is an open-source framework for evaluating and improving the factuality of responses generated by large language models (LLMs). It integrates a variety of fact-checking tools into a unified framework and provides comprehensive evaluation pipelines.

<img src="https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/assets/architecture.png" width="100%">

## Installation

You can install the package from PyPI using pip:

```bash
pip install openfactcheck
```
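
You can also install the latest development version directly from the repository (a standard pip feature; this assumes you want the unreleased main branch):

```bash
pip install git+https://github.com/hasaniqbal777/OpenFactCheck.git
```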

## Usage

First, initialize an `OpenFactCheckConfig` object and use it to create an `OpenFactCheck` instance.

```python
from openfactcheck import OpenFactCheck, OpenFactCheckConfig

# Initialize the OpenFactCheck object
config = OpenFactCheckConfig()
ofc = OpenFactCheck(config)
```
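
Depending on which solvers your configuration enables, some pipeline components may require external API credentials (for example, for OpenAI models or web-search-based evidence retrieval). The variable names below are illustrative, not a definitive list; check the documentation for the exact keys your pipeline needs.

```bash
# Illustrative only: the exact variables depend on the solvers you enable
export OPENAI_API_KEY="sk-..."
export SERPER_API_KEY="your-serper-key"
```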

### Response Evaluation

You can evaluate a response using the `ResponseEvaluator` class.

```python
# Evaluate a response (the response string here is a placeholder)
result = ofc.ResponseEvaluator.evaluate(response="The Eiffel Tower is located in Paris.")
```

### LLM Evaluation

We provide [FactQA](https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/src/openfactcheck/templates/llm/questions.csv), a dataset of 6,480 questions for evaluating LLMs. Once you have the responses from the LLM, you can evaluate them using the `LLMEvaluator` class.

```python
# Evaluate an LLM (the model name and responses path are placeholders)
result = ofc.LLMEvaluator.evaluate(model_name="my-llm",
                                   input_path="factqa_responses.csv")
```
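
Before running the evaluator, you need to collect your model's answers to the FactQA questions in a file. The sketch below is a minimal example, assuming the questions CSV has a `question` column and that the evaluator accepts a CSV of responses; the column names, file paths, and `my_llm_answer` function are placeholders, not the documented schema.

```python
import pandas as pd

def my_llm_answer(question: str) -> str:
    """Placeholder: replace with a call to the LLM under evaluation."""
    return "answer to: " + question

# Load a local copy of the FactQA questions (assumed 'question' column)
questions = pd.read_csv("questions.csv")

# Generate one response per question with the model under test
questions["response"] = questions["question"].apply(my_llm_answer)

# Write the responses file that LLMEvaluator.evaluate() is pointed at
questions.to_csv("factqa_responses.csv", index=False)
```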

### Checker Evaluation

We provide [FactBench](https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/src/openfactcheck/templates/factchecker/claims.jsonl), a dataset of 4,507 claims for evaluating fact-checkers. Once you have the responses from the fact-checker, you can evaluate them using the `CheckerEvaluator` class.

```python
# Evaluate a fact-checker (the checker name and responses path are placeholders)
result = ofc.CheckerEvaluator.evaluate(checker_name="my-checker",
                                       input_path="factbench_responses.jsonl")
```
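
As with LLM evaluation, the fact-checker under test must first label the FactBench claims. A minimal sketch, assuming each line of `claims.jsonl` is a JSON object with a `claim` field and that the evaluator reads one verdict per line; the field names and `my_checker_verdict` function are placeholders, not the documented schema.

```python
import json

def my_checker_verdict(claim: str) -> bool:
    """Placeholder: replace with a call to the fact-checker under evaluation."""
    return True

# Label every FactBench claim with the checker under test
with open("claims.jsonl") as fin, open("factbench_responses.jsonl", "w") as fout:
    for line in fin:
        record = json.loads(line)
        record["verdict"] = my_checker_verdict(record["claim"])
        fout.write(json.dumps(record) + "\n")
```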

## Cite

If you use OpenFactCheck in your research, please cite the following:

```bibtex
@article{wang2024openfactcheck,
  title        = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
  author       = {Wang, Yuxia and Wang, Minghan and Iqbal, Hasan and Georgiev, Georgi and Geng, Jiahui and Nakov, Preslav},
  journal      = {arXiv preprint arXiv:2405.05583},
  year         = {2024}
}

@software{hasan_iqbal_2024_13358665,
  author       = {Hasan Iqbal},
  title        = {hasaniqbal777/OpenFactCheck: v0.3.0},
  month        = {aug},
  year         = {2024},
  publisher    = {Zenodo},
  version      = {v0.3.0},
  doi          = {10.5281/zenodo.13358665},
  url          = {https://doi.org/10.5281/zenodo.13358665}
}
```