---
title: OpenFactCheck Prerelease
emoji: ✅
colorFrom: green
colorTo: purple
sdk: streamlit
app_file: src/openfactcheck/app/app.py
pinned: false
---
<p align="center">
<img alt="OpenFactCheck Logo" src="https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/assets/splash.png" height="120" />
<p align="center">An Open-source Factuality Evaluation Demo for LLMs
<br>
</p>
</p>
---
<p align="center">
<a href="https://github.com/hasaniqbal777/OpenFactCheck/actions/workflows/release.yaml">
<img src="https://img.shields.io/github/actions/workflow/status/hasaniqbal777/openfactcheck/release.yaml?logo=github&label=Release" alt="Release">
</a>
<a href="https://readthedocs.org/projects/openfactcheck/builds/">
<img alt="Docs" src="https://img.shields.io/readthedocs/openfactcheck?logo=readthedocs&label=Docs">
</a>
<br>
<a href="https://gnu.org/licenses/gpl-3.0.html">
<img src="https://img.shields.io/github/license/hasaniqbal777/openfactcheck" alt="License">
</a>
<a href="https://pypi.org/project/openfactcheck/">
<img src="https://img.shields.io/pypi/pyversions/openfactcheck.svg" alt="Python Version">
</a>
<a href="https://pypi.org/project/openfactcheck/">
<img src="https://img.shields.io/pypi/v/openfactcheck.svg" alt="PyPI Latest Release">
</a>
<a href="https://arxiv.org/abs/2405.05583"><img src="https://img.shields.io/badge/arXiv-2405.05583-B31B1B" alt="arXiv"></a>
<a href="https://zenodo.org/doi/10.5281/zenodo.13358664"><img src="https://img.shields.io/badge/DOI-10.5281/zenodo.13358664-blue" alt="DOI"></a>
</p>
---
<p align="center">
<a href="#overview">Overview</a> •
<a href="#installation">Installation</a> •
<a href="#usage">Usage</a> •
<a href="https://huggingface.co/spaces/hasaniqbal777/OpenFactCheck">HuggingFace Demo</a> •
<a href="https://openfactcheck.readthedocs.io/">Documentation</a>
</p>
## Overview
OpenFactCheck is an open-source framework for evaluating and improving the factuality of responses generated by large language models (LLMs). It integrates a range of fact-checking tools into a unified framework and provides comprehensive evaluation pipelines.
<img src="https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/assets/architecture.png" width="100%">
## Installation
You can install the package from PyPI using pip:
```bash
pip install openfactcheck
```
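You can also install the latest development version directly from the GitHub repository with a standard pip-over-git install:

```bash
pip install git+https://github.com/hasaniqbal777/OpenFactCheck.git
```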
## Usage
First, initialize an `OpenFactCheckConfig` object, then use it to create an `OpenFactCheck` instance.
```python
from openfactcheck import OpenFactCheck, OpenFactCheckConfig
# Initialize the OpenFactCheck object
config = OpenFactCheckConfig()
ofc = OpenFactCheck(config)
```
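Many of the underlying fact-checking solvers call external services, so you will likely need API keys available as environment variables before constructing the object. The exact variables depend on which solvers are enabled and are an assumption here; check the documentation for the definitive list. A minimal sketch:

```python
import os

# Set credentials before constructing OpenFactCheck. Which keys are
# required depends on the solvers you enable (these names are assumptions;
# see the documentation for the exact list).
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["SERPER_API_KEY"] = "..."
```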
### Response Evaluation
You can evaluate a response using the `ResponseEvaluator` class.
```python
# Evaluate a response (the text shown is an illustrative placeholder)
result = ofc.ResponseEvaluator.evaluate(response="The Eiffel Tower is located in Paris, France.")
```
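To check several responses, a plain loop over `evaluate` works; this sketch assumes only the single-response call shown above:

```python
responses = [
    "The Eiffel Tower is located in Paris, France.",
    "The Great Wall of China is visible from the Moon with the naked eye.",
]

# Evaluate each response independently and collect the results
results = [ofc.ResponseEvaluator.evaluate(response=r) for r in responses]
```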
### LLM Evaluation
We provide [FactQA](https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/src/openfactcheck/templates/llm/questions.csv), a dataset of 6480 questions for evaluating LLMs. Once you have the responses from the LLM, you can evaluate them using the `LLMEvaluator` class.
```python
# Evaluate an LLM (the model name and path are placeholders)
result = ofc.LLMEvaluator.evaluate(model_name="your-model-name",
                                   input_path="my_model_responses.csv")
```
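To produce the input file above, collect one response per FactQA question from the model under evaluation. A minimal preparation sketch, assuming the CSV has a `question` column and that `generate_answer` stands in for your own model call (both are assumptions; check the documentation for the expected input schema):

```python
import pandas as pd

def generate_answer(question: str) -> str:
    """Hypothetical stand-in for your own LLM call; replace with a real one."""
    return "..."

# Load the FactQA questions (the "question" column name is an assumption;
# inspect questions.csv to confirm the actual schema)
questions = pd.read_csv("questions.csv")

# Collect one response per question from the model under evaluation
questions["response"] = questions["question"].apply(generate_answer)

# Write the responses to the file passed as input_path above
questions.to_csv("my_model_responses.csv", index=False)
```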
### Checker Evaluation
We provide [FactBench](https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/src/openfactcheck/templates/factchecker/claims.jsonl), a dataset of 4507 claims for evaluating fact-checkers. Once you have the responses from the fact-checker, you can evaluate them using the `CheckerEvaluator` class.
```python
# Evaluate a fact-checker (the checker name and path are placeholders)
result = ofc.CheckerEvaluator.evaluate(checker_name="your-checker-name",
                                       input_path="my_checker_responses.jsonl")
```
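To produce the input file above, run your fact-checker over the FactBench claims and record a verdict per claim. A minimal preparation sketch, assuming each JSONL record has a `claim` field and that `check_claim` stands in for your own checker (both are assumptions; check the documentation for the expected input schema):

```python
import json

def check_claim(claim: str) -> str:
    """Hypothetical stand-in for your own fact-checker; replace with a real one."""
    return "True"

# Read FactBench claims (the "claim" field name is an assumption;
# inspect claims.jsonl to confirm the actual schema)
with open("claims.jsonl") as f:
    records = [json.loads(line) for line in f]

# Attach your checker's verdict to each claim
for record in records:
    record["label"] = check_claim(record["claim"])

# Write the verdicts to the file passed as input_path above
with open("my_checker_responses.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```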
## Cite
If you use OpenFactCheck in your research, please cite the following:
```bibtex
@article{wang2024openfactcheck,
title = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
author = {Wang, Yuxia and Wang, Minghan and Iqbal, Hasan and Georgiev, Georgi and Geng, Jiahui and Nakov, Preslav},
journal = {arXiv preprint arXiv:2405.05583},
year = {2024}
}
@article{iqbal2024openfactcheck,
title = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
author = {Iqbal, Hasan and Wang, Yuxia and Wang, Minghan and Georgiev, Georgi and Geng, Jiahui and Gurevych, Iryna and Nakov, Preslav},
journal = {arXiv preprint arXiv:2408.11832},
year = {2024}
}
@software{hasan_iqbal_2024_13358665,
author = {Hasan Iqbal},
title = {hasaniqbal777/OpenFactCheck: v0.3.0},
month = {aug},
year = {2024},
publisher = {Zenodo},
version = {v0.3.0},
doi = {10.5281/zenodo.13358665},
url = {https://doi.org/10.5281/zenodo.13358665}
}
```