anakin87 committed on

Commit cbd0b83 • 1 Parent(s): 97688d7

extract entailment_checker
README.md CHANGED
@@ -27,6 +27,8 @@ license: apache-2.0
  - [Limits and possible improvements](#limits-and-possible-improvements)
  - [Repository structure](#repository-structure)
  - [Installation](#installation)
+ - [Entailment Checker node](#entailment-checker-node)
+ - [Fact Checking 🎸 Rocks!](#fact-checking--rocks)

  ### Idea
  💡 This project aims to show that a *naive and simple baseline* for fact checking can be built by combining dense retrieval and a textual entailment task.
@@ -42,7 +44,7 @@ In a nutshell, the flow is as follows:
  - [🧑‍🏫 Slides](./presentation/fact_checking_rocks.pdf)

  ### System description
- 🪄 This project is strongly based on [🔎 Haystack](https://github.com/deepset-ai/haystack), an open source NLP framework to realize search systems. The main components of our system are an indexing pipeline and a search pipeline.
+ 🪄 This project is strongly based on [🔎 Haystack](https://github.com/deepset-ai/haystack), an open source NLP framework that enables seamless use of Transformer models and LLMs to interact with your data. The main components of our system are an indexing pipeline and a search pipeline.

  #### Indexing pipeline
  * [Crawling](https://github.com/anakin87/fact-checking-rocks/blob/321ba7893bbe79582f8c052493acfda497c5b785/notebooks/get_wikipedia_data.ipynb): crawl data from Wikipedia, starting from the page [List of mainstream rock performers](https://en.wikipedia.org/wiki/List_of_mainstream_rock_performers) and using the [python wrapper](https://github.com/goldsmith/Wikipedia)
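To give a feel for the crawling step, here is a minimal sketch using the [python wrapper](https://github.com/goldsmith/Wikipedia) mentioned above. The seed page comes from the README; the stored fields and the small page cap are illustrative assumptions, not the notebook's exact code.

```python
# Minimal crawling sketch (pip install wikipedia).
# The seed page and library come from the README; the stored fields
# and the 10-page cap are illustrative assumptions.
import wikipedia

seed = wikipedia.page("List of mainstream rock performers")
corpus = []
for title in seed.links[:10]:  # cap the crawl for this sketch
    try:
        page = wikipedia.page(title, auto_suggest=False)
        corpus.append({"name": page.title, "url": page.url, "text": page.content})
    except (wikipedia.DisambiguationError, wikipedia.PageError):
        continue  # skip ambiguous or missing pages
print(f"Crawled {len(corpus)} pages")
```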
@@ -58,7 +60,8 @@ In a nutshell, the flow is as follows:
  * the user enters a factual statement
  * compute the embedding of the user statement using the same Sentence Transformer used for indexing (`msmarco-distilbert-base-tas-b`)
  * retrieve the K most relevant text passages stored in FAISS (along with their relevance scores)
- * **text entailment task**: compute the text entailment between each text passage (premise) and the user statement (hypothesis), using a Natural Language Inference model (`microsoft/deberta-v2-xlarge-mnli`). For every text passage, we have 3 scores (summing to 1): entailment, contradiction and neutral. *(For this task, I developed a custom Haystack node: `EntailmentChecker`)*
+ * the following steps are performed using the [`EntailmentChecker`, a custom Haystack node](https://github.com/anakin87/haystack-entailment-checker)
+ * **text entailment task**: compute the text entailment between each text passage (premise) and the user statement (hypothesis), using a Natural Language Inference model (`microsoft/deberta-v2-xlarge-mnli`). For every text passage, we have 3 scores (summing to 1): entailment, contradiction and neutral.
  * aggregate the text entailment scores: compute their weighted average, where the weight is the relevance score (see the formula below). **Now it is possible to tell if the knowledge base confirms, is neutral towards or disproves the user statement.**
  * *empirical consideration: if the first N passages (N<K) give strong evidence of entailment/contradiction (partial aggregate scores > 0.5), it is better not to consider the (K-N) less relevant documents.*

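The aggregation in the last two bullets is a relevance-weighted average with early stopping; in formula form (matching the node's code shown further down):

$$\mathrm{agg}_{\ell} = \frac{\sum_{i=1}^{n} s_i \, p_i(\ell)}{\sum_{i=1}^{n} s_i}, \qquad \ell \in \{\text{entailment}, \text{contradiction}, \text{neutral}\}$$

where $s_i$ is the relevance score of the $i$-th passage, $p_i(\ell)$ its NLI probability for label $\ell$, and $n \le K$ is the first prefix at which $\max(\mathrm{agg}_{\text{ent}}, \mathrm{agg}_{\text{con}}) > 0.5$ (or $n = K$ if that never happens).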
@@ -83,7 +86,15 @@ While keeping this simple approach, some **improvements** could be made:
  * [data folder](./data/): all necessary data, including original Wikipedia data, FAISS Index and prepared random statements

  ### Installation
- 💻 To install this project locally, follow these steps:
+ 💻
+ #### Entailment Checker node
+ If you want to build a similar system using the [`EntailmentChecker`](https://github.com/anakin87/haystack-entailment-checker), I strongly suggest taking a look at [the node repository](https://github.com/anakin87/haystack-entailment-checker). It can be easily installed with
+ ```bash
+ pip install haystack-entailment-checker
+ ```
+
+ #### Fact Checking 🎸 Rocks!
+ To install this project locally, follow these steps:
  * `git clone https://github.com/anakin87/fact-checking-rocks`
  * `cd fact-checking-rocks`
  * `pip install -r requirements.txt`
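Below is a minimal usage sketch of the packaged node. The import path comes from the updated `app_utils/backend_utils.py`, and the constructor/`run` interface mirrors the node extracted from this repo (see the deleted `app_utils/entailment_checker.py` below); treat the details as assumptions rather than the package's documented API.

```python
# Sketch: run the packaged EntailmentChecker on a hand-made document.
# Import path taken from the updated app_utils/backend_utils.py; the
# model choice matches the README; the rest is illustrative.
from haystack import Document
from haystack_entailment_checker import EntailmentChecker

checker = EntailmentChecker(model_name_or_path="microsoft/deberta-v2-xlarge-mnli")
docs = [Document(content="Freddie Mercury was the lead singer of Queen.", score=0.9)]

result, _ = checker.run(query="Freddie Mercury fronted Queen.", documents=docs)
print(result["aggregate_entailment_info"])  # {'contradiction': ..., 'neutral': ..., 'entailment': ...}
```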
app_utils/backend_utils.py CHANGED
@@ -7,7 +7,7 @@ from haystack.nodes import EmbeddingRetriever, PromptNode
  from haystack.pipelines import Pipeline
  import streamlit as st

- from app_utils.entailment_checker import EntailmentChecker
+ from haystack_entailment_checker import EntailmentChecker
  from app_utils.config import (
      STATEMENTS_PATH,
      INDEX_DIR,
app_utils/entailment_checker.py DELETED
@@ -1,126 +0,0 @@
- from typing import List, Optional
-
- from transformers import AutoModelForSequenceClassification, AutoTokenizer, AutoConfig
- import torch
- from haystack.nodes.base import BaseComponent
- from haystack.modeling.utils import initialize_device_settings
- from haystack.schema import Document
-
-
- class EntailmentChecker(BaseComponent):
-     """
-     This node checks the entailment between every document content and the query.
-     It enriches the documents' metadata with entailment information.
-     It also returns aggregate entailment information.
-     """
-
-     outgoing_edges = 1
-
-     def __init__(
-         self,
-         model_name_or_path: str = "roberta-large-mnli",
-         model_version: Optional[str] = None,
-         tokenizer: Optional[str] = None,
-         use_gpu: bool = True,
-         batch_size: int = 16,
-         entailment_contradiction_threshold: float = 0.5,
-     ):
-         """
-         Load a Natural Language Inference model from Transformers.
-
-         :param model_name_or_path: Directory of a saved model or the name of a public model.
-             See https://huggingface.co/models for a full list of available models.
-         :param model_version: The version of the model to use from the Hugging Face model hub. Can be a tag name, branch name, or commit hash.
-         :param tokenizer: Name of the tokenizer (usually the same as the model)
-         :param use_gpu: Whether to use GPU (if available).
-         :param batch_size: Number of Documents to be processed at a time.
-         :param entailment_contradiction_threshold: if the first N documents give strong evidence of entailment/contradiction
-             (aggregate entailment or contradiction greater than the threshold), the less relevant documents are not taken into account
-         """
-         super().__init__()
-
-         self.devices, _ = initialize_device_settings(use_cuda=use_gpu, multi_gpu=False)
-
-         tokenizer = tokenizer or model_name_or_path
-         self.tokenizer = AutoTokenizer.from_pretrained(tokenizer)
-         self.model = AutoModelForSequenceClassification.from_pretrained(
-             pretrained_model_name_or_path=model_name_or_path, revision=model_version
-         )
-         self.batch_size = batch_size
-         self.entailment_contradiction_threshold = entailment_contradiction_threshold
-         self.model.to(str(self.devices[0]))
-
-         id2label = AutoConfig.from_pretrained(model_name_or_path).id2label
-         self.labels = [id2label[k].lower() for k in sorted(id2label)]
-         if "entailment" not in self.labels:
-             raise ValueError(
-                 "The model config must contain an entailment value in the id2label dict."
-             )
-
-     def run(self, query: str, documents: List[Document]):
-         scores, agg_con, agg_neu, agg_ent = 0, 0, 0, 0
-         premise_batch = [doc.content for doc in documents]
-         hypothesis_batch = [query] * len(documents)
-         entailment_info_batch = self.get_entailment_batch(
-             premise_batch=premise_batch, hypothesis_batch=hypothesis_batch
-         )
-         for i, (doc, entailment_info) in enumerate(zip(documents, entailment_info_batch)):
-             doc.meta["entailment_info"] = entailment_info
-
-             # aggregate the entailment scores, weighted by relevance
-             scores += doc.score
-             con, neu, ent = (
-                 entailment_info["contradiction"],
-                 entailment_info["neutral"],
-                 entailment_info["entailment"],
-             )
-             agg_con += con * doc.score
-             agg_neu += neu * doc.score
-             agg_ent += ent * doc.score
-
-             # if the first documents give strong evidence of entailment/contradiction,
-             # there is no need to consider the less relevant documents
-             if max(agg_con, agg_ent) / scores > self.entailment_contradiction_threshold:
-                 break
-
-         aggregate_entailment_info = {
-             "contradiction": round(agg_con / scores, 2),
-             "neutral": round(agg_neu / scores, 2),
-             "entailment": round(agg_ent / scores, 2),
-         }
-
-         entailment_checker_result = {
-             "documents": documents[: i + 1],
-             "aggregate_entailment_info": aggregate_entailment_info,
-         }
-
-         return entailment_checker_result, "output_1"
-
-     def run_batch(self, queries: List[str], documents: List[Document]):
-         entailment_checker_result_batch = []
-         # note: the premises must be the document contents, not the Document objects
-         entailment_info_batch = self.get_entailment_batch(
-             premise_batch=[doc.content for doc in documents], hypothesis_batch=queries
-         )
-         for doc, entailment_info in zip(documents, entailment_info_batch):
-             doc.meta["entailment_info"] = entailment_info
-             # with a single document, the weighted average reduces to its own scores
-             aggregate_entailment_info = {
-                 "contradiction": round(entailment_info["contradiction"], 2),
-                 "neutral": round(entailment_info["neutral"], 2),
-                 "entailment": round(entailment_info["entailment"], 2),
-             }
-             entailment_checker_result_batch.append(
-                 {
-                     "documents": [doc],
-                     "aggregate_entailment_info": aggregate_entailment_info,
-                 }
-             )
-         return entailment_checker_result_batch, "output_1"
-
-     def get_entailment_dict(self, probs):
-         entailment_dict = {k.lower(): v for k, v in zip(self.labels, probs)}
-         return entailment_dict
-
-     def get_entailment_batch(self, premise_batch: List[str], hypothesis_batch: List[str]):
-         # NLI input format: "<premise><sep_token><hypothesis>"
-         formatted_texts = [
-             f"{premise}{self.tokenizer.sep_token}{hypothesis}"
-             for premise, hypothesis in zip(premise_batch, hypothesis_batch)
-         ]
-         with torch.inference_mode():
-             inputs = self.tokenizer(
-                 formatted_texts, return_tensors="pt", padding=True, truncation=True
-             ).to(self.devices[0])
-             out = self.model(**inputs)
-             logits = out.logits
-             probs_batch = torch.nn.functional.softmax(logits, dim=-1).detach().cpu().numpy()
-         return [self.get_entailment_dict(probs) for probs in probs_batch]
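To make the aggregation and early stopping in `run` concrete, here is a tiny standalone re-implementation with invented relevance and NLI scores:

```python
# Worked example of the aggregation/early-stopping logic in run()
# (standalone re-implementation; the scores are invented for illustration).
passages = [  # (relevance score, entailment prob, contradiction prob)
    (0.90, 0.95, 0.01),
    (0.80, 0.70, 0.10),
    (0.10, 0.05, 0.90),  # never reached: the loop stops earlier
]
scores = agg_ent = agg_con = 0.0
for s, p_ent, p_con in passages:
    scores += s
    agg_ent += p_ent * s
    agg_con += p_con * s
    if max(agg_ent, agg_con) / scores > 0.5:  # threshold from the node above
        break
print(round(agg_ent / scores, 2))  # 0.95 -> strong entailment after one passage
```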
requirements.txt CHANGED
@@ -1,4 +1,5 @@
- farm-haystack[faiss]==1.16.1
+ farm-haystack[faiss,inference]==1.18.1
+ haystack-entailment-checker
  plotly==5.14.1

  # commented to not interfere with streamlit SDK in HF spaces
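If you only need the entailment-checking piece rather than the whole demo, the pins above suggest this minimal install (an assumption based on requirements.txt, not a documented quickstart):

```bash
# Assumption: pins taken from the updated requirements.txt above
pip install "farm-haystack[faiss,inference]==1.18.1" haystack-entailment-checker
```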