Pre-CoFactv3-Question-Answering

Model description

This is a question-answering model from the AAAI 2024 workshop paper “Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning”.

It takes a question and a context as input and outputs the answer extracted from the context. The model is microsoft/deberta-v3-large fine-tuned on the FACTIFY5WQA dataset.

For more details, see our paper or GitHub repository.

How to use?

Load the model and tokenizer with Hugging Face Transformers.

from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model = AutoModelForQuestionAnswering.from_pretrained("AndyChiang/Pre-CoFactv3-Question-Answering")
tokenizer = AutoTokenizer.from_pretrained("AndyChiang/Pre-CoFactv3-Question-Answering")

Create a pipeline.

QA = pipeline("question-answering", model=model, tokenizer=tokenizer)

Use the pipeline to answer a question from a given context.

QA_input = {
    'context': "Micah Richards spent an entire season at Aston Villa without playing a single game.",
    'question': "Who spent an entire season at Aston Villa without playing a single game?",
}
answer = QA(QA_input)
print(answer)
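
The question-answering pipeline returns a dictionary with the keys score, start, end, and answer. As a minimal sketch of how that output might be consumed, the helper below returns the answer only when the confidence is high enough. The name pick_answer, the 0.5 threshold, and the sample dictionary are all hypothetical; the dictionary is hand-written for illustration and is not real model output.

```python
from typing import Optional


def pick_answer(result: dict, min_score: float = 0.5) -> Optional[str]:
    """Return the answer text, or None if the score is below min_score.

    `result` is expected to look like the dict returned by a
    question-answering pipeline: score, start, end, answer.
    The threshold is an arbitrary illustrative value.
    """
    if result["score"] < min_score:
        return None
    return result["answer"].strip()


# Hand-written stand-in for a pipeline result (placeholder score, not measured).
illustrative_result = {"score": 0.9, "start": 0, "end": 14, "answer": "Micah Richards"}

print(pick_answer(illustrative_result))
```

In practice the dictionary would come from QA(QA_input), and the threshold would need to be tuned on validation data.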

Dataset

We use the FACTIFY5WQA dataset provided by the AAAI-24 workshop Factify 3.0.

This dataset is designed for fact verification, with the task of determining the veracity of a claim based on the given evidence.

claim: the statement to be verified.
evidence: the facts used to verify the claim.
question: the questions generated from the claim by the 5W framework (who, what, when, where, and why).
claim_answer: the answers derived from the claim.
evidence_answer: the answers derived from the evidence.
label: the veracity of the claim based on the given evidence, which is one of three categories: Support, Neutral, or Refute.
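
The fields above can be sketched as a small record type, together with one plausible way to turn a record into inputs for the question-answering pipeline. This is an illustration under assumptions: the class name, the choice of lists for the question and answer fields, the pairing of every question with both the claim and the evidence as context, and the example sentences are not taken from the dataset files.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Factify5WQARecord:
    """Hypothetical in-memory layout of one FACTIFY5WQA sample."""
    claim: str
    evidence: str
    question: List[str]
    claim_answer: List[str]
    evidence_answer: List[str]
    label: str  # "Support", "Neutral", or "Refute"


def build_qa_inputs(record: Factify5WQARecord) -> List[Dict[str, str]]:
    """Pair each question with the claim, then the evidence, as context."""
    inputs = []
    for question in record.question:
        inputs.append({"question": question, "context": record.claim})
        inputs.append({"question": question, "context": record.evidence})
    return inputs


# Made-up sample for illustration only; not copied from the dataset.
example = Factify5WQARecord(
    claim="Micah Richards spent an entire season at Aston Villa without playing a single game.",
    evidence="Micah Richards did not make an appearance for Aston Villa that season.",
    question=["Who spent an entire season at Aston Villa without playing a single game?"],
    claim_answer=["Micah Richards"],
    evidence_answer=["Micah Richards"],
    label="Support",
)

qa_inputs = build_qa_inputs(example)
print(len(qa_inputs))
```

Each dictionary in qa_inputs has the same question and context keys as the QA_input used in the usage section.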

	Training	Validation	Testing	Total
Support	3500	750	750	5000
Neutral	3500	750	750	5000
Refute	3500	750	750	5000
Total	10500	2250	2250	15000

Fine-tuning

Fine-tuning is performed with the Hugging Face Trainer API on the question-answering task.

Training hyperparameters

The following hyperparameters were used during training:

Pre-trained language model: microsoft/deberta-v3-large
Optimizer: Adam
Learning rate: 0.00001
Max length of input: 3200
Batch size: 4
Epochs: 3
Device: NVIDIA RTX A5000
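
As a rough sketch, the values listed above could be mapped onto Hugging Face TrainingArguments as follows. Several points are assumptions rather than the authors' actual script: the batch size is treated as per-device, the optimizer is left at the Trainer default, and the helper name and output directory argument are hypothetical. The maximum input length is kept separate because it is applied at tokenization time, not by the Trainer.

```python
# Values copied from the hyperparameter list; key names follow
# transformers.TrainingArguments.
TRAINING_KWARGS = {
    "learning_rate": 1e-5,             # Learning rate: 0.00001
    "per_device_train_batch_size": 4,  # assumes "Batch size" means per device
    "num_train_epochs": 3,
}

BASE_MODEL = "microsoft/deberta-v3-large"
MAX_INPUT_LENGTH = 3200  # applied when tokenizing, not passed to the Trainer


def make_training_arguments(output_dir: str):
    """Build TrainingArguments from the values above (hypothetical helper)."""
    # Imported lazily so the constants can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```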

Testing

We compute BLEU scores for the claim answers and the evidence answers separately and report the average of the two as the final metric.

Claim Answer	Evidence Answer	Average
0.5248	0.3963	0.4605
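
The averaging step can be written out as a short sketch. The function name is hypothetical, and the BLEU computation itself is not reproduced here because the exact BLEU variant is not specified in this card; the inputs are simply the two reported scores.

```python
def average_answer_bleu(claim_bleu: float, evidence_bleu: float) -> float:
    """Mean of the claim-answer and evidence-answer BLEU scores."""
    return (claim_bleu + evidence_bleu) / 2.0


# Scores taken from the results table.
average = average_answer_bleu(0.5248, 0.3963)
print(average)
```

With the reported scores this gives approximately 0.46055, which agrees with the Average column up to rounding.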

Other models

AndyChiang/Pre-CoFactv3-Text-Classification
