
Pre-CoFactv3-Text-Classification

Model description

This is a Text Classification model for AAAI 2024 Workshop Paper: “Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning”

Its inputs are a claim and its evidence, and its output is the predicted label, which falls into one of three categories: Support, Neutral, or Refute.

It is fine-tuned on the FACTIFY5WQA dataset, based on the microsoft/deberta-v3-large model.

For more details, see our paper or GitHub.

How to use?

  1. Download the model with Hugging Face Transformers.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("AndyChiang/Pre-CoFactv3-Text-Classification")
tokenizer = AutoTokenizer.from_pretrained("AndyChiang/Pre-CoFactv3-Text-Classification")
  2. Create a pipeline.
from transformers import pipeline

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
  3. Use the pipeline to predict the label.
label = classifier("Micah Richards spent an entire season at Aston Vila without playing a single game. [SEP] Despite speculation that Richards would leave Aston Villa before the transfer deadline for the 2018~19 season , he remained at the club , although he is not being considered for first team selection.")
print(label)
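
The pipeline returns a list with one dictionary per input. Depending on the model's id2label configuration, the label string may appear as Support, Neutral, or Refute, or as a generic LABEL_n; the score below is illustrative only, not a value you should expect exactly:

# Example output (illustrative values):
# [{'label': 'Support', 'score': 0.98}]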

Dataset

We use the FACTIFY5WQA dataset provided by the AAAI-24 workshop Factify 3.0.

This dataset is designed for fact verification, with the task of determining the veracity of a claim based on the given evidence.

  • claim: the statement to be verified.
  • evidence: the facts to verify the claim.
  • question: the questions generated from the claim by the 5W framework (who, what, when, where, and why).
  • claim_answer: the answers derived from the claim.
  • evidence_answer: the answers derived from the evidence.
  • label: the veracity of the claim based on the given evidence, which is one of three categories: Support, Neutral, or Refute.
Label      Training  Validation  Testing  Total
Support        3500         750      750   5000
Neutral        3500         750      750   5000
Refute         3500         750      750   5000
Total         10500        2250     2250  15000
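
As a minimal sketch (the exact preprocessing lives in our GitHub repository), a claim-evidence pair can be tokenized for this model as follows; joining with [SEP] mirrors the usage example above, and the 650-token limit matches the fine-tuning setting:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AndyChiang/Pre-CoFactv3-Text-Classification")

claim = "The statement to be verified."
evidence = "The facts used to verify the claim."

# Concatenate claim and evidence with [SEP], truncating to the
# 650-token limit used during fine-tuning.
inputs = tokenizer(
    claim + " [SEP] " + evidence,
    truncation=True,
    max_length=650,
    return_tensors="pt",
)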

Fine-tuning

Fine-tuning is conducted with the Hugging Face Trainer API on the Text Classification task.

Training hyperparameters

The following hyperparameters were used during training:

  • Pre-trained language model: microsoft/deberta-v3-large
  • Optimizer: Adam
  • Learning rate: 0.00001
  • Maximum input length: 650 tokens
  • Batch size: 4
  • Epochs: 12
  • Device: NVIDIA RTX A5000
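
A minimal Trainer sketch matching the hyperparameters above; the toy dataset below is a stand-in for the tokenized FACTIFY5WQA splits, not our actual preprocessing, and the argument names are the standard transformers ones rather than those of our training script:

from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-large")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-large", num_labels=3  # Support / Neutral / Refute
)

# Toy stand-in for the tokenized FACTIFY5WQA training split (not the real data)
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=650)

toy = Dataset.from_dict({
    "text": ["claim [SEP] evidence"] * 8,
    "label": [0, 1, 2, 0, 1, 2, 0, 1],
}).map(tokenize, batched=True)

training_args = TrainingArguments(
    output_dir="Pre-CoFactv3-Text-Classification",
    learning_rate=1e-5,             # 0.00001
    per_device_train_batch_size=4,  # batch size 4
    num_train_epochs=12,            # 12 epochs
)

trainer = Trainer(model=model, args=training_args, train_dataset=toy, tokenizer=tokenizer)
trainer.train()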

Testing

In the case of the Text Classification task, accuracy serves as the evaluation metric.

Accuracy: 0.8502
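
Accuracy here is the fraction of test pairs whose predicted label matches the gold label. A sketch with the evaluate library, using toy integer-encoded labels (the 0 = Support, 1 = Neutral, 2 = Refute mapping is an assumption for illustration, not necessarily the model's actual mapping):

import evaluate

accuracy = evaluate.load("accuracy")
# Toy integer-encoded predictions and gold labels, not real model outputs
result = accuracy.compute(predictions=[0, 1, 2, 0], references=[0, 1, 2, 1])
print(result)  # {'accuracy': 0.75}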

Other models

AndyChiang/Pre-CoFactv3-Question-Answering
