Flan-T5 Large for Claim Category Classification on the CHIME Dataset

This model is a fine-tuned version of google/flan-t5-large for classifying whether a given claim belongs to a specific category as defined in the CHIME paper.

Model description

The model is based on the Flan-T5 Large architecture and has been fine-tuned on a custom dataset for claim category classification. It takes a claim and a category as input and predicts whether the claim belongs to that category (1) or not (0).

Intended uses & limitations

This model is designed for binary classification of claims into categories. It can be used to determine if a given claim belongs to a specific category. The model's performance may vary depending on the domain and complexity of the claims and categories.

How to use

Here's how to use the model for prediction:

from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the fine-tuned model; the tokenizer is unchanged from the base checkpoint
model_name = "joe32140/flan-t5-large-claim-category"
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Prepare the input
claim = "Clodronate treatment in patients with breast cancer reduces the incidence and number of new bony and visceral metastases."
category = "Effectiveness of bone agents"
prefix = "Please answer this question: Does the claim belong to the category?"
input_text = f"{prefix} Claim: {claim} Category: {category}"

# Tokenize and generate prediction
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=8)
prediction = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Convert prediction to integer; the fine-tuned model emits "0" or "1"
result = int(prediction.strip())

print(f"Claim: {claim}")
print(f"Category: {category}")
print(f"Belongs to category: {result}")
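In practice a claim is often checked against several candidate categories at once. The helpers below sketch one way to batch this, assuming the same prompt format as above; `classify_categories` and the fallback-to-0 parsing are illustrative choices, not part of the released model.

```python
PREFIX = "Please answer this question: Does the claim belong to the category?"

def build_input(claim: str, category: str) -> str:
    # Mirrors the single-example prompt format shown above.
    return f"{PREFIX} Claim: {claim} Category: {category}"

def parse_label(text: str) -> int:
    # The fine-tuned model is expected to emit "0" or "1";
    # anything else is treated here as "not in category" (an assumption).
    text = text.strip()
    return int(text) if text in {"0", "1"} else 0

def classify_categories(model, tokenizer, claim, categories):
    # Score one claim against several candidate categories in a single batch.
    inputs = tokenizer(
        [build_input(claim, c) for c in categories],
        return_tensors="pt",
        padding=True,
    )
    outputs = model.generate(**inputs, max_length=8)
    decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    return {c: parse_label(d) for c, d in zip(categories, decoded)}
```

With the model and tokenizer loaded as above, `classify_categories(model, tokenizer, claim, ["Effectiveness of bone agents", "Adverse events"])` returns a dict mapping each category to 0 or 1.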

Training data

The model was trained on a custom dataset containing claims and categories. The dataset is publicly available at CHIME: claim_category.

Training procedure

The model was fine-tuned using the following hyperparameters:

  • Learning rate: 3e-4
  • Batch size: 16
  • Number of epochs: 2

Fine-tuning was done using the Seq2SeqTrainer from the Transformers library.
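As a configuration sketch, the hyperparameters above map onto Seq2SeqTrainingArguments roughly as follows; the output directory and the predict_with_generate flag are assumptions, not reported settings.

```python
from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer

# Hyperparameters reported above; output_dir is a hypothetical path.
training_args = Seq2SeqTrainingArguments(
    output_dir="./flan-t5-claim-category",
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    num_train_epochs=2,
    predict_with_generate=True,  # decode generated ids during evaluation (assumed)
)

# The trainer would then be constructed with the model, tokenizer,
# and the claim/category dataset:
# trainer = Seq2SeqTrainer(model=model, args=training_args,
#                          train_dataset=..., tokenizer=tokenizer)
# trainer.train()
```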

Limitations and bias

As with any machine learning model, this model may have biases present in the training data. Users should be aware of potential biases and evaluate the model's performance on their specific use case.

Model size

783M parameters (F32, stored as Safetensors).