
T5-Base Fine-Tuned Model for Question Answering

This repository hosts a fine-tuned version of the T5-Base model, optimized for question answering on the SQuAD dataset. The model is designed to answer questions efficiently while maintaining high accuracy.

Model Details

  • Model Architecture: T5-Base
  • Task: Question Answering (QA-Chatbot)
  • Dataset: SQuAD
  • Quantization: FP16
  • Fine-tuning Framework: Hugging Face Transformers (a representative setup is sketched below)
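
The exact training configuration for this checkpoint is not published. As a rough illustration, here is a minimal sketch of fine-tuning T5-Base on SQuAD with the Transformers Seq2SeqTrainer (it additionally requires the datasets package); every hyperparameter shown is an assumption, not the value used for this model.

# Hypothetical fine-tuning sketch; hyperparameters are illustrative only.
from datasets import load_dataset
from transformers import (
    T5Tokenizer,
    T5ForConditionalGeneration,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def preprocess(example):
    # T5 is text-to-text: the input is "question: ... context: ..."
    # and the target is the first gold answer string.
    model_inputs = tokenizer(
        f"question: {example['question']} context: {example['context']}",
        truncation=True,
        max_length=512,
    )
    labels = tokenizer(example["answers"]["text"][0], truncation=True, max_length=64)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

squad = load_dataset("squad")
tokenized = squad.map(preprocess, remove_columns=squad["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-qa-chatbot",
    per_device_train_batch_size=8,  # assumed value
    num_train_epochs=3,             # assumed value
    learning_rate=3e-4,             # assumed value
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()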

🚀 Usage

Installation

pip install transformers torch

Loading the Model

from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "AventIQ-AI/t5-qa-chatbot"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
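
Because the published weights are stored in FP16, you can optionally load them directly in half precision to reduce memory use. torch_dtype is a standard from_pretrained argument; FP16 inference is generally only worthwhile on a CUDA device.

# Optional variant: load the checkpoint in half precision (FP16).
model = T5ForConditionalGeneration.from_pretrained(
    model_name, torch_dtype=torch.float16
).to(device)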

Chatbot Inference

def answer_question(question, context):
    input_text = f"question: {question} context: {context}"
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding="max_length", max_length=512)
    
    # Move input tensors to the same device as the model
    inputs = {key: value.to(device) for key, value in inputs.items()}  
    
    # Generate answer
    with torch.no_grad():
        output = model.generate(**inputs, max_length=150)

    # Decode and return answer
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Test Case
question = "What is overfitting in machine learning?"
context = "Overfitting occurs when a model learns the training data too well, capturing noise instead of actual patterns.
predicted_answer = answer_question(question, context)
print(f"Predicted Answer: {predicted_answer}")

⚡ Quantization Details

Post-training quantization was applied with PyTorch by converting the model weights to half precision (Float16/FP16), reducing model size and improving inference efficiency with only a minor impact on accuracy.
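
The conversion script itself is not included in this repository. As a minimal sketch, FP16 post-training conversion in PyTorch amounts to casting the weights to half precision and re-saving them; the paths below are hypothetical.

from transformers import T5ForConditionalGeneration

# Sketch of FP16 post-training conversion; paths are hypothetical.
model = T5ForConditionalGeneration.from_pretrained("path/to/finetuned-t5-base")
model = model.half()  # cast all floating-point weights to torch.float16
model.save_pretrained("path/to/finetuned-t5-base-fp16")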

📂 Repository Structure

.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Quantized model weights
└── README.md            # Model documentation

⚠️ Limitations

  • The model may struggle with highly ambiguous questions or contexts.
  • Quantization may lead to slight degradation in accuracy compared to full-precision models.
  • Performance may vary across different writing styles and sentence structures.

🤝 Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.
