---
license: mit
datasets:
- ehovy/race
language:
- en
metrics:
- bleu
base_model:
- google-t5/t5-base
pipeline_tag: text2text-generation
library_name: transformers
tags:
- distractor-generation
- education
- mcq-questions
---

# Distractor Generation with T5-base

This repository contains a **T5-base** model fine-tuned for distractor generation. Leveraging T5's text-to-text framework and a custom separator token, the model generates three plausible distractors for multiple-choice questions by conditioning on a given question, context, and correct answer.

## Model Overview

Built with [PyTorch Lightning](https://www.pytorchlightning.ai/), this implementation fine-tunes the pre-trained **T5-base** model to generate distractor options. The model takes a single input sequence, formatted as the question, context, and correct answer separated by a custom token, and generates a target sequence containing three distractors. This approach is particularly useful for multiple-choice question generation tasks.

## Data Processing

### Input Construction

Each input sample is a single string with the following format:

```
question {SEP_TOKEN} context {SEP_TOKEN} correct
```

- **question:** The question text.
- **context:** The context passage.
- **correct:** The correct answer.
- **SEP_TOKEN:** A special token added to the tokenizer to separate the different fields.

### Target Construction

Each target sample is constructed as follows:

```
incorrect1 {SEP_TOKEN} incorrect2 {SEP_TOKEN} incorrect3
```

This format allows the model to generate three distractors in a single pass.
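The construction above can be sketched in a few lines of Python. Note that the literal separator string `"<sep>"` below is an assumption for illustration; the actual value is whatever token was registered with the tokenizer during fine-tuning.

```python
# Sketch of the input/target construction described above.
# "<sep>" is a placeholder; use the separator token added at training time.
SEP_TOKEN = "<sep>"

def build_input(question: str, context: str, correct: str) -> str:
    """Concatenate question, context, and correct answer into one source string."""
    return f"{question} {SEP_TOKEN} {context} {SEP_TOKEN} {correct}"

def build_target(distractors: list) -> str:
    """Join the three incorrect options into one target string."""
    return f" {SEP_TOKEN} ".join(distractors)

example_input = build_input(
    "What is the capital of France?",
    "France is a country in Western Europe.",
    "Paris",
)
example_target = build_target(["Lyon", "Marseille", "Nice"])
```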
## Training Details

- **Framework:** PyTorch Lightning
- **Base Model:** T5-base
- **Optimizer:** Adam with a linear schedule and warmup
- **Batch Size:** 32
- **Number of Epochs:** 5
- **Learning Rate:** 2e-5
- **Tokenization:**
  - **Input:** Maximum length of 512 tokens
  - **Target:** Maximum length of 64 tokens
- **Special Tokens:** The custom `SEP_TOKEN` is added to the tokenizer and separates the different parts of the input and target sequences.

## Evaluation Metrics

The model is evaluated using BLEU scores for each generated distractor. Below are the BLEU scores obtained on the test set:

| Distractor   | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 |
|--------------|--------|--------|--------|--------|
| Distractor 1 | 29.59  | 21.55  | 17.86  | 15.75  |
| Distractor 2 | 25.21  | 16.81  | 13.00  | 10.78  |
| Distractor 3 | 23.99  | 15.78  | 12.35  | 10.52  |

These scores indicate substantial n-gram overlap between the generated and reference distractors, with quality decreasing from the first distractor to the third.

## How to Use

You can use this model with Hugging Face's Transformers library as follows:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "fares7elsadek/t5-base-distractor-generation"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# The custom separator token added during fine-tuning. If it was registered
# as an additional special token, it can be recovered from the tokenizer;
# otherwise, set it to the exact string used at training time. Note that if
# the separator is a special token, decode with skip_special_tokens=False
# below so it survives for splitting.
SEP_TOKEN = tokenizer.additional_special_tokens[0]

def generate_distractors(question, context, correct, max_length=64):
    input_text = f"{question} {SEP_TOKEN} {context} {SEP_TOKEN} {correct}"
    inputs = tokenizer(
        [input_text], return_tensors="pt", truncation=True, padding=True
    )
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
    )
    decoded = tokenizer.decode(
        outputs[0], skip_special_tokens=True, clean_up_tokenization_spaces=True
    )
    distractors = [d.strip() for d in decoded.split(SEP_TOKEN)]
    return distractors

# Example usage:
question = "What is the capital of France?"
context = "France is a country in Western Europe known for its rich history and cultural heritage."
correct = "Paris"

print(generate_distractors(question, context, correct))
```
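For reference, the BLEU-n metric reported in the table above can be sketched as follows. This is a generic, self-contained illustration of sentence-level BLEU with uniform weights and no smoothing, not the exact evaluation script used to produce the reported numbers.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(reference, hypothesis, max_n=4):
    """Illustrative sentence-level BLEU-max_n (uniform weights, no smoothing)."""
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clipped n-gram matches, as in standard BLEU.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty for hypotheses shorter than the reference.
    if len(hypothesis) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / max(len(hypothesis), 1))
    return bp * math.exp(log_avg)
```

In the reported results, each generated distractor is compared against its corresponding reference distractor, which is why the table lists separate scores per distractor slot.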