---
library_name: transformers
tags: []
---
# bert_squad: `google-bert/bert-base-uncased` Fine-Tuned on SQuAD
An extractive question-answering model fine-tuned on the SQuAD dataset. Built on the BERT architecture, it extracts answer spans from a passage in response to a question.
### Model Description
<!-- Provide a longer summary of what this model is. -->
bert_squad is a transformer-based model trained for context-based question answering tasks. It leverages the pretrained BERT architecture and adapts it for extracting precise answers given a question and a related context. This model uses the Stanford Question Answering Dataset (SQuAD), available via Hugging Face datasets, for training and fine-tuning.
The model was trained using free computational resources, demonstrating its accessibility for educational and small-scale research purposes.
- **Fine-tuned by:** Sadat Parvej, Rafifa Binte Jahir
- **Shared by:** Sadat Parvej
- **Language(s) (NLP):** English
- **Fine-tuned from model:** [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)
## Training Objective
The model predicts the most relevant span of text in a given passage that answers a specific question. It fine-tunes BERT's ability to analyze context using supervised data from SQuAD.
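The card does not spell out the loss, but the conventional SQuAD fine-tuning objective scores every token as a candidate answer start and end, and minimizes the cross-entropy of the gold span endpoints. Here $h_i$ is the final-layer representation of token $i$, $w_s$ and $w_e$ are learned projection vectors, and $(y_s, y_e)$ is the gold answer span:

```latex
p_{\mathrm{start}}(i) = \frac{\exp(w_s^\top h_i)}{\sum_j \exp(w_s^\top h_j)},
\qquad
p_{\mathrm{end}}(i) = \frac{\exp(w_e^\top h_i)}{\sum_j \exp(w_e^\top h_j)}

\mathcal{L} = -\tfrac{1}{2}\left[ \log p_{\mathrm{start}}(y_s) + \log p_{\mathrm{end}}(y_e) \right]
```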
### Performance Benchmarks
Final values at step 2,000 (see the full table under Training Details):
- Training Loss: 0.477800
- Validation Loss: 0.465936
- Exact Match (EM): 87.568590%
## Intended Uses & Limitations
This model is designed for tasks such as:
- Extractive question answering
- Reading-comprehension applications

**Known limitations:**
- BERT is pretrained as a masked language model (MLM), so this model is not suited to generative tasks or to queries outside the SQuAD-style extractive QA setup.
- Predictions may be biased toward, or overly reliant on, the training data: SQuAD consists of structured, fact-based question-answer pairs drawn from Wikipedia.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
# Load the model and tokenizer
model_name = "Sadat07/bert_squad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
context = """
The person who invented light was
Thomas Edison.He was born in 1879.
"""
question = "When did Thomas Edison invent?"
inputs = tokenizer(question, context, return_tensors="pt", truncation=True, max_length=512)
input_ids = inputs["input_ids"].to(device)
attention_mask = inputs["attention_mask"].to(device)
print("Tokenized Input:", tokenizer.decode(input_ids[0]))
# Perform inference
with torch.no_grad():
    outputs = model(input_ids=input_ids, attention_mask=attention_mask)
    start_scores = outputs.start_logits
    end_scores = outputs.end_logits

# Logits
print("Start logits:", start_scores)
print("End logits:", end_scores)

# Get the most likely start and end token indices
start_idx = torch.argmax(start_scores)
end_idx = torch.argmax(end_scores) + 1

# Decode the answer
if start_idx >= end_idx:
    print("Model did not predict a valid answer. Please check context and question.")
else:
    answer = tokenizer.convert_tokens_to_string(
        tokenizer.convert_ids_to_tokens(input_ids[0][start_idx:end_idx])
    )
    print(f"Question: {question}")
    print(f"Answer: {answer}")
```
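For quick experiments, the same checkpoint can also be driven through the higher-level `pipeline` API, which bundles the tokenization, inference, and span decoding shown above (the repository name is taken from the snippet above; the printed answer is what one would expect, not a verified output):

```python
from transformers import pipeline

# The QA pipeline wraps tokenizer + model + answer-span decoding
qa = pipeline("question-answering", model="Sadat07/bert_squad")

result = qa(
    question="When did Thomas Edison invent the light bulb?",
    context="Thomas Edison invented the practical incandescent light bulb in 1879.",
)
print(result["answer"])  # expected: "1879"
```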
## Training Details
| Step | Training Loss | Validation Loss | Exact Match | Squad F1 | Start Accuracy | End Accuracy |
|------|---------------|-----------------|-------------|----------|----------------|--------------|
| 100 | 0.632200 | 0.811809 | 84.749290 | 84.749290| 0.847493 | 0.899243 |
| 200 | 0.751500 | 0.627198 | 84.768212 | 84.768212| 0.847682 | 0.899243 |
| 300 | 0.662600 | 0.557515 | 86.244087 | 86.244087| 0.862441 | 0.899243 |
| 400 | 0.600400 | 0.567693 | 86.177862 | 86.177862| 0.861779 | 0.899243 |
| 500 | 0.613200 | 0.523546 | 86.499527 | 86.499527| 0.864995 | 0.899243 |
| 600 | 0.495200 | 0.539225 | 86.565752 | 86.565752| 0.865658 | 0.899243 |
| 700 | 0.645300 | 0.552358 | 85.354778 | 85.354778| 0.853548 | 0.899243 |
| 800 | 0.499100 | 0.562317 | 86.338694 | 86.338694| 0.863387 | 0.899243 |
| 900 | 0.482800 | 0.499747 | 86.811731 | 86.811731| 0.868117 | 0.899243 |
| 1000 | 0.372800 | 0.543513 | 86.972564 | 86.972564| 0.869726 | 0.900000 |
| 1100 | 0.554000 | 0.502747 | 85.969726 | 85.969726| 0.859697 | 0.894797 |
| 1200 | 0.459800 | 0.484941 | 87.019868 | 87.019868| 0.870199 | 0.900662 |
| 1300 | 0.463600 | 0.477527 | 87.407758 | 87.407758| 0.874078 | 0.899905 |
| 1400 | 0.356800 | 0.499119 | 87.549669 | 87.549669| 0.875497 | 0.901608 |
| 1500 | 0.494200 | 0.485287 | 87.549669 | 87.549669| 0.875497 | 0.901703 |
| 1600 | 0.521100 | 0.466062 | 87.284768 | 87.284768| 0.872848 | 0.899243 |
| 1700 | 0.461200 | 0.462704 | 87.540208 | 87.540208| 0.875402 | 0.901419 |
| 1800 | 0.415700 | 0.474295 | 87.691580 | 87.691580| 0.876916 | 0.901892 |
| 1900 | 0.622900 | 0.462900 | 87.417219 | 87.417219| 0.874172 | 0.901987 |
| 2000 | 0.477800 | 0.465936 | 87.568590 | 87.568590| 0.875686 | 0.901892 |
### Training Data
The model was trained on the [SQuAD](https://huggingface.co/datasets/squad) dataset, a widely used benchmark for context-based question-answering tasks. It consists of passages from Wikipedia and corresponding questions, with human-annotated answers.
During training, the dataset was processed to extract contexts, questions, and answers, ensuring compatibility with the BERT architecture for QA. The training utilized free resources to minimize costs and focus on model efficiency.
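The exact preprocessing code is not included in this card; a typical SQuAD preparation step looks like the sketch below, which tokenizes each (question, context) pair and converts the character-level answer span into start/end token indices (function and variable names are illustrative):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
squad = load_dataset("squad")

def add_token_positions(example):
    # Tokenize question + context together; offsets map tokens back to characters
    enc = tokenizer(
        example["question"],
        example["context"],
        truncation=True,
        max_length=512,
        return_offsets_mapping=True,
    )
    start_char = example["answers"]["answer_start"][0]
    end_char = start_char + len(example["answers"]["text"][0])

    # Locate the tokens whose character spans cover the gold answer;
    # answers lost to truncation are left at (0, 0) in this sketch
    seq_ids = enc.sequence_ids()
    start_tok = end_tok = 0
    for i, (s, e) in enumerate(enc["offset_mapping"]):
        if seq_ids[i] != 1:          # skip question and special tokens
            continue
        if s <= start_char < e:
            start_tok = i
        if s < end_char <= e:
            end_tok = i

    enc["start_positions"] = start_tok
    enc["end_positions"] = end_tok
    enc.pop("offset_mapping")        # not needed by the model
    return enc

train_dataset = squad["train"].map(add_token_positions)
```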
### Training Procedure
**Training Objective**
The model was trained to perform context-based question answering on the SQuAD dataset. Fine-tuning adapts the BERT encoder, originally pretrained with the masked language modeling (MLM) objective, to QA by leveraging its ability to encode the contextual relationships between passage, question, and answer.
**Optimization**
The training utilized the AdamW optimizer with a linear learning rate scheduler and warm-up steps to ensure effective weight updates and prevent overfitting. The training was run for 2000 steps, with early stopping applied based on the validation loss and exact match score.
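A training loop matching that description might look like the sketch below; the learning rate, weight decay, and warm-up fraction shown are illustrative placeholders, not the values actually used:

```python
import torch
from transformers import AutoModelForQuestionAnswering, get_linear_schedule_with_warmup

model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-uncased")

num_training_steps = 2000            # matches the step count in the table above
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5, weight_decay=0.01)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * num_training_steps),   # illustrative 10% warm-up
    num_training_steps=num_training_steps,
)

# `train_dataloader` would yield dict batches with input_ids, attention_mask,
# start_positions, and end_positions (see the preprocessing sketch above)
for step, batch in enumerate(train_dataloader, start=1):
    outputs = model(**batch)         # returns a loss when gold positions are given
    outputs.loss.backward()
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
    if step >= num_training_steps:
        break
```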
**Hardware and Resources**
Training was conducted on free resources, such as Google Colab or equivalent free GPU resources. While this limited the scale, adjustments in batch size and learning rate were optimized to make the training efficient within these constraints.
**Unique Features**
The model fine-tuning procedure emphasizes efficient learning, leveraging BERT's pre-trained knowledge while adapting it specifically to QA tasks in a resource-constrained environment.
#### Metrics
Performance was evaluated using the following metrics; a sketch of the EM and F1 computations appears after the list:
- **Exact Match (EM)**: Measures the percentage of predictions that match the ground-truth answers exactly.
- **F1 Score**: Assesses the overlap between the predicted and true answers at a token level, balancing precision and recall.
- **Start and End Accuracy**: Tracks the model's ability to correctly identify the start and end indices of answers within the context.
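EM and token-level F1 follow the standard SQuAD evaluation definitions; a minimal re-implementation for reference:

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace (SQuAD-style)."""
    text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, truth: str) -> bool:
    return normalize(prediction) == normalize(truth)

def f1_score(prediction: str, truth: str) -> float:
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    common = Counter(pred_tokens) & Counter(true_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("1879", "1879"))            # True
print(round(f1_score("in 1879", "1879"), 2))  # 0.67: partial token overlap
```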
### Results
The model achieved the following key performance metrics on the SQuAD validation set:
- **Exact Match (EM):** up to 87.69%
- **F1 Score:** up to 87.69%
- **Validation Loss:** reduced to 0.46
- **Start Accuracy:** peaked at 87.69%
- **End Accuracy:** peaked at 90.20%
#### Summary
The model, **bert_squad**, was fine-tuned for context-based question answering using the SQuAD dataset from Hugging Face. Key metrics include an Exact Match (EM) and F1 score of up to **87.69%**, demonstrating strong accuracy. Performance benchmarks show consistent improvement in loss and accuracy over 2,000 steps, with validation loss reaching as low as **0.46**.
The training utilized free resources, leveraging BERT's robust pretraining, although BERT's limitation as a masked language model (MLM) remains a consideration. This work highlights the potential for effective question-answering systems built on pre-existing datasets and infrastructure.
### Model Architecture and Objective
The model uses BERT, a pre-trained Transformer-based architecture, fine-tuned for context-based question answering. Given a question and a context passage, it predicts the start and end token positions of the answer span within the context.
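Concretely, QA-style BERT models add one linear layer on top of the encoder that maps each token's hidden state to a (start, end) logit pair. A minimal sketch of that head, mirroring the structure of `AutoModelForQuestionAnswering` checkpoints rather than this repo's exact code:

```python
import torch.nn as nn
from transformers import AutoModel

class BertForSpanQA(nn.Module):
    """BERT encoder plus a per-token start/end classifier (illustrative sketch)."""

    def __init__(self, base: str = "google-bert/bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(base)
        # One logit pair per token: column 0 = start score, column 1 = end score
        self.qa_outputs = nn.Linear(self.encoder.config.hidden_size, 2)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(
            input_ids, attention_mask=attention_mask
        ).last_hidden_state                       # (batch, seq_len, hidden)
        logits = self.qa_outputs(hidden)          # (batch, seq_len, 2)
        start_logits, end_logits = logits.split(1, dim=-1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)
```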
### Compute Infrastructure
#### Hardware
- GPU: Tesla P100, NVIDIA T4

#### Software
- Framework: Hugging Face Transformers
- Dataset: SQuAD (from Hugging Face)
- Other tools: Python, PyTorch
## Citation
**BibTeX:**
```bibtex
@misc{bert_squad_finetune,
  title  = {BERT Fine-tuned for SQuAD},
  author = {Sadat Parvej and Rafifa Binte Jahir},
  year   = {2024},
  url    = {https://huggingface.co/Sadat07/bert_squad}
}
```
## Glossary
- **Exact Match (EM):** A metric measuring the percentage of predictions that match the ground truth exactly.
- **Masked Language Model (MLM):** BERT's pre-training objective: predicting masked words in input sentences, e.g., filling in the "[MASK]" token in "Paris is the capital of [MASK]."