bert-finetuned-mrpc: BERT Fine-tuned for Sequence Classification

This model is a fine-tuned version of bert-base-uncased for sequence classification, trained on the MRPC paraphrase detection task.

Dataset

Name: MRPC (Microsoft Research Paraphrase Corpus)
Description: The MRPC dataset consists of sentence pairs automatically extracted from online news sources, with human annotations for whether the sentences in the pair are semantically equivalent.
Source: The dataset is part of the GLUE benchmark.

Model description

This model is a fine-tuned version of BERT-base-uncased, specifically trained to determine if two sentences are paraphrases of each other. The model outputs 1 if the sentences are equivalent and 0 if they are not.
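The 0/1 output described above could be obtained roughly as follows. This is a minimal sketch, assuming the checkpoint is loaded through the Hugging Face transformers library with PyTorch. The repository id `your-username/bert-finetuned-mrpc` is a placeholder, not the model's actual location, and both helper names are illustrative.

```python
import math


def logits_to_prediction(logits):
    """Turn the two raw class scores into (label, probability of that label)."""
    # Softmax with max subtraction for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    label = probs.index(max(probs))  # 1 = paraphrase, 0 = not a paraphrase
    return label, probs[label]


def is_paraphrase(sentence1, sentence2,
                  model_id="your-username/bert-finetuned-mrpc"):
    """Classify one sentence pair. model_id is a placeholder (assumption)."""
    # Imported lazily so the helper above works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Passing both sentences encodes them as one [CLS] s1 [SEP] s2 [SEP] input.
    inputs = tokenizer(sentence1, sentence2, truncation=True,
                       max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return logits_to_prediction(logits)
```

Calling `is_paraphrase("The cat sat.", "A cat was sitting.")` with a real repository id would return the predicted label together with its softmax probability.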

Model architecture: BertForSequenceClassification
Task: sequence-classification
Training dataset: GLUE MRPC
Number of parameters: 109,483,778
Maximum sequence length: 512
Vocab size: 30522
Hidden size: 768
Number of attention heads: 12
Number of hidden layers: 12
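The parameter count can be reproduced from the listed dimensions. The sketch below assumes the standard bert-base-uncased values for two settings the card does not list (a feed-forward size of 3072 and 2 token type embeddings) and a two-label classification head. The function name is illustrative.

```python
def count_bert_classifier_params(vocab_size=30522, hidden=768, layers=12,
                                 max_positions=512, type_vocab=2,
                                 intermediate=3072, num_labels=2):
    """Count parameters of a BERT encoder with a linear classification head."""
    layer_norm = 2 * hidden  # scale and bias

    def linear(n_in, n_out):
        return n_in * n_out + n_out  # weights plus bias

    # Word, position and token type embeddings share one LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections, followed by a LayerNorm.
    attention = 4 * linear(hidden, hidden) + layer_norm
    # Two feed-forward projections, followed by a LayerNorm.
    feed_forward = (linear(hidden, intermediate)
                    + linear(intermediate, hidden) + layer_norm)
    encoder = layers * (attention + feed_forward)
    pooler = linear(hidden, hidden)
    classifier = linear(hidden, num_labels)
    return embeddings + encoder + pooler + classifier


print(count_bert_classifier_params())  # prints 109483778
```

The number of attention heads does not appear in the calculation because the heads split the 768-dimensional projections rather than adding to them.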

Intended Uses & Limitations

Intended Uses

Paraphrase Detection: This model can be used to determine if two sentences are paraphrases of each other, which is useful in applications like duplicate question detection in forums, semantic search, and text summarization.
Educational Purposes: Can be used for educational purposes to demonstrate fine-tuning of transformer models on specific tasks.

Limitations

Dataset Bias: The MRPC dataset contains sentence pairs from specific news sources, which might introduce bias. The model might not perform well on text from other domains.
Context Limitations: The model evaluates sentences pairwise without considering broader context, which might lead to incorrect paraphrase detections in complex contexts.

Training procedure

Optimizer: AdamW
Learning Rate: 5e-5
Epochs: 3
Batch Size: 8
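As a rough illustration of what these settings imply, the sketch below counts the optimizer updates in a run and shows one way the values might map onto `transformers.TrainingArguments`. It assumes a single device, no gradient accumulation, and the 3,668-pair GLUE MRPC training split. The helper names and the output directory are illustrative, and the card does not say whether the Trainer API was actually used.

```python
import math

# Hyperparameters as listed in this card.
HYPERPARAMS = {
    "learning_rate": 5e-5,
    "num_train_epochs": 3,
    "per_device_train_batch_size": 8,
}


def total_optimizer_steps(num_examples, batch_size, epochs):
    """Optimizer updates for a run on one device without gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


def build_training_arguments(output_dir="bert-finetuned-mrpc"):
    """Map the listed values onto transformers.TrainingArguments (assumed setup)."""
    # Imported lazily; requires the transformers package.
    from transformers import TrainingArguments

    # Trainer uses an AdamW optimizer by default, so none is set explicitly.
    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


# The GLUE MRPC training split has 3,668 sentence pairs.
steps = total_optimizer_steps(
    3668,
    HYPERPARAMS["per_device_train_batch_size"],
    HYPERPARAMS["num_train_epochs"],
)
print(steps)  # prints 1377
```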

Evaluation results

Accuracy: 0.8505
F1: 0.8943
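For reference, the two reported metrics can be computed from predicted and gold labels as sketched below. This is a plain-Python illustration that treats label 1 (paraphrase) as the positive class for F1. The function name is illustrative and the labels in the usage line are toy values, not model outputs.

```python
def accuracy_and_f1(predictions, references, positive_label=1):
    """Compute accuracy and binary F1 for the positive (paraphrase) class."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("need two non-empty label lists of equal length")
    pairs = list(zip(predictions, references))
    correct = sum(p == r for p, r in pairs)
    tp = sum(p == positive_label and r == positive_label for p, r in pairs)
    fp = sum(p == positive_label and r != positive_label for p, r in pairs)
    fn = sum(p != positive_label and r == positive_label for p, r in pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": correct / len(pairs), "f1": f1}


# Toy labels for illustration only.
print(accuracy_and_f1([1, 1, 0, 1, 0], [1, 0, 0, 1, 1]))
```

F1 is the harmonic mean of precision and recall, so on a dataset where most pairs are labeled as paraphrases it can sit above accuracy, as it does in the reported results.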