rushilJariwala committed
Commit 81b39cb
1 Parent(s): 8445e07

update README.md

Files changed (1):
  1. README.md +92 -3
README.md CHANGED
@@ -1,3 +1,92 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ tags:
+ - code
+ ---
+
+ # Model Card for Bert-base-cased Paraphrase Classification
+
+ ## Model Details
+
+ ### Model Description
+
+ The **bert-base-cased-paraphrase-classification** model is a fine-tuned version of BERT (Bidirectional Encoder Representations from Transformers) for paraphrase classification: given two input sentences, it predicts whether they are paraphrases of each other. It uses the cased variant of BERT (`bert-base-cased`) as the base model.
+
+ - **Developed by:** Rushil Jariwala
+ - **Model type:** Transformer-based neural network
+ - **Language(s) (NLP):** English
+ - **License:** Apache 2.0
+ - **Finetuned from model:** bert-base-cased
+
+ ### Model Sources
+
+ - **Repository:** [Hugging Face Model Hub](https://huggingface.co/rushilJariwala/bert-base-cased-paraphrase-classification)
+
+ ## Uses
+
+ ### Direct Use
+
+ The model can be used as-is to classify whether two input sentences are paraphrases of each other.
+
+ ### Downstream Use
+
+ When fine-tuned on a more specific task or integrated into a larger application, the model can support any task that requires paraphrase identification, such as duplicate-question detection.
+
+ ### Out-of-Scope Use
+
+ The model may not perform well on sentences containing highly domain-specific vocabulary that was not seen during training, and it is limited to English.
+
+ ## Bias, Risks, and Limitations
+
+ The model's performance may vary with how closely the input sentences resemble the training data, and it may reproduce biases present in the dataset it was trained on.
+
+ ### Recommendations
+
+ Users should consider domain-specific fine-tuning for optimal performance in their application. Careful evaluation and validation are recommended before use in critical applications.
+
+ ## How to Get Started with the Model
+
+ Use the following Python code to get started with the model. Because the task is sentence-pair classification, the two sentences are passed to the pipeline together as a pair:
+
+ ```python
+ from transformers import pipeline
+
+ pipe = pipeline("text-classification", model="rushilJariwala/bert-base-cased-paraphrase-classification")
+
+ # Pass the two sentences as a single pair so the model scores them jointly;
+ # passing a plain list would classify each sentence on its own.
+ result = pipe({
+     "text": "I've been waiting for a HuggingFace course my whole life.",
+     "text_pair": "This course is amazing!",
+ })
+ print(result)
+ ```
+
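+ The pipeline returns a label and a confidence score for the pair. Unless the model's `id2label` mapping was customized during fine-tuning, the labels may appear as the generic `LABEL_0` / `LABEL_1` (not paraphrase / paraphrase); check the model's config to confirm.
+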
+ ## Training Details
+
+ ### Training Procedure
+
+ #### Preprocessing
+
+ The text was tokenized using BERT's cased tokenizer with truncation and padding.
+
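+ As an illustration, a minimal tokenization step consistent with that description might look as follows. This is a sketch, not the exact training code; the `max_length` value and the example sentence pair are assumptions.
+
+ ```python
+ from transformers import AutoTokenizer
+
+ # BERT's cased tokenizer, as described above.
+ tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
+
+ # Encode a sentence pair with truncation and padding.
+ encoded = tokenizer(
+     "I've been waiting for a HuggingFace course my whole life.",
+     "This course is amazing!",
+     truncation=True,
+     padding="max_length",
+     max_length=128,  # assumed; the card does not state the maximum length
+     return_tensors="pt",
+ )
+ print(encoded["input_ids"].shape)
+ ```
+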
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+ - **Batch size:** 8
+ - **Learning rate:** 5e-5
+ - **Optimizer:** AdamW
+ - **Number of epochs:** 3
+
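+ For reference, a fine-tuning run matching the hyperparameters above might look like the sketch below. The use of the GLUE MRPC training split is an assumption (the card only names MRPC as the evaluation set), and this is not the exact script used to train the model.
+
+ ```python
+ from datasets import load_dataset
+ from transformers import (
+     AutoModelForSequenceClassification,
+     AutoTokenizer,
+     Trainer,
+     TrainingArguments,
+ )
+
+ # Assumption: train on the GLUE MRPC training split.
+ raw = load_dataset("glue", "mrpc")
+ tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
+
+ def tokenize(batch):
+     return tokenizer(batch["sentence1"], batch["sentence2"], truncation=True)
+
+ tokenized = raw.map(tokenize, batched=True)
+ model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)
+
+ args = TrainingArguments(
+     output_dir="bert-base-cased-paraphrase-classification",
+     per_device_train_batch_size=8,  # batch size: 8
+     learning_rate=5e-5,             # learning rate: 5e-5
+     num_train_epochs=3,             # number of epochs: 3
+     # Trainer's default optimizer is AdamW, matching the card.
+ )
+
+ trainer = Trainer(
+     model=model,
+     args=args,
+     train_dataset=tokenized["train"],
+     tokenizer=tokenizer,  # enables dynamic padding via the default collator
+ )
+ trainer.train()
+ ```
+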
+ ## Evaluation
+
+ #### Testing Data
+
+ The model was evaluated on the MRPC validation set.
+
+ #### Metrics
+
+ Accuracy: 86.27%
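+
+ A sketch of how this number could be reproduced on the MRPC validation split (assuming the GLUE version of the data; results may differ slightly by environment):
+
+ ```python
+ import numpy as np
+ import evaluate
+ from datasets import load_dataset
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments
+
+ model_id = "rushilJariwala/bert-base-cased-paraphrase-classification"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForSequenceClassification.from_pretrained(model_id)
+
+ validation = load_dataset("glue", "mrpc", split="validation")
+
+ def tokenize(batch):
+     return tokenizer(batch["sentence1"], batch["sentence2"], truncation=True)
+
+ tokenized = validation.map(tokenize, batched=True)
+
+ # Trainer handles batching, padding, and device placement for prediction.
+ trainer = Trainer(
+     model=model,
+     args=TrainingArguments(output_dir="eval-tmp", per_device_eval_batch_size=32),
+     tokenizer=tokenizer,
+ )
+ predictions = trainer.predict(tokenized)
+
+ accuracy = evaluate.load("accuracy")
+ print(accuracy.compute(
+     predictions=np.argmax(predictions.predictions, axis=-1),
+     references=predictions.label_ids,
+ ))
+ ```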
+
+ #### Summary
+
+ The model achieved an accuracy of 86.27% on the MRPC validation set.