---
language:
- en
thumbnail: "url to a thumbnail used in social sharing"
tags:
- classification
license: "mit"
datasets:
- SetFit/qqp
models:
- microsoft/deberta-v3-base
metrics:
- accuracy
- loss
widget:
- text: How is the life of a math student? Could you describe your own experiences?
  context: Which level of preparation is enough for the exam jlpt5?
  example_title: "Classification"
---

This model is Microsoft's **DeBERTaV3** (`microsoft/deberta-v3-base`) fine-tuned on **GLUE QQP** (Quora Question Pairs). Given two questions, it detects the linguistic similarity between them and predicts whether they are similar questions or duplicates.

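As a minimal sketch of how a question pair might be scored with the `transformers` library: the card does not state the Hub repository id of this checkpoint, so `model_id` is left as a parameter, and the names `LABELS`, `decode`, and `predict_duplicate` are illustrative. The label order assumes the GLUE QQP convention (index 1 means duplicate); check the checkpoint's `id2label` mapping before relying on it.

```python
import math

# Assumption: index 1 means "duplicate", following the GLUE QQP label convention.
LABELS = ("not_duplicate", "duplicate")


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


def predict_duplicate(question1, question2, model_id):
    """Score one question pair; `model_id` is the Hub id or local path of this checkpoint."""
    # Imported lazily so the helpers above work without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    # Passing two strings encodes them together as one sequence pair.
    inputs = tokenizer(question1, question2, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return decode(logits)
```

Calling `predict_duplicate(q1, q2, model_id)` with the two questions and the checkpoint's repository id returns a `(label, probability)` tuple.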

## Model Hyperparameters

```python
epoch=4
per_device_train_batch_size=32
per_device_eval_batch_size=16
lr=2e-5
weight_decay=1e-2
gradient_checkpointing=True
gradient_accumulation_steps=8
```
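
With gradient accumulation, the batch size seen by each optimizer step is larger than the per-device value. The sketch below restates the hyperparameters using the keyword names of `transformers.TrainingArguments` (an assumption about how training was run) and computes the effective batch size. The card does not say how many devices were used, so `num_devices=1` is also an assumption.

```python
# Values copied from the hyperparameter listing; the keyword names follow
# transformers.TrainingArguments, which is an assumption about the training setup.
training_kwargs = {
    "num_train_epochs": 4,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 16,
    "learning_rate": 2e-5,
    "weight_decay": 1e-2,
    "gradient_checkpointing": True,
    "gradient_accumulation_steps": 8,
}


def effective_batch_size(kwargs, num_devices=1):
    """Number of training examples that contribute to one optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices  # assumption: the device count is not stated in the card
    )


print(effective_batch_size(training_kwargs))  # 32 * 8 * 1 = 256
```

These keyword arguments could be unpacked into `TrainingArguments` together with an `output_dir` of your choosing.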

## Model Performance

```JSON
{
  "Training Loss": 0.132400,
  "Validation Loss": 0.217410,
  "Validation Accuracy": 0.917969
}
```

## Model Dependencies

```JSON
{
  "Main Model": "microsoft/deberta-v3-base",
  "Dataset": "SetFit/qqp"
}
```

## Information Citation

```bibtex
@inproceedings{
he2021deberta,
title={{DeBERTa}: Decoding-enhanced {BERT} with Disentangled Attention},
author={Pengcheng He and Xiaodong Liu and Jianfeng Gao and Weizhu Chen},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=XPZIaotutsD}
}
```