Update model description
README.md
CHANGED
@@ -6,6 +6,11 @@ This model contains just the `IPUConfig` files for running the DeBERTa-base mode
 
 **This model contains no model weights, only an IPUConfig.**
 
+## Model description
+
+DeBERTa ([Decoding-enhanced BERT with Disentangled Attention](https://arxiv.org/abs/2006.03654)) improves the BERT and RoBERTa models using a disentangled attention mechanism and an enhanced mask decoder, which replaces the output softmax layer to predict the masked tokens during pretraining.
+Through these two techniques, it significantly improves the efficiency of model pre-training and the performance of downstream tasks.
+
 ## Usage
 
 ```
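The disentangled attention described in the added paragraph splits each attention score into content-to-content, content-to-position, and position-to-content terms, computed from separate content and relative-position vectors. A minimal NumPy sketch of that scoring follows; the dimensions, random toy weights, distance clipping, and scaling are illustrative assumptions, not DeBERTa's actual implementation:

```python
import numpy as np

# Toy sketch of DeBERTa-style disentangled attention (illustrative only).
rng = np.random.default_rng(0)
seq_len, d = 4, 8
H = rng.standard_normal((seq_len, d))        # content vectors, one per token
P = rng.standard_normal((2 * seq_len, d))    # relative-position vectors

Wq_c = rng.standard_normal((d, d))           # content query projection
Wk_c = rng.standard_normal((d, d))           # content key projection
Wq_r = rng.standard_normal((d, d))           # position query projection
Wk_r = rng.standard_normal((d, d))           # position key projection

Qc, Kc = H @ Wq_c, H @ Wk_c
Qr, Kr = P @ Wq_r, P @ Wk_r

def rel(i, j):
    """Clipped relative distance, shifted to index into P."""
    return int(np.clip(i - j + seq_len, 0, 2 * seq_len - 1))

# A[i, j] = content-to-content + content-to-position + position-to-content
A = np.zeros((seq_len, seq_len))
for i in range(seq_len):
    for j in range(seq_len):
        c2c = Qc[i] @ Kc[j]                  # content attends to content
        c2p = Qc[i] @ Kr[rel(i, j)]          # content attends to position
        p2c = Kc[j] @ Qr[rel(j, i)]          # position attends to content
        A[i, j] = (c2c + c2p + p2c) / np.sqrt(3 * d)

# Softmax over keys turns scores into attention weights.
weights = np.exp(A - A.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)
```

Each row of `weights` is a probability distribution over the tokens a given position attends to; the three score terms are what distinguish this from standard single-matrix self-attention.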