Update README.md
README.md CHANGED
@@ -12,7 +12,7 @@ widget:
 
 ## DeBERTa: Decoding-enhanced BERT with Disentangled Attention
 
-[DeBERTa](https://arxiv.org/abs/2006.03654) improves the BERT and RoBERTa models using disentangled attention and an enhanced mask decoder.
+[DeBERTa](https://arxiv.org/abs/2006.03654) improves the BERT and RoBERTa models using disentangled attention and an enhanced mask decoder. It outperforms BERT and RoBERTa on the majority of NLU tasks with 80GB of training data.
 
 Please check the [official repository](https://github.com/microsoft/DeBERTa) for more details and updates.
 
@@ -41,8 +41,8 @@ We present the dev results on SQuAD 1.1/2.0 and several GLUE benchmark tasks.
 ```bash
 cd transformers/examples/text-classification/
 export TASK_NAME=mrpc
-python -m torch.distributed.launch --nproc_per_node=8 run_glue.py --model_name_or_path microsoft/deberta-v2-xxlarge
---task_name $TASK_NAME --do_train --do_eval --max_seq_length 128 --per_device_train_batch_size 4
+python -m torch.distributed.launch --nproc_per_node=8 run_glue.py --model_name_or_path microsoft/deberta-v2-xxlarge \
+--task_name $TASK_NAME --do_train --do_eval --max_seq_length 128 --per_device_train_batch_size 4 \
 --learning_rate 3e-6 --num_train_epochs 3 --output_dir /tmp/$TASK_NAME/ --overwrite_output_dir --sharded_ddp --fp16
 ```
 
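The paragraph changed in the first hunk is the card's only description of the model, so a minimal loading sketch may help readers; this is illustrative rather than part of the README diff, and it assumes a transformers release with DeBERTa-v2 support and the sentencepiece package installed:

```python
# Minimal sketch: load microsoft/deberta-v2-xxlarge and run one forward pass.
# Assumes a transformers release with DeBERTa-v2 support and `sentencepiece` installed.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v2-xxlarge")
model = AutoModel.from_pretrained("microsoft/deberta-v2-xxlarge")

inputs = tokenizer("DeBERTa improves BERT with disentangled attention.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```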
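The command in the second hunk writes the fine-tuned MRPC checkpoint to `--output_dir`. As a hedged sketch, not part of the diff, here is one way to load that checkpoint back for paraphrase classification; it assumes the run completed and saved to /tmp/mrpc/, and that labels follow GLUE MRPC's convention (0 = not a paraphrase, 1 = paraphrase):

```python
# Sketch: load the MRPC checkpoint that run_glue.py saved to --output_dir.
# Assumes the fine-tuning run above finished and wrote /tmp/mrpc/ (TASK_NAME=mrpc).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

ckpt = "/tmp/mrpc/"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForSequenceClassification.from_pretrained(ckpt)
model.eval()

# MRPC is a sentence-pair task: are the two sentences paraphrases?
enc = tokenizer("He said hi.", "He greeted me.", return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits
# Assumed label order per GLUE MRPC: index 0 = not_equivalent, index 1 = equivalent.
print(logits.softmax(dim=-1))
```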