Commit 2b201b4 (1 parent: fe8e92d), committed by DeBERTa

Update README.md

Files changed (1)
  1. README.md +13 -0
README.md CHANGED
@@ -29,6 +29,19 @@ We present the dev results on SQuAD 1.1/2.0 and several GLUE benchmark tasks.
 |**DeBERTa-XXLarge-V2-mnli**| - | - |**91.7/91.8**| - | - | - | 93.5 | - | - |- |
 
 
+## Note
+
+To try the **XXLarge** model with **HF transformers**, you need to specify **--sharded_ddp**
+
+```bash
+
+cd transformers/examples/text-classification/
+
+python -m torch.distributed.launch --nproc_per_node=8 run_glue.py --model_name_or_path microsoft/deberta-xxlarge-v2 \
+--task_name $TASK_NAME --do_train --do_eval --max_seq_length 128 --per_device_train_batch_size 4 \
+--learning_rate 3e-6 --num_train_epochs 3 --output_dir /tmp/$TASK_NAME/ --overwrite_output_dir --sharded_ddp
+```
+
 ### Citation
 
 If you find DeBERTa useful for your work, please cite the following paper:
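
The added snippet reads `$TASK_NAME` without setting it, so the command depends on the caller's environment. The following is a minimal dry-run sketch that assembles the same command and prints it instead of launching training. The default task `mnli`, the `NUM_GPUS` variable, and the `build_glue_cmd` function are illustrative assumptions and are not part of the commit; the model name and all flags are copied from the diff as-is.

```bash
#!/usr/bin/env bash
# Dry-run sketch: builds the run_glue.py command from the README note and prints it.
set -euo pipefail

# Assumption: mnli is only an illustrative default; the README leaves TASK_NAME unset.
TASK_NAME="${TASK_NAME:-mnli}"
# Assumption: NUM_GPUS is a hypothetical helper variable; the README hard-codes 8.
NUM_GPUS="${NUM_GPUS:-8}"

# Hypothetical helper: prints the full launch command on one line.
build_glue_cmd() {
  local cmd=(
    python -m torch.distributed.launch "--nproc_per_node=${NUM_GPUS}" run_glue.py
    --model_name_or_path microsoft/deberta-xxlarge-v2
    --task_name "${TASK_NAME}" --do_train --do_eval
    --max_seq_length 128 --per_device_train_batch_size 4
    --learning_rate 3e-6 --num_train_epochs 3
    --output_dir "/tmp/${TASK_NAME}/" --overwrite_output_dir
    --sharded_ddp
  )
  echo "${cmd[*]}"
}

build_glue_cmd
```

To launch for real, the printed line can be piped to `bash` from inside `transformers/examples/text-classification/`, on a machine with the required GPUs and dependencies installed.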