DeBERTa committed
Commit: 98da113
Parent: 8711b9e

Update README.md

Files changed (1):
  1. README.md (+9 -0)
README.md CHANGED
@@ -69,6 +69,15 @@ python -m torch.distributed.launch --nproc_per_node=${num_gpus} \
 If you find DeBERTa useful for your work, please cite the following paper:
 
 ``` latex
+@misc{he2021debertav3,
+      title={DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing},
+      author={Pengcheng He and Jianfeng Gao and Weizhu Chen},
+      year={2021},
+      eprint={2111.09543},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+
 @inproceedings{
 he2021deberta,
 title={DEBERTA: DECODING-ENHANCED BERT WITH DISENTANGLED ATTENTION},