---
language: en
tags:
- deberta
- fill-mask
license: mit
pipeline_tag: text-generation
---
# DeBERTa (1.4B) fixed version
This is [**deberta-v2-xxlarge**](https://huggingface.co/microsoft/deberta-v2-xxlarge) updated to implement the `AutoModelForCausalLM` class, enabling it to generate text. This implementation is based on our paper [**"BERTs are Generative In-Context Learners"**](https://arxiv.org/abs/2406.04823).
This repository also fixes three bugs in [the original HF implementation of DeBERTa](https://huggingface.co/microsoft/deberta-v2-xxlarge):
1. We fixed the incorrect name of the output embedding weights in the checkpoint file;
2. We fixed the implementation of the enhanced mask decoder (EMD), based on [the original GitHub repository](https://github.com/microsoft/DeBERTa);
3. We clamp the positional embeddings so that they work with long sequence lengths.
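The third fix can be illustrated with a small, self-contained sketch (a hypothetical simplification, not the repository's actual code): relative distances between query and key positions are clamped into the range covered by the relative-position embedding table, so longer sequences never index past it. The `max_rel=512` value below is illustrative, not the model's configuration.

```python
def clamped_relative_positions(query_len, key_len, max_rel=512):
    """Sketch of fix 3: clamp relative distances i - j into
    [-max_rel, max_rel - 1] so sequences longer than the
    position-embedding table cannot index out of its bounds.
    (max_rel=512 is an illustrative value, not the model's config.)"""
    return [
        [max(-max_rel, min(i - j, max_rel - 1)) for j in range(key_len)]
        for i in range(query_len)
    ]

# A 2000-token sequence: raw distances reach +/-1999, but the
# clamped values stay inside the table's range.
rel = clamped_relative_positions(2000, 2000)
```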
## Example code
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("ltg/deberta-xxlarge-fixed", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("ltg/deberta-xxlarge-fixed", trust_remote_code=True).cuda().eval()
prompt = """German: Hallo, wie geht es Ihnen heute?
English:"""
prompt = prompt.replace('\n', '\\n ')
input_ids = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).input_ids.cuda()
prediction = model.generate(
    input_ids,
    num_beams=4,
    do_sample=False,
    use_cache=None,
    max_new_tokens=64,
    eos_token_id=tokenizer(".\\", add_special_tokens=False).input_ids[1:]
)
prediction = prediction[0, input_ids.size(1):]
prediction = tokenizer.decode(prediction).rstrip('\\')
# Expected output: "Hello, how are you doing today?"
print(prediction)
```
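Two details of the example above are easy to miss. DeBERTa's vocabulary has no dedicated newline token, which is presumably why the prompt encodes each newline as the literal string `\n ` before tokenization; and because generation stops on the token sequence for `.\`, the decoded output can end with a stray backslash, which the example strips with `rstrip`. A minimal restatement of those two steps (helper names are ours, not the repository's):

```python
def encode_newlines(prompt: str) -> str:
    # Replace each real newline with the literal two-character
    # sequence "\n" plus a space, matching the prompt preprocessing
    # in the example above.
    return prompt.replace('\n', '\\n ')

def strip_stop_backslash(text: str) -> str:
    # Generation stops on the ".\" token sequence, so the decoded
    # text may end with a leftover backslash; drop it.
    return text.rstrip('\\')
```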
## Citation
If you find this model useful for your work, please cite the following papers:
```bibtex
@misc{samuel2024berts,
title={{BERTs} are Generative In-Context Learners},
author={David Samuel},
year={2024},
eprint={2406.04823},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2406.04823}
}
```
```bibtex
@inproceedings{he2021deberta,
title={{DeBERTa}: Decoding-enhanced {BERT} with disentangled attention},
author={Pengcheng He and Xiaodong Liu and Jianfeng Gao and Weizhu Chen},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=XPZIaotutsD}
}
```