---
language: en
license: apache-2.0
datasets:
- discofuse
---

# Roberta2Roberta_L-24_discofuse EncoderDecoder model

The model was introduced in [this paper](https://arxiv.org/abs/1907.12461) by Sascha Rothe, Shashi Narayan, and Aliaksei Severyn, and first released in [this repository](https://tfhub.dev/google/bertseq2seq/roberta24_discofuse/1).

The model is an encoder-decoder model whose encoder and decoder were both initialized from the `roberta-large` checkpoint.
It was then fine-tuned for sentence fusion on the DiscoFuse dataset, which is linked above.

Disclaimer: The model card has been written by the Hugging Face team.

## How to use

You can use this model for sentence fusion, as shown in the example below.

IMPORTANT: The model was not trained on the `"` (double quotation mark) character, so before tokenizing the text you should replace every `"` with a single `` ` `` (backtick).

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and the fine-tuned encoder-decoder model
tokenizer = AutoTokenizer.from_pretrained("google/roberta2roberta_L-24_discofuse")
model = AutoModelForSeq2SeqLM.from_pretrained("google/roberta2roberta_L-24_discofuse")

# Two sentences to be fused into one coherent passage
discofuse = """As a run-blocker, Zeitler moves relatively well. Zeitler often struggles at the point of contact in space."""

# Tokenize, generate the fused text, and decode it back to a string
input_ids = tokenizer(discofuse, return_tensors="pt").input_ids
output_ids = model.generate(input_ids)[0]
print(tokenizer.decode(output_ids, skip_special_tokens=True))
# should output
# As a run-blocker, Zeitler moves relatively well. However, Zeitler often struggles at the point of contact in space.
```
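
The quotation-mark advice above can be applied with a small preprocessing step before tokenization. The following is a minimal sketch, not part of the original release: `replace_double_quotes` is a hypothetical helper name, and it assumes that only the straight ASCII `"` character needs to be replaced (curly quotes are left untouched).

```python
def replace_double_quotes(text: str) -> str:
    """Replace every straight double quotation mark with a single backtick.

    Hypothetical helper for illustration; it is not part of `transformers`.
    """
    return text.replace('"', "`")


# Example input containing double quotation marks
raw = 'He said "hello" and left. He came back "later".'
cleaned = replace_double_quotes(raw)
print(cleaned)
# prints: He said `hello` and left. He came back `later`.
```

The cleaned string can then be passed to the tokenizer in place of the raw text.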