josh-oo and MiriUll committed
Commit cd3f9d0
Parent: 96fb662

Update README.md (#1)


- Update README.md (099acecc9c01b7345a43e4316ff30aa25502b5e7)


Co-authored-by: Miriam Anschütz <MiriUll@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +21 -1
README.md CHANGED
@@ -1,3 +1,12 @@
+---
+license: mit
+language:
+- de
+---
+# German text simplification with custom decoder
+This model was initialized from an mBART model and the decoder was replaced by a GPT2 language model pre-trained for German Easy Language. For more details, visit our [Github repository](https://github.com/MiriUll/Language-Models-German-Simplification).
+
+## Usage
 ```python
 import torch
 from transformers import AutoTokenizer
@@ -30,4 +39,15 @@ for key, value in test_input.items():
 
 outputs = model.generate(**test_input, num_beams=3, max_length=1024)
 decoder_tokenizer.batch_decode(outputs)
-```
+```
+
+## Citation
+If you use our model, please cite:
+@misc{anschütz2023language,
+  title={Language Models for German Text Simplification: Overcoming Parallel Data Scarcity through Style-specific Pre-training},
+  author={Miriam Anschütz and Joshua Oehms and Thomas Wimmer and Bartłomiej Jezierski and Georg Groh},
+  year={2023},
+  eprint={2305.12908},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL}
+}