---
license: apache-2.0
language:
- de
pipeline_tag: text-generation
tags:
- german
- deutsch
- simplification
- vereinfachung
---

# Model Card for frhew/sigdial_ft_b1

This model was used in the experiments for our paper [Elaborative Simplification for German-Language Texts](https://aclanthology.org/2024.sigdial-1.3). We have uploaded it for the transparency and replicability of our experiments. If, however, you are interested in German text simplification in general, we recommend [our more recent model](https://huggingface.co/hiig-piai/simba_best_092024).

We fine-tuned [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on a set of ca. 2000 newspaper articles that were simplified by the Austrian Press Agency. This model was trained on the standard and the B1-level texts.

## Model Details

### Model Description

- **Developed by:** Freya Hewett, Hadi Asghari
- **Model type:** simplification model, text generation
- **Language(s) (NLP):** German
- **License:** Apache 2.0
- **Finetuned from model:** meta-llama/Meta-Llama-3-8B-Instruct

### Model Sources

- **Repository:** [GermanElabSimplification](https://github.com/fhewett/GermanElabSimplification/tree/main)
- **Paper:** [Elaborative Simplification for German-Language Texts](https://aclanthology.org/2024.sigdial-1.3)

## Uses

### Direct Use

This model works best for simplifying German-language newspaper articles (news items, not commentaries or editorials). It may work for other types of texts.

### Downstream Use

We fine-tuned on newspaper articles only. We have not yet performed extensive out-of-domain testing, but we believe that the model's capabilities could be improved by fine-tuning on more diverse data.

## Bias, Risks, and Limitations

As with most text generation models, this model sometimes produces information that is incorrect.

### Recommendations

Please check manually that the output text corresponds to the input text, as factual inconsistencies may have been introduced.

## How to Get Started with the Model

To load the model using transformers:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

device = "cuda"
tokenizer = AutoTokenizer.from_pretrained("frhew/sigdial_ft_b1")
model = AutoModelForCausalLM.from_pretrained("frhew/sigdial_ft_b1", torch_dtype=torch.float16).to(device)
```

We used the following prompt at inference time to test our model:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Du bist ein hilfreicher Assistent und hilfst dem User, Texte besser zu verstehen.<|eot_id|><|start_header_id|>user<|end_header_id|>

Kannst du bitte den folgenden Text zusammenfassen und sprachlich auf ein B1-Niveau in Deutsch vereinfachen? Schreibe maximal 5 Sätze.

{input_text}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```

(In English, the prompt asks the model to summarise the following text and simplify it linguistically to German B1 level, in at most five sentences.)
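The snippet below is a minimal end-to-end sketch of how this prompt can be filled in and passed to the model. It is not taken from our experiments: the placeholder input text, the exact whitespace in the prompt string, and the generation settings (such as `max_new_tokens`) are illustrative assumptions, not the values used for the paper.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

device = "cuda"
tokenizer = AutoTokenizer.from_pretrained("frhew/sigdial_ft_b1")
model = AutoModelForCausalLM.from_pretrained(
    "frhew/sigdial_ft_b1", torch_dtype=torch.float16
).to(device)

# Placeholder article; replace with the German newspaper text to simplify.
input_text = "..."

# The prompt shown above, with the input text inserted.
prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "Du bist ein hilfreicher Assistent und hilfst dem User, Texte besser zu verstehen."
    "<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
    "Kannst du bitte den folgenden Text zusammenfassen und sprachlich auf ein "
    f"B1-Niveau in Deutsch vereinfachen? Schreibe maximal 5 Sätze.\n\n{input_text}"
    "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)

# The prompt already contains <|begin_of_text|>, so do not add special tokens again.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=256,  # illustrative value, not the setting used in the paper
        eos_token_id=tokenizer.convert_tokens_to_ids("<|eot_id|>"),  # stop at end of turn
    )

# Decode only the newly generated tokens, i.e. the simplified text.
simplified = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(simplified)
```

Building the prompt via `tokenizer.apply_chat_template` with the same system and user messages should produce an equivalent string.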
## Training Details

### Training Data

A sample of the data used to train our model can be found [here](https://github.com/fhewett/apa-rst/tree/main/original_texts).

#### Training Hyperparameters

## Evaluation

The right-hand side of the table shows the results of the manual evaluation, which was carried out on the outputs from each model for 35 texts. M.P. stands for meaning preservation, S for simplification, C for coherence, and F for factuality; each score is the percentage of *yes* answers. More details on the evaluation can be found in the paper. For all metrics, higher is better.

| **Model** | **Prompt** | **Test set** | **SARI** | **FRE** | **M.P.** | **S** | **C** | **F** | **Avg.** |
|-----------|------------|--------------|----------|---------|----------|-------|-------|-------|----------|
| Baseline  | Basic      | A2           | 41.2     | 59.4    | .89      | .38   | .96   | .84   | .77      |
| FT-A2     | Basic      | A2           | 44.0     | 70.6    | .49      | .82   | .56   | .64   | .63      |
| Baseline  | Basic      | B1           | 42.3     | 56.8    | .85      | .4    | .9    | .9    | .76      |
| FT-B1     | Basic      | B1           | 42.4     | 60.0    | .75      | .55   | .6    | .75   | .66      |

#### Summary

## Citation

**BibTeX:**

```bibtex
@inproceedings{hewett-etal-2024-elaborative,
    title = "Elaborative Simplification for {G}erman-Language Texts",
    author = "Hewett, Freya and Asghari, Hadi and Stede, Manfred",
    editor = "Kawahara, Tatsuya and Demberg, Vera and Ultes, Stefan and Inoue, Koji and Mehri, Shikib and Howcroft, David and Komatani, Kazunori",
    booktitle = "Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue",
    month = sep,
    year = "2024",
    address = "Kyoto, Japan",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.sigdial-1.3",
    doi = "10.18653/v1/2024.sigdial-1.3",
    pages = "29--39"
}
```

**APA:**

Freya Hewett, Hadi Asghari, and Manfred Stede. 2024. Elaborative Simplification for German-Language Texts. In Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 29–39, Kyoto, Japan. Association for Computational Linguistics.

## Model Card Contact

frhew