---
language: de
license: mit
inference: false
tags:
- gptj
- title generation
- headline generation
- teaser generation
- news
---

# GPT-J-Title-Teaser-1k

gptj-title-teaser-1k
Version 1.0 / 22 December 2022

A proof of concept for multitask fine-tuning of [GPT-J-6B-8bit](https://huggingface.co/hivemind/gpt-j-6B-8bit) for German news title and teaser generation.

# Model Details

## Model Description

- **Developed by:** snipaid
- **Model type:** gptj
- **Language(s) (NLP):** de
- **License:** MIT
- **Finetuned from model:** [GPT-J-6B-8bit](https://huggingface.co/hivemind/gpt-j-6B-8bit)

# Uses

This model is not intended for use! It is a preliminary version of gptj-title-teaser-10k, published to validate the multitask fine-tuning approach. For actual use, please refer to [gptj-title-teaser-10k](https://huggingface.co/snipaid/gptj-title-teaser-10k).

# Training Details

## Training Data

The model was fine-tuned on a collection of 1,000 news items scraped from various German-language online news outlets. For each news item, the dataset contains the title, teaser and full text.

```
[
  {
    "title": ...,
    "teaser": ...,
    "fulltext": ...
  },
]
```

## Training Procedure

The model was fine-tuned on both tasks jointly using a causal language modeling (CLM) objective.

### Preprocessing

For each news item, two input strings were built by concatenating the full text with the title and with the teaser, as shown below.

```
f"[Text]: {item.fulltext} \n [Title]: {item.title}"
f"[Text]: {item.fulltext} \n [Teaser]: {item.teaser}"
```

This results in one input per task for each news item.

*Note: The inserted prompt "[Text]:" marks the beginning of the news item's full text. In the same manner, "[Title]:" prompts the news item's title and "[Teaser]:" the news item's teaser.*

# Evaluation

1,000 German news articles proved to be sufficient to validate the approach.
Evaluation showed that the model improved compared to the GPT-J baseline in:

- German language capabilities (significantly)
- title generation (significantly)
- teaser generation (slightly)

The evaluation also suggested that there is still room for improvement with more data. For the model trained with the same approach but 10x the amount of data, please refer to [gptj-title-teaser-10k](https://huggingface.co/snipaid/gptj-title-teaser-10k).

# Environmental Impact

Carbon emissions were estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** A100 SXM4
- **Hours used:** 2h 42min
- **Cloud Provider:** Vast.ai
- **Compute Region:** Unknown
- **Carbon Emitted:** ~0.47 kg CO2e

# Glossary

**News Item**, aka news article. A particular piece of news, usually from a journalistic source.

**Snippet**, a small section of text that is related to a news item.

**Title**, aka headline. A few words that reflect the essence of the news story.

**Teaser**, aka lede. A few sentences that spark curiosity about the "best of the rest" of the news story.
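
# Appendix: Preprocessing Sketch

The concatenation described under Preprocessing could be sketched in Python roughly as follows. This is a minimal illustration and not the original training code: the `NewsItem` dataclass, the helper names `build_inputs` and `build_corpus`, and the sample record are hypothetical. Only the two prompt templates and the three field names are taken from this card.

```python
from dataclasses import dataclass


@dataclass
class NewsItem:
    """One dataset record (hypothetical container; field names as in the card)."""
    title: str
    teaser: str
    fulltext: str


def build_inputs(item: NewsItem) -> list[str]:
    """Return one training string per task (title, teaser) for a news item."""
    return [
        f"[Text]: {item.fulltext} \n [Title]: {item.title}",
        f"[Text]: {item.fulltext} \n [Teaser]: {item.teaser}",
    ]


def build_corpus(records: list[dict]) -> list[str]:
    """Flatten {"title", "teaser", "fulltext"} dicts into training strings."""
    corpus = []
    for record in records:
        corpus.extend(build_inputs(NewsItem(**record)))
    return corpus


# Made-up sample record, for illustration only.
records = [
    {
        "title": "Beispieltitel",
        "teaser": "Ein kurzer Teaser.",
        "fulltext": "Der Volltext.",
    },
]
corpus = build_corpus(records)
```

Because each news item contributes one string per task, a dataset of 1,000 items yields 2,000 training strings under this scheme.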