|
--- |
|
language: |
|
- |
|
- |
|
thumbnail: |
|
tags: |
|
- |
|
- |
|
- |
|
license: |
|
datasets: |
|
- |
|
- |
|
metrics: |
|
- |
|
- |
|
--- |
|
|
|
# GPT-2 GERMAN |
|
|
|
## Model description |
|
|
|
TODO |
|
## Intended uses & limitations |
|
|
|
#### How to use |
|
|
|
```python |
|
# You can include sample code which will be formatted |
|
``` |
|
|
|
#### Limitations and bias |
|
|
|
Provide examples of latent issues and potential remediations. |
|
|
|
## Training data |
|
|
|
The training data comes from the `german_common_crawl` dataset: https://huggingface.co/datasets/german-nlp-group/german_common_crawl. An example record: |
|
|
|
```json |
|
{"url": "http://my-shop.ru/shop/books/545473.html", |
|
"date_download": "2016-10-20T19:38:58Z", |
|
"digest": "sha1:F62EMGYLZDIKF4UL5JZYU47KWGGUBT7T", |
|
"length": 1155, |
|
"nlines": 4, |
|
"source_domain": "my-shop.ru", |
|
"title": "Grammatikalische Liebeslieder. Methodische Vorschläge", |
|
"raw_content": "Grammatikalische Liebeslieder. [....]", |
|
"cc_segment": "crawl-data/CC-MAIN-2016-44/segments/1476988717783.68/wet/CC-MAIN-20161020183837-00354-ip-10-171-6-4.ec2.internal.warc.wet.gz", |
|
"original_nlines": 99, |
|
"original_length": 2672, |
|
"language": "de", |
|
"language_score": 1.0, |
|
"perplexity": 283.0, |
|
"bucket": "head"} |
|
``` |
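
As a rough sketch of how records with these fields might be filtered before use, the snippet below keeps German-language records from the `head` bucket and collects their `raw_content`. The `keep_record` and `extract_texts` helpers, the score threshold, and the second sample record are assumptions made for illustration; they are not the preprocessing used for this model.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical thresholds, chosen only for illustration.
MIN_LANGUAGE_SCORE = 0.98
ALLOWED_BUCKETS = {"head"}


def keep_record(record: Dict[str, Any]) -> bool:
    """Return True if the record is German, confidently so, and in an allowed bucket."""
    return (
        record.get("language") == "de"
        and record.get("language_score", 0.0) >= MIN_LANGUAGE_SCORE
        and record.get("bucket") in ALLOWED_BUCKETS
    )


def extract_texts(records: Iterable[Dict[str, Any]]) -> List[str]:
    """Collect the raw_content of every record that passes keep_record."""
    return [r["raw_content"] for r in records if keep_record(r)]


# A trimmed copy of the record shown above.
sample = {
    "title": "Grammatikalische Liebeslieder. Methodische Vorschläge",
    "raw_content": "Grammatikalische Liebeslieder. [....]",
    "language": "de",
    "language_score": 1.0,
    "perplexity": 283.0,
    "bucket": "head",
}

# A made-up variant that should be rejected by the filter.
rejected = dict(sample, language="ru", language_score=0.5)

texts = extract_texts([sample, rejected])
print(texts)
```

Only fields present in the example record are read, so the same helpers can be applied to any iterable of dictionaries with that shape.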
|
|
|
## Training procedure |
|
|
|
TODO (See training.md) |
|
|
|
## Eval results |
|
|
|
### BibTeX entry and citation info |
|
|
|
```bibtex |
|
@inproceedings{..., |
|
year={2021} |
|
} |
|
``` |
|
|