Edit model card

German ELECTRA large generator

Released, Oct 2020, this is the generator component of the German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka bert-base-german-dbmdz-cased). In our paper, we outline the steps taken to train our model.

The generator is useful for performing masking experiments. If you are looking for a regular language model for embedding extraction, or downstream tasks like NER, classification or QA, please use deepset/gelectra-large.

Overview

Paper: here
Architecture: ELECTRA large (generator)
Language: German

Performance

GermEval18 Coarse: 80.70
GermEval18 Fine:   55.16
GermEval14:        88.95

See also:
deepset/gbert-base deepset/gbert-large deepset/gelectra-base deepset/gelectra-large deepset/gelectra-base-generator deepset/gelectra-large-generator

Authors

Branden Chan: branden.chan [at] deepset.ai Stefan Schweter: stefan [at] schweter.eu Timo Möller: timo.moeller [at] deepset.ai

About us

deepset logo

We bring NLP to the industry via open source!
Our focus: Industry specific language models & large scale QA systems.

Some of our work:

Get in touch: Twitter | LinkedIn | Slack | GitHub Discussions | Website

By the way: we're hiring!

Downloads last month
19
Safetensors
Model size
51.9M params
Tensor type
I64
·
F32
·

Datasets used to train deepset/gelectra-large-generator