|
--- |
|
language: de |
|
widget: |
|
- text: >- |
|
Diese Themen gehören nicht ins [MASK]. |
|
--- |
|
|
|
### Welcome to ParlBERT-German! |
|
|
|
🏷 **Model description**: |
|
|
|
**ParlBERT-German** is a domain-specific language model. It was created through continuous pre-training: a generic German language model (GermanBERT) served as the foundation and was further trained on domain-specific data. We used [DeuParl](https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/2889?show=full) as the domain-specific dataset for continuous pre-training, which gave **ParlBERT-German** a better understanding of the language and context used in parliamentary debates. The result is a specialized language model that can be used in related scenarios.
|
|
|
|
|
🤖 **Model training** |
|
|
|
During training, a masked language modeling objective was used with a token masking probability of 15%. Training ran for a single epoch, i.e. the entire dataset was passed through the model once.
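
For orientation, the continued pre-training setup can be approximated with the 🤗 Transformers `Trainer`. The following is a minimal sketch, not the exact training script: it assumes GermanBERT is available as `bert-base-german-cased` on the Hub and that the DeuParl debates are provided as a plain-text file (the file name `deuparl.txt` and `max_length` are placeholders); only the 15% masking probability and the single epoch come from the description above.

```python
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

# Start from the generic German model (assumed Hub id for GermanBERT)
tokenizer = AutoTokenizer.from_pretrained("bert-base-german-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-german-cased")

# Load and tokenize the domain corpus (file name is a placeholder)
dataset = load_dataset("text", data_files={"train": "deuparl.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

# Mask 15% of tokens, as stated above
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

# Continue pre-training for a single epoch
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="parlbert-german", num_train_epochs=1),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```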
|
|
|
👨‍💻 **Model Use**
|
|
|
```python
from transformers import pipeline

# Load the fill-mask pipeline with ParlBERT-German
model = pipeline('fill-mask', model='parlbert-german')

# Predict the most likely tokens for the [MASK] position
model("Diese Themen gehören nicht ins [MASK].")
```
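
The pipeline returns a ranked list of candidate fills for the `[MASK]` token, each with its probability (`score`), the predicted token (`token_str`), and the completed sentence (`sequence`).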
|
|
|
⚠️ **Limitations** |
|
|
|
Language models are often highly domain-dependent. The model may therefore perform less well on domains and text types not represented in the training data.
|
|
|
|
|
🐦 Twitter: [@chklamm](http://twitter.com/chklamm) |