atzenhofer
/

distilroberta-base-mhg-charter-mlm

Middle High German (ca. 1050-1500)

	---
	license: gpl-3.0
	language:
	- gmh
	- de
	widget:
	- text: >-
	    Ich Ott von Zintzndorff vergich mit dem offenn prief vnd tun chunt alln den
	    leutn, di in sehnt oder hornt lesn, daz ich mit wolbedachtm mut vnd mit
	    guetem rat vnd willn czu der czeit, do ich ez wol getun mochtt, den erbern
	    herrn vnd fursten apt Englschalchn cze Seydensteten vnd sein gnants Gotshavs
	    daselbs gancz vnd gar ledig sage vnd lazze aller der ansproch, die ich ...
	    han auf seiner guter ains, des Schoephls lehn auf dem Graentleinsperg gnant
	    in Groestner pharr gelegn, also, daz ich vnd alle mein erbn furbaz dhain
	    ansprach dar vmb habn welln noch schulln in dhainn wegn, weder wenig noch
	    vil. Vnd dar vmb czu eine steten vrchund gib ich dem vorgnantn Apt
	    Englchalchn vnd seim wirdign Gotshaws cze Seydenstet den prief, versigelt
	    mit meim des egnantn Ottn von Czintzndorff, vnd mit hern Dytrichs des
	    Schenchn von Dobra anhangunden Insigeln, der das durch meinn willn cze
	    gezeug der obgeschribn sach an den prief hat gehang. Das ist geschehn vnd
	    der prief ist gebn nach Christs gepurd vber Drewtzehn hundert Jar, dar nach
	    im Sibn vnd fumftzgisten Jar, am Eritag in den Phingstveyrtagn.
	---

	# DistilRoBERTa (base) Middle High German Charter Masked Language Model
	This model is a fine-tuned version of distilroberta-base, trained on Middle High German (gmh; ISO 639-2; c. 1050–1500) charters from the [monasterium.net](https://www.icar-us.eu/en/cooperation/online-portals/monasterium-net/) data set.

	## Model description
	For additional information, please refer to the [distilroberta (base-sized model)](https://huggingface.co/distilroberta-base) card or the paper [DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter by Sanh et al.](https://arxiv.org/abs/1910.01108).

	## Intended uses & limitations
	This model can be used for masked-token prediction, i.e., fill-mask tasks.
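
A minimal sketch of fill-mask usage with the Hugging Face `transformers` pipeline follows. It assumes that `transformers` and a backend such as PyTorch are installed and that the model can be downloaded from the Hub. The helper names `mask_first` and `top_predictions` are illustrative and not part of the model or the library; the default `<mask>` token is the one used by RoBERTa-style tokenizers, and the sketch reads the actual token from the loaded tokenizer.

```python
MODEL_ID = "atzenhofer/distilroberta-base-mhg-charter-mlm"


def mask_first(text: str, word: str, mask_token: str = "<mask>") -> str:
    """Replace the first occurrence of `word` in `text` with the mask token."""
    if word not in text:
        raise ValueError(f"{word!r} not found in text")
    return text.replace(word, mask_token, 1)


def top_predictions(text: str, word: str, top_k: int = 5):
    """Mask `word` in `text` and return (token, score) pairs from the model."""
    # Imported lazily so mask_first() works without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    # Use the tokenizer's own mask token rather than assuming "<mask>".
    masked = mask_first(text, word, fill_mask.tokenizer.mask_token)
    return [(p["token_str"], p["score"]) for p in fill_mask(masked, top_k=top_k)]
```

For example, `top_predictions("Ich Ott von Zintzndorff vergich mit dem offenn prief vnd tun chunt alln den leutn", "prief")` would mask the word `prief` and return the model's five highest-scoring candidates for that position; the first call downloads the model weights.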

	## Training and evaluation data
	The model was fine-tuned using the Middle High German Monasterium charters.
	It was trained on an NVIDIA GeForce GTX 1660 Ti 6GB GPU.

	## Training hyperparameters
	The following hyperparameters were used during training:
	- num_train_epochs: 10
	- learning_rate: 2e-5
	- weight_decay: 0.01
	- train_batch_size: 8
	- eval_batch_size: 8
	- num_proc: 4
	- block_size: 256
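
The sketch below shows how these values might map onto a typical Hugging Face masked-language-modeling setup. This is an assumption: the card does not publish the training script. The keyword names follow `transformers.TrainingArguments`, and `group_texts` is a hypothetical helper modeled on the common practice of concatenating tokenized texts and cutting them into fixed-size blocks.

```python
from itertools import chain

# Values from the list above. The key names follow transformers.TrainingArguments;
# mapping train_batch_size/eval_batch_size to the per-device arguments is an assumption.
TRAINING_KWARGS = {
    "num_train_epochs": 10,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
}
BLOCK_SIZE = 256  # tokens per training example
NUM_PROC = 4      # worker processes, e.g. for datasets.Dataset.map(..., num_proc=NUM_PROC)


def group_texts(examples, block_size=BLOCK_SIZE):
    """Concatenate tokenized texts and cut them into blocks of `block_size` tokens.

    `examples` maps a column name (e.g. "input_ids") to a list of token lists.
    Tokens left over after the last full block are dropped.
    """
    concatenated = {
        key: list(chain.from_iterable(seqs)) for key, seqs in examples.items()
    }
    total = len(concatenated["input_ids"])
    total = (total // block_size) * block_size
    return {
        key: [tokens[i:i + block_size] for i in range(0, total, block_size)]
        for key, tokens in concatenated.items()
    }
```

With `block_size=4`, a batch of token lists `[[1, 2, 3], [4, 5], [6, 7, 8, 9]]` becomes two blocks, `[1, 2, 3, 4]` and `[5, 6, 7, 8]`; the trailing token `9` is discarded.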


	## Training results

	| Epoch | Training Loss | Validation Loss |
	|-------|---------------|-----------------|
	| 1 | 2.537000 | 2.112094 |
	| 2 | 2.053400 | 1.838937 |
	| 3 | 1.900300 | 1.706654 |
	| 4 | 1.766200 | 1.607970 |
	| 5 | 1.669200 | 1.532340 |
	| 6 | 1.619100 | 1.490333 |
	| 7 | 1.571300 | 1.476035 |
	| 8 | 1.543100 | 1.428958 |
	| 9 | 1.517100 | 1.423216 |
	| 10 | 1.508300 | 1.408235 |

	Perplexity: 4.07
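
Perplexity for a language model is commonly reported as the exponential of the mean cross-entropy loss on the evaluation set. Assuming that convention was used here (the card does not say), the sketch below applies it to the final validation loss from the table. The result is close to, but not identical to, the reported 4.07, so the reported figure presumably comes from a separate evaluation pass.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Perplexity as exp(mean cross-entropy loss, in nats)."""
    return math.exp(mean_cross_entropy)


final_validation_loss = 1.408235  # epoch 10, from the table above
perplexity = perplexity_from_loss(final_validation_loss)
print(f"{perplexity:.2f}")  # prints 4.09
```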

	## Updates
	- 2023-03-30: Upload


	## Citation
	Please cite as follows when using this model.

	```
	@misc{distilroberta-base-mhg-charter-mlm,
	  title = {distilroberta-base-mhg-charter-mlm},
	  author = {Atzenhofer-Baumgartner, Florian},
	  year = {2023},
	  url = {https://huggingface.co/atzenhofer/distilroberta-base-mhg-charter-mlm},
	  publisher = {Hugging Face}
	}
	```