	---
	license: cc-by-nc-sa-4.0
	tags:
	- generated_from_trainer
	- simplification
	task_categories:
	- text2text-generation
	task_ids:
	- text-simplification
	language:
	- nl
	datasets:
	- BramVanroy/chatgpt-dutch-simplification
	metrics:
	- rouge
	- sari
	model-index:
	- name: BramVanroy/ul2-small-dutch-simplification-mai-2023
	  results:
	  - task:
	      type: text-simplification
	      name: Text Simplification
	    dataset:
	      type: BramVanroy/chatgpt-dutch-simplification
	      name: ChatGPT Dutch Simplification
	    metrics:
	    - type: rouge
	      value: 40.9663
	      name: Eval Rouge-1
	    - type: rouge
	      value: 18.499
	      name: Eval Rouge-2
	    - type: rouge
	      value: 34.9342
	      name: Eval RougeL
	    - type: rouge
	      value: 34.9752
	      name: Eval RougeLsum
	    - type: sari
	      value: 52.4509
	      name: Eval SARI
	    - type: rouge
	      value: 39.6138
	      name: Test Rouge-1
	    - type: rouge
	      value: 17.1242
	      name: Test Rouge-2
	    - type: rouge
	      value: 35.4629
	      name: Test RougeL
	    - type: rouge
	      value: 35.3679
	      name: Test RougeLsum
	    - type: sari
	      value: 51.7538
	      name: Test SARI
	widget:
	- example_title: "Cooking"
	  text: "Op bepaalde tijdstippen verlang ik naar de smaakvolle culinaire creaties welke door de ambachtelijke expertise van mijn grootmoeder zijn vervaardigd."

	---


	# ul2-small-dutch-simplification-mai-2023

	This model is intended to simplify Dutch sentences.

	This model is a fine-tuned version of [yhavinga/ul2-small-dutch](https://huggingface.co/yhavinga/ul2-small-dutch) on
	the [BramVanroy/chatgpt-dutch-simplification](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification)
	dataset.

	The model was created as part of the master's thesis of Charlotte Van de Velde in the Master of Science in Artificial
	Intelligence (MAI) at KU Leuven in 2023. Charlotte is supervised by Vincent Vandeghinste and Bram Vanroy.
	Charlotte created the dataset and Bram trained the model.

	## Quick links

	- [Repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep): includes training code and model creation log
	- [Dataset](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification): `BramVanroy/chatgpt-dutch-simplification`
	- [Parent model](https://huggingface.co/yhavinga/ul2-small-dutch): this model was fine-tuned from `yhavinga/ul2-small-dutch`
	- [Demo](https://huggingface.co/spaces/BramVanroy/mai-simplification-nl-2023-demo): shows the "base" model in action (don't rely on the "Hosted inference API" widget on this page; it does not work very well)

	## Intended uses & limitations, and dataset

	The model is intended for sentence-level simplification of Dutch. It might extend to document-level simplification,
	but most of the dataset consists of single sentences, so document-level performance is not guaranteed.
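As an illustration of sentence-level use, here is a minimal sketch that loads the model with the `transformers` Auto classes and generates with beam search. The helper names `generation_kwargs` and `simplify`, the `max_new_tokens` value, and the choice to pass sentences as-is (without any prefix) are assumptions made for this example and are not taken from the training code.

```python
from typing import Dict, List, Sequence

MODEL_NAME = "BramVanroy/ul2-small-dutch-simplification-mai-2023"


def generation_kwargs(num_beams: int = 3, max_new_tokens: int = 64) -> Dict[str, int]:
    """Return generation settings.

    num_beams=3 matches the beam search setting reported in this card;
    max_new_tokens=64 is an assumed, illustrative limit.
    """
    return {"num_beams": num_beams, "max_new_tokens": max_new_tokens}


def simplify(sentences: Sequence[str], model_name: str = MODEL_NAME) -> List[str]:
    """Simplify a batch of Dutch sentences (hypothetical helper)."""
    # Imported lazily so that the module can be inspected without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Assumption: sentences are fed to the model unchanged.
    inputs = tokenizer(list(sentences), return_tensors="pt", padding=True, truncation=True)
    output_ids = model.generate(**inputs, **generation_kwargs())
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `simplify(["Op bepaalde tijdstippen verlang ik naar de smaakvolle culinaire creaties welke door de ambachtelijke expertise van mijn grootmoeder zijn vervaardigd."])` downloads the model on first use and returns a list with one simplified sentence.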

	The dataset was generated automatically (cf. the
	[dataset description](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification)) and has not been
	manually verified. In addition, this model is a fine-tuned model, and we did not scrutinize the parent model or its
	training data. The output of the current model may therefore be unexpected (as with most, if not all, neural
	networks).

	Because the dataset was generated with ChatGPT, this model cannot be used for commercial purposes.

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0006370158604635734
	- train_batch_size: 20
	- optimizer: Adafactor
	- num_epochs: 37

	These hyperparameters were found through Bayesian hyperparameter search with `wandb`. This is described in the
	[repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep).
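For readers unfamiliar with Bayesian sweeps in `wandb`, the following is a rough sketch of what such a sweep configuration can look like. The search ranges, the optimized metric name `eval/sari`, and the helper names `build_sweep_config` and `launch_sweep` are hypothetical; the actual sweep settings are documented in the repository, not here. Only the values in `FINAL_HYPERPARAMETERS` come from this card.

```python
from typing import Any, Callable, Dict

# Final values reported in this model card.
FINAL_HYPERPARAMETERS: Dict[str, Any] = {
    "learning_rate": 0.0006370158604635734,
    "train_batch_size": 20,
    "optimizer": "Adafactor",
    "num_epochs": 37,
}


def build_sweep_config(metric_name: str = "eval/sari") -> Dict[str, Any]:
    """Build a Bayesian sweep configuration (all ranges are illustrative assumptions)."""
    return {
        "method": "bayes",  # Bayesian search, as mentioned in this card
        "metric": {"name": metric_name, "goal": "maximize"},  # assumed target metric
        "parameters": {
            "learning_rate": {
                "distribution": "log_uniform_values",
                "min": 1e-4,
                "max": 1e-3,
            },
            "per_device_train_batch_size": {"values": [8, 16, 20, 32]},
            "num_train_epochs": {"min": 10, "max": 40},
        },
    }


def launch_sweep(train_fn: Callable[[], None], project: str, count: int = 20) -> str:
    """Register the sweep and run `count` trials with a user-supplied training function."""
    import wandb  # imported lazily; requires a configured wandb account

    sweep_id = wandb.sweep(build_sweep_config(), project=project)
    wandb.agent(sweep_id, function=train_fn, project=project, count=count)
    return sweep_id
```

In this sketch, `train_fn` would read the sampled values from `wandb.config`, train the model, and log the target metric under the same name that the sweep configuration optimizes.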

	### Training results

	`eval` results are on the evaluation set; `predict` results are on the test set. Both were obtained with
	beam search (`num_beams=3`).

	```json
	{
	  "eval_gen_len": 21.555555555555557,
	  "eval_loss": 3.2290523052215576,
	  "eval_rouge1": 40.9663,
	  "eval_rouge2": 18.499,
	  "eval_rougeL": 34.9342,
	  "eval_rougeLsum": 34.9752,
	  "eval_sari": 52.4509,

	  "predict_gen_len": 21.796875,
	  "predict_loss": 3.063812494277954,
	  "predict_rouge1": 39.6138,
	  "predict_rouge2": 17.1242,
	  "predict_rougeL": 35.4629,
	  "predict_rougeLsum": 35.3679,
	  "predict_sari": 51.7538
	}
	```
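To compare the two splits side by side, a small sketch like the one below can compute the gap between the evaluation and test scores. The function name `eval_test_gap` is hypothetical; the numbers are copied from the results above.

```python
from typing import Dict, Sequence

# Scores copied from the results reported in this card.
RESULTS: Dict[str, float] = {
    "eval_rouge1": 40.9663,
    "eval_rouge2": 18.499,
    "eval_rougeL": 34.9342,
    "eval_rougeLsum": 34.9752,
    "eval_sari": 52.4509,
    "predict_rouge1": 39.6138,
    "predict_rouge2": 17.1242,
    "predict_rougeL": 35.4629,
    "predict_rougeLsum": 35.3679,
    "predict_sari": 51.7538,
}

METRICS: Sequence[str] = ("rouge1", "rouge2", "rougeL", "rougeLsum", "sari")


def eval_test_gap(results: Dict[str, float], metrics: Sequence[str] = METRICS) -> Dict[str, float]:
    """Return eval minus test score per metric; positive means eval scored higher."""
    return {m: round(results[f"eval_{m}"] - results[f"predict_{m}"], 4) for m in metrics}


if __name__ == "__main__":
    for metric, gap in eval_test_gap(RESULTS).items():
        print(f"{metric}: {gap:+.4f}")
```

A positive gap means the evaluation score is higher than the test score for that metric, and a negative gap means the opposite.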


	### Framework versions

	- Transformers 4.29.2
	- Pytorch 2.0.1+cu117
	- Datasets 2.12.0
	- Tokenizers 0.13.3