---
license: cc-by-nc-sa-4.0
tags:
- generated_from_trainer
- simplification
task_categories:
- text2text-generation
task_ids:
- text-simplification
language:
- nl
datasets:
- BramVanroy/chatgpt-dutch-simplification
metrics:
- rouge
- sari
model-index:
- name: BramVanroy/ul2-small-dutch-simplification-mai-2023
results:
- task:
type: text-simplification
name: Text Simplification
dataset:
type: BramVanroy/chatgpt-dutch-simplification
name: ChatGPT Dutch Simplification
metrics:
- type: rouge
value: 40.9663
name: Eval Rouge-1
- type: rouge
value: 18.499
name: Eval Rouge-2
- type: rouge
value: 34.9342
name: Eval RougeL
- type: rouge
value: 34.9752
name: Eval RougeLsum
- type: sari
value: 52.4509
name: Eval SARI
- type: rouge
value: 39.6138
name: Test Rouge-1
- type: rouge
value: 17.1242
name: Test Rouge-2
- type: rouge
value: 35.4629
name: Test RougeL
- type: rouge
value: 35.3679
name: Test RougeLsum
- type: sari
value: 51.7538
name: Test SARI
widget:
- example_title: "Cooking"
text: "Op bepaalde tijdstippen verlang ik naar de smaakvolle culinaire creaties welke door de ambachtelijke expertise van mijn grootmoeder zijn vervaardigd."
---
# ul2-small-dutch-simplification-mai-2023
This model is intended to simplify Dutch sentences. It is a fine-tuned version of
[yhavinga/ul2-small-dutch](https://huggingface.co/yhavinga/ul2-small-dutch) on
the [BramVanroy/chatgpt-dutch-simplification](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification)
dataset.

The model was created as part of the master's thesis of Charlotte Van de Velde in the Master of Science in Artificial
Intelligence (MAI) at KU Leuven in 2023, under the supervision of Vincent Vandeghinste and Bram Vanroy.
The dataset was created by Charlotte and the model was trained by Bram.
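You can try the model with the `transformers` pipeline API. The snippet below is a minimal sketch: it assumes that raw
Dutch sentences can be fed in without a task prefix, and it reuses the beam search setting (`num_beams=3`) reported
under "Training results" below.

```python
from transformers import pipeline

# UL2 is a T5-style model, so the generic text2text-generation pipeline applies.
simplifier = pipeline(
    "text2text-generation",
    model="BramVanroy/ul2-small-dutch-simplification-mai-2023",
)

text = (
    "Op bepaalde tijdstippen verlang ik naar de smaakvolle culinaire creaties "
    "welke door de ambachtelijke expertise van mijn grootmoeder zijn vervaardigd."
)

# num_beams=3 mirrors the decoding setup used for the reported eval/test scores.
result = simplifier(text, num_beams=3, max_length=128)
print(result[0]["generated_text"])
```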
## Quick links
- [Repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep): includes training code and model creation log
- [Dataset](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification): `BramVanroy/chatgpt-dutch-simplification`
- [Parent model](https://huggingface.co/yhavinga/ul2-small-dutch): this model was finetuned on `yhavinga/ul2-small-dutch`
- [Demo](https://huggingface.co/spaces/BramVanroy/mai-simplification-nl-2023-demo): shows the "base" model in action (do not rely on the "Hosted inference API" widget on this page, as it does not work very well)
## Intended uses & limitations, and dataset
The model is intended for sentence-level simplification of Dutch. It may extend to document-level simplification,
but most of the dataset consists of individual sentences, so document-level performance is not guaranteed. A naive
sentence-by-sentence workaround is sketched below.
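If you want to apply the model to longer texts anyway, one option is to split them into sentences and simplify each
sentence separately. The sketch below does exactly that with a simplistic regex splitter; it is an illustration, not
a validated document-level pipeline, and a proper sentence splitter (e.g. spaCy) would be more robust.

```python
import re

from transformers import pipeline

simplifier = pipeline(
    "text2text-generation",
    model="BramVanroy/ul2-small-dutch-simplification-mai-2023",
)

def simplify_document(document: str) -> str:
    # Naive split on end-of-sentence punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    simplified = [
        simplifier(sentence, num_beams=3, max_length=128)[0]["generated_text"]
        for sentence in sentences
    ]
    return " ".join(simplified)
```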
The dataset was generated automatically (cf. the
[dataset description](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification)) and has not been
manually verified. On top of that, this model was fine-tuned without scrutinizing the parent model or its training
data. The output of the current model may therefore contain unexpected results (as is the case for most, if not all,
neural networks).
Because the dataset was generated with ChatGPT, this model cannot be used for commercial purposes.
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0006370158604635734
- train_batch_size: 20
- optimizer: Adafactor
- num_epochs: 37
These hyperparameters were found through a Bayesian hyperparameter search with `wandb`, as described in the
[repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep).
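As a rough indication, these values would map onto `Seq2SeqTrainingArguments` as in the sketch below. The actual
training script lives in the repository; everything here other than the four listed values is an assumption.

```python
from transformers import Seq2SeqTrainingArguments

# Only the four listed hyperparameters are taken from this card; the other
# arguments are illustrative defaults, not the exact settings that were used.
training_args = Seq2SeqTrainingArguments(
    output_dir="ul2-small-dutch-simplification-mai-2023",
    learning_rate=0.0006370158604635734,
    per_device_train_batch_size=20,  # assuming train_batch_size means per device
    optim="adafactor",
    num_train_epochs=37,
    predict_with_generate=True,
)
```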
### Training results
`eval` results are on the evaluation set; `predict` results are on the test set. Both were obtained with
beam search (`num_beams=3`).
```json
{
"eval_gen_len": 21.555555555555557,
"eval_loss": 3.2290523052215576,
"eval_rouge1": 40.9663,
"eval_rouge2": 18.499,
"eval_rougeL": 34.9342,
"eval_rougeLsum": 34.9752,
"eval_sari": 52.4509,
"predict_gen_len": 21.796875,
"predict_loss": 3.063812494277954,
"predict_rouge1": 39.6138,
"predict_rouge2": 17.1242,
"predict_rougeL": 35.4629,
"predict_rougeLsum": 35.3679,
"predict_sari": 51.7538
}
```
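The ROUGE and SARI scores above can be computed with the `evaluate` library. The snippet below is a generic sketch
with made-up placeholder sentences, not the exact evaluation code that produced these numbers.

```python
import evaluate

rouge = evaluate.load("rouge")
sari = evaluate.load("sari")

# Illustrative placeholders: in practice these come from the dataset and the model.
sources = ["Op bepaalde tijdstippen verlang ik naar de creaties van mijn grootmoeder."]
predictions = ["Soms verlang ik naar het eten van mijn oma."]
references = [["Soms wil ik het lekkere eten van mijn oma."]]

print(rouge.compute(predictions=predictions, references=references))
# Unlike ROUGE, SARI also needs the source sentences: it scores the edits
# made relative to the input as well as the overlap with the references.
print(sari.compute(sources=sources, predictions=predictions, references=references))
```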
### Framework versions
- Transformers 4.29.2
- Pytorch 2.0.1+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3