yhavinga committed on
Commit 5f02f6d
1 Parent(s): 844464d

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -14,9 +14,17 @@ license: apache-2.0
 # t5-v1.1-base-dutch-uncased
 
 A [T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) sequence to sequence model
-pre-trained from scratch on [cleaned Dutch 🇳🇱🇧🇪 mC4 ](https://huggingface.co/datasets/yhavinga/mc4_nl_cleaned).
+pre-trained from scratch on [cleaned Dutch 🇳🇱🇧🇪 mC4](https://huggingface.co/datasets/yhavinga/mc4_nl_cleaned).
 
 
+This **t5-v1.1** model has **247M** parameters.
+It was pre-trained on the dataset
+`mc4_nl_cleaned` config `full` for **2** epoch(s) and a duration of **5d5h**,
+with a sequence length of **1024**, batch size **64** and **1014525** total steps.
+Pre-training evaluation loss and accuracy are **1,20** and **0,73**.
+After fine-tuning on 25K samples of Dutch CNN summarization, the Rouge1 score is **33.8**
+(note: this evaluation model was not saved).
+
 * Pre-trained T5 models need to be finetuned before they can be used for downstream tasks, therefore the inference widget on the right has been turned off.
 * For a demo of the Dutch CNN summarization models, head over to the Hugging Face Spaces for
   the **[Netherformer 📰](https://huggingface.co/spaces/flax-community/netherformer)** example application!
@@ -30,14 +38,6 @@ and configs, though it must be noted that this model (t5-v1.1-base-dutch-uncased
 ![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)
 
 
-This **t5-v1.1** model has **247M** parameters.
-It was pre-trained on the dataset
-`mc4_nl_cleaned` config `full` for **2** epoch(s) and a duration of **5d5h**,
-with a sequence length of **1024**, batch size **64** and **1014525** total steps.
-Pre-training evaluation loss and accuracy are **1,20** and **0,73**.
-After fine-tuning on 25K samples of Dutch CNN summarization, the Rouge1 score is **33.8**
-(note: this evaluation model was not saved).
-
 ## Tokenizer
 
 The model uses an uncased SentencePiece tokenizer configured with the `Nmt, NFKC, Replace multi-space to single-space, Lowercase` normalizers
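
The effect of that normalizer chain can be approximated with the Python standard library. This is an illustrative sketch only, not the tokenizer's actual implementation (the real chain runs inside the SentencePiece/`tokenizers` pipeline, and the `Nmt` control-character cleanup step is omitted here for brevity):

```python
import re
import unicodedata

def normalize_uncased(text: str) -> str:
    """Approximate the README's normalizer chain:
    NFKC unicode normalization, replace runs of two or more
    spaces with a single space, then lowercase.
    (The Nmt normalizer, which strips control characters,
    is omitted in this sketch.)"""
    text = unicodedata.normalize("NFKC", text)  # NFKC
    text = re.sub(r" {2,}", " ", text)          # multi-space -> single space
    return text.lower()                         # Lowercase

print(normalize_uncased("Dit  is  een ZIN."))  # -> dit is een zin.
```

Because the chain ends with a `Lowercase` normalizer, cased and lowercased input map to the same token sequence, which is why the checkpoint is published as "uncased".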