Philip May committed
Commit 1cac1d8
1 Parent(s): 34a1402

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -26,12 +26,12 @@ The training was conducted with the following hyperparameters:
 
 - base model: [google/mt5-small](https://huggingface.co/google/mt5-small)
 - source_prefix: `"summarize: "`
-- batch size: xxx
+- batch size: 3 (6)
 - max_source_length: 800
 - max_target_length: 96
 - warmup_ratio: 0.3
-- number of train epochs: xxx
-- gradient accumulation steps: xxx
+- number of train epochs: 10
+- gradient accumulation steps: 2
 
 ## Datasets and Preprocessing
 
@@ -49,7 +49,7 @@ This model is trained on the following dataset:
 
 | Model | rouge1 | rouge2 | rougeL | rougeLsum
 |-------|--------|--------|--------|----------
-| deutsche-telekom/mt5-small-sum-de-mit-v1 (this) | xxx | xxx | xxx | xxx
+| deutsche-telekom/mt5-small-sum-de-mit-v1 (this) | 16.8023 | 3.5531 | 12.6884 | 14.7624
 | [ml6team/mt5-small-german-finetune-mlsum](https://huggingface.co/ml6team/mt5-small-german-finetune-mlsum) | 18.3607 | 5.3604 | 14.5456 | 16.1946
 | **[deutsche-telekom/mt5-small-sum-de-en-01](https://huggingface.co/deutsche-telekom/mt5-small-sum-de-en-v1)** | **21.7336** | **7.2614** | **17.1323** | **19.3977**
 
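
The hyperparameters filled in by this commit can be related with a small sketch. It assumes that `batch size: 3 (6)` means a per-device batch size of 3 and an effective batch size of 6 once the 2 gradient accumulation steps are applied; the commit does not say so explicitly. The class name, field names, and the single-device default are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainHyperparams:
    """Values copied from the updated README; names are illustrative only."""

    per_device_batch_size: int = 3
    gradient_accumulation_steps: int = 2
    num_train_epochs: int = 10
    warmup_ratio: float = 0.3
    max_source_length: int = 800
    max_target_length: int = 96
    num_devices: int = 1  # assumption: the commit does not state the device count

    def effective_batch_size(self) -> int:
        # Gradients from several small batches are summed before each
        # optimizer step, so the step sees this many examples in total.
        return (
            self.per_device_batch_size
            * self.gradient_accumulation_steps
            * self.num_devices
        )


if __name__ == "__main__":
    params = TrainHyperparams()
    print(params.effective_batch_size())
```

Under these assumptions the product is 3 × 2 × 1 = 6, which matches the value given in parentheses in the README.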