PhilipMay committed on
Commit
2fb4681
1 Parent(s): 485d439

initial model files

Files changed (7)
  1. LICENSE +21 -0
  2. README.md +60 -0
  3. config.json +28 -0
  4. pytorch_model.bin +3 -0
  5. special_tokens_map.json +1 -0
  6. spiece.model +3 -0
  7. tokenizer_config.json +1 -0
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2021 Philip May, Deutsche Telekom AG
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md ADDED
@@ -0,0 +1,60 @@
+ ---
+ language:
+ - de
+
+ license: mit
+
+ tags:
+ - summarization
+
+ datasets:
+ - swiss_text_2019
+
+ ---
+
+ # mT5-small-sum-de-mit-v1
+
+ This is a German summarization model. It is based on the multilingual T5 model [google/mt5-small](https://huggingface.co/google/mt5-small). The special characteristic of this model is that, unlike many other models, it is licensed under a permissive open source license (MIT). Among other things, this license allows commercial use.
+
+ [![One Conversation](https://raw.githubusercontent.com/telekom/HPOflow/main/docs/source/imgs/1c-logo.png)](https://www.welove.ai/)
+ This model is provided by the [One Conversation](https://www.welove.ai/)
+ team of [Deutsche Telekom AG](https://www.telekom.com/).
+
+ ## Training
+
+ The training was conducted with the following hyperparameters:
+
+ - base model: [google/mt5-small](https://huggingface.co/google/mt5-small)
+ - source_prefix: `"summarize: "`
+ - batch size: xxx
+ - max_source_length: 800
+ - max_target_length: 96
+ - warmup_ratio: 0.3
+ - number of train epochs: xxx
+ - gradient accumulation steps: xxx
+
+ ## Datasets and Preprocessing
+
+ The datasets were preprocessed as follows:
+
+ The summary was tokenized with the [google/mt5-small](https://huggingface.co/google/mt5-small) tokenizer. Then only the records with no more than 94 tokens were selected.
+
+ This model is trained on the following dataset:
+
+ | Name | Language | Size | License
+ |------|----------|------|--------
+ | [SwissText 2019 - Train](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html) | de | 84,564 | Concrete license is unclear. The data was published in the [German Text Summarization Challenge](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html). xxx
+
+ ## Evaluation on MLSUM German Test Set (no beams)
+
+ | Model | rouge1 | rouge2 | rougeL | rougeLsum
+ |-------|--------|--------|--------|----------
+ | deutsche-telekom/mt5-small-sum-de-mit-v1 (this) | xxx | xxx | xxx | xxx
+ | [ml6team/mt5-small-german-finetune-mlsum](https://huggingface.co/ml6team/mt5-small-german-finetune-mlsum) | 18.3607 | 5.3604 | 14.5456 | 16.1946
+ | **[deutsche-telekom/mt5-small-sum-de-en-01](https://huggingface.co/deutsche-telekom/mt5-small-sum-de-en-v1)** | **21.7336** | **7.2614** | **17.1323** | **19.3977**
+
+ ## License
+
+ Copyright (c) 2021 Philip May, Deutsche Telekom AG
+
+ Licensed under the MIT License (the "License"); you may not use this work except in compliance with the License. You may obtain a copy of the License by reviewing the file [LICENSE](https://huggingface.co/deutsche-telekom/mt5-small-sum-de-mit-v1/blob/main/LICENSE) in the repository.
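
The preprocessing described in the README above (prefix the source with `"summarize: "`, tokenize each summary, keep only records whose summary has no more than 94 tokens) could be sketched roughly as follows. This is a minimal sketch and not the author's script: the record keys `text` and `summary`, the function names, and the pluggable `tokenize` callable are assumptions made for illustration. In practice `tokenize` would presumably be the `tokenize` method of the google/mt5-small tokenizer loaded via `transformers.AutoTokenizer.from_pretrained`; a whitespace splitter stands in here so the example runs without downloading anything.

```python
from typing import Callable, Dict, Iterable, List

SOURCE_PREFIX = "summarize: "   # source_prefix from the README
MAX_SUMMARY_TOKENS = 94         # filter threshold from the README


def add_prefix(text: str) -> str:
    """Prepend the task prefix to a source text."""
    return SOURCE_PREFIX + text


def filter_by_summary_length(
    records: Iterable[Dict[str, str]],
    tokenize: Callable[[str], List[str]],
    max_tokens: int = MAX_SUMMARY_TOKENS,
) -> List[Dict[str, str]]:
    """Keep only records whose summary has at most `max_tokens` tokens."""
    return [r for r in records if len(tokenize(r["summary"])) <= max_tokens]


def whitespace_tokenize(s: str) -> List[str]:
    """Stand-in tokenizer (assumption): whitespace split, not SentencePiece."""
    return s.split()


# Hypothetical records; the key names are illustrative only.
records = [
    {"text": "Ein langer Artikel.", "summary": "Kurze Zusammenfassung."},
    {"text": "Noch ein Artikel.", "summary": "wort " * 95},  # 95 tokens
]
kept = filter_by_summary_length(records, whitespace_tokenize)
sources = [add_prefix(r["text"]) for r in kept]
```

Passing the tokenizer in as a callable keeps the filter independent of any particular library, so the same function works with the real SentencePiece tokenizer or a test double.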
config.json ADDED
@@ -0,0 +1,28 @@
+ {
+   "_name_or_path": "google/mt5-small",
+   "architectures": [
+     "MT5ForConditionalGeneration"
+   ],
+   "d_ff": 1024,
+   "d_kv": 64,
+   "d_model": 512,
+   "decoder_start_token_id": 0,
+   "dropout_rate": 0.1,
+   "eos_token_id": 1,
+   "feed_forward_proj": "gated-gelu",
+   "initializer_factor": 1.0,
+   "is_encoder_decoder": true,
+   "layer_norm_epsilon": 1e-06,
+   "model_type": "mt5",
+   "num_decoder_layers": 8,
+   "num_heads": 6,
+   "num_layers": 8,
+   "pad_token_id": 0,
+   "relative_attention_num_buckets": 32,
+   "tie_word_embeddings": false,
+   "tokenizer_class": "T5Tokenizer",
+   "torch_dtype": "float32",
+   "transformers_version": "4.9.1",
+   "use_cache": true,
+   "vocab_size": 250100
+ }
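
As a sanity check on the configuration above, the parameter count can be estimated from the config values alone. The sketch below is an estimate under assumptions that are not stated in this commit: it assumes the usual T5 v1.1-style layout (bias-free linear layers, scale-only layer norms, a gated feed-forward block with three weight matrices, and one relative-attention bias table each for the encoder and the decoder). The function name `estimate_parameters` is hypothetical.

```python
import json

# Values copied from the config.json shown in this commit.
config = json.loads("""
{
  "d_ff": 1024, "d_kv": 64, "d_model": 512,
  "num_heads": 6, "num_layers": 8, "num_decoder_layers": 8,
  "relative_attention_num_buckets": 32,
  "tie_word_embeddings": false, "vocab_size": 250100
}
""")


def estimate_parameters(cfg: dict) -> int:
    """Estimate the parameter count under the layout assumed in the lead-in."""
    d_model, d_ff = cfg["d_model"], cfg["d_ff"]
    inner = cfg["num_heads"] * cfg["d_kv"]   # attention inner dimension
    attention = 4 * d_model * inner          # q, k, v, o projections
    feed_forward = 3 * d_model * d_ff        # wi_0, wi_1, wo (gated)
    norm = d_model                           # scale-only layer norm
    rel_bias = cfg["relative_attention_num_buckets"] * cfg["num_heads"]

    # Encoder layer: self-attention + feed-forward, each with a layer norm.
    encoder_layer = (attention + norm) + (feed_forward + norm)
    # Decoder layer: self-attention, cross-attention, feed-forward.
    decoder_layer = 2 * (attention + norm) + (feed_forward + norm)

    # Each stack adds one relative-bias table and one final layer norm.
    encoder = cfg["num_layers"] * encoder_layer + rel_bias + norm
    decoder = cfg["num_decoder_layers"] * decoder_layer + rel_bias + norm

    embedding = cfg["vocab_size"] * d_model
    # Untied embeddings: the output projection has its own weight matrix.
    lm_head = 0 if cfg["tie_word_embeddings"] else embedding
    return embedding + lm_head + encoder + decoder


n_params = estimate_parameters(config)
n_bytes = n_params * 4  # torch_dtype is float32, 4 bytes per parameter
print(n_params, n_bytes)
```

Under these assumptions the estimate comes to roughly 300 million parameters, or about 1.2 GB in float32, which is in line with the `size 1200721733` recorded in the Git LFS pointer for pytorch_model.bin. Most of the weight sits in the two 250100 x 512 vocabulary matrices, since `tie_word_embeddings` is false.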
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:616ca42b48507a819df3399dcc5aff07f80e76c8a982f6cb3ab7b28208276130
+ size 1200721733
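
The three lines above are a Git LFS pointer, not the weights themselves: the repository stores only the spec version, the SHA-256 object ID, and the size in bytes of the real file. A minimal sketch of reading such a pointer is shown below; the function name `parse_lfs_pointer` and the returned dictionary keys are assumptions chosen for illustration, and the parser handles only the simple `key value` line format seen here.

```python
from typing import Dict, Union

# Pointer content copied from the pytorch_model.bin diff in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:616ca42b48507a819df3399dcc5aff07f80e76c8a982f6cb3ab7b28208276130
size 1200721733
"""


def parse_lfs_pointer(text: str) -> Dict[str, Union[str, int]]:
    """Split each 'key value' line and decode the oid and size fields."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER)
size_gb = pointer["size"] / 1e9  # decimal gigabytes
```

A downloaded file could then be checked against `pointer["digest"]` with `hashlib.sha256` and against `pointer["size"]` with `os.path.getsize`.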
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"eos_token": "</s>", "unk_token": "<unk>", "pad_token": "<pad>"}
spiece.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ef78f86560d809067d12bac6c09f19a462cb3af3f54d2b8acbba26e1433125d6
+ size 4309802
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"eos_token": "</s>", "unk_token": "<unk>", "pad_token": "<pad>", "extra_ids": 0, "additional_special_tokens": null, "special_tokens_map_file": "/home/phmay/.cache/huggingface/transformers/685ac0ca8568ec593a48b61b0a3c272beee9bc194a3c7241d15dcadb5f875e53.f76030f3ec1b96a8199b2593390c610e76ca8028ef3d24680000619ffb646276", "name_or_path": "google/mt5-small", "sp_model_kwargs": {}, "tokenizer_class": "T5Tokenizer"}