renatk committed on
Commit
8d7044f
0 Parent(s):

sharing...

.gitattributes ADDED
@@ -0,0 +1,34 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
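The patterns above decide which files are stored through Git LFS instead of as regular Git blobs. The following is a minimal sketch of that decision for the slash-free patterns only, assuming that shell-style matching on the file name is a close enough approximation of Git's own attribute matching. The helper name `is_lfs_tracked` and the pattern subset are illustrative and not part of this commit.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the slash-free patterns from the .gitattributes above.
LFS_PATTERNS = ["*.bin", "*.model", "*.pt", "*.safetensors", "*.tar.*", "*tfevents*"]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Return True if the file name matches any of the given patterns.

    Assumption: for patterns without a slash, matching the base name with
    shell-style wildcards approximates Git's attribute matching.
    """
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for path in ["pytorch_model.bin", "spiece.model", "config.json", "README.md"]:
        kind = "LFS" if is_lfs_tracked(path) else "regular Git"
        print(f"{path}: {kind}")
```

Under this approximation `pytorch_model.bin` and `spiece.model` are LFS-tracked, which agrees with the LFS pointer contents shown for both files further down in this commit.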
.gitignore ADDED
@@ -0,0 +1,3 @@
+ optimizer.pt
+ scheduler.pt
+ training_args.bin
README.md ADDED
@@ -0,0 +1,91 @@
+ ---
+ language:
+ - en
+ - az
+ tags:
+ - machine-translation
+ - mt5
+ - english
+ - azerbaijani
+ license: cc-by-nc-sa-4.0
+ widget:
+ - text: >-
+     Artificial intelligence is already superior to human learning in numerous
+     domains.
+ - text: Learn as if you will live forever, live like you will die tomorrow.
+ - text: When you change your thoughts, remember to also change your world.
+ pipeline_tag: translation
+ inference:
+   parameters:
+     max_length: 128
+     num_return_sequences: 1
+     do_sample: false
+ datasets:
+ - learningmachineaz/translate_enaz_10m
+ ---
+
+ # Machine Translation (Maşın tərcüməsi)
+
+ This is the most advanced and accurate mT5-based machine translation model available for the Azerbaijani language.\
+ The model was trained on 10 million sentences extracted from various text sources of the Azerbaijan National Library.\
+ Translation quality is very close to that of Google Translate, because Google Translate was used to produce the English translations.
+
+ ## Text above translated using this model
+ ```
+ Bu, Azərbaycan dilinə olduğu kimi, maşın tərcüməsi üçün ən qabaqcıl və dəqiq mT5 əsaslı modeldir.
+ Model Azərbaycan Milli Kitabxanasının müxtəlif mətn mənbələrindən çıxarılan 10 milyon cümlə üzrə təlim keçib.
+ Tərcümə keyfiyyəti ingilis dilinə tərcümələr üçün istifadə olunduğundan Google Tərcümə ilə çox yaxındır.
+ ```
+
+ ## Training
+
+ | Key point         | Info     |
+ |-------------------|----------|
+ | Base model        | mT5-base |
+ | Batch size        | 16       |
+ | Epochs            | 10       |
+ | Steps             | 620k     |
+ | Training Loss     | 0.56     |
+ | Eval Loss         | 0.53     |
+ | Training Duration | 2 days   |
+
+
+ ## Here is an example of how you can run inference:
+
+ ```python
+ from transformers import MT5Tokenizer, MT5ForConditionalGeneration
+
+ model_name = 'learningmachineaz/mt5-enaz-10m'
+ max_length = 128
+
+ tokenizer = MT5Tokenizer.from_pretrained(model_name)
+ model = MT5ForConditionalGeneration.from_pretrained(model_name)
+
+ text = "Artificial intelligence is already superior to human learning in numerous domains."
+ input_ids = tokenizer(f'translate English to Azerbaijani: {text}', return_tensors="pt").input_ids
+
+ # OPTION 1 - SINGLE TRANSLATION (deterministic decoding, no sampling)
+ outputs = model.generate(input_ids, max_length=max_length, do_sample=False, num_return_sequences=1)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+
+ # OPTION 2 - MULTIPLE VARIATIONS (top-k sampling, so results differ between runs)
+ outputs = model.generate(input_ids, max_length=max_length, do_sample=True, top_k=10, num_return_sequences=3)
+ for output in outputs:
+     print(tokenizer.decode(output, skip_special_tokens=True))
+ ```
+
+ OPTION 1 - OUTPUT:
+ ```
+ Süni intellekt artıq çoxsaylı domenlərdə insanın öyrənilməsindən üstünlük təşkil edir.
+ ```
+
+ OPTION 2 - OUTPUT:
+ ```
+ Artıq çoxsaylı domenlərdə süni zəka insanın öyrənilməsindən daha üstün olması şərti ilə müşahidə edilir.
+ Süni intellekt artıq çoxsaylı oblastlarda insanın təlimindən yüksəkdir.
+ Süni intellekt artıq çoxsaylı domenlərdə insan öyrənməsindən daha üstün gəlir.
+ ```
+
+ ## Author
+
+ Trained and evaluated by [Renat Kalimulin](https://www.linkedin.com/in/rinat-kalimulin-16853358/)
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "mt5-enaz-10m/checkpoint-620000",
+   "architectures": [
+     "MT5ForConditionalGeneration"
+   ],
+   "d_ff": 2048,
+   "d_kv": 64,
+   "d_model": 768,
+   "decoder_start_token_id": 0,
+   "dense_act_fn": "gelu_new",
+   "dropout_rate": 0.1,
+   "eos_token_id": 1,
+   "feed_forward_proj": "gated-gelu",
+   "initializer_factor": 1.0,
+   "is_encoder_decoder": true,
+   "is_gated_act": true,
+   "layer_norm_epsilon": 1e-06,
+   "model_type": "mt5",
+   "num_decoder_layers": 12,
+   "num_heads": 12,
+   "num_layers": 12,
+   "output_past": true,
+   "pad_token_id": 0,
+   "relative_attention_max_distance": 128,
+   "relative_attention_num_buckets": 32,
+   "tie_word_embeddings": false,
+   "tokenizer_class": "T5Tokenizer",
+   "torch_dtype": "float32",
+   "transformers_version": "4.24.0",
+   "use_cache": true,
+   "vocab_size": 250112
+ }
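The dimensions in this config determine the size of the model. As a rough sketch, the parameter count can be estimated from these fields alone. The layout assumed here is the usual T5 v1.1 / mT5 one: bias-free attention and feed-forward projections, scale-only layer norms, one relative-position bias table per stack, and a separate output projection because `tie_word_embeddings` is false. The function name `estimate_parameters` is illustrative.

```python
# Fields copied from the config.json above.
config = {
    "d_ff": 2048,
    "d_kv": 64,
    "d_model": 768,
    "num_heads": 12,
    "num_layers": 12,
    "num_decoder_layers": 12,
    "relative_attention_num_buckets": 32,
    "tie_word_embeddings": False,
    "vocab_size": 250112,
}


def estimate_parameters(cfg: dict) -> int:
    """Estimate the parameter count, assuming the standard mT5 layer layout."""
    d_model, d_ff = cfg["d_model"], cfg["d_ff"]
    inner = cfg["num_heads"] * cfg["d_kv"]
    attention = 4 * d_model * inner      # q, k, v and output projections, no biases
    feed_forward = 3 * d_model * d_ff    # gated-gelu uses two input projections and one output
    rel_bias = cfg["relative_attention_num_buckets"] * cfg["num_heads"]

    # Encoder layer: self-attention + feed-forward + 2 layer norms.
    encoder = cfg["num_layers"] * (attention + feed_forward + 2 * d_model)
    # Decoder layer: self-attention + cross-attention + feed-forward + 3 layer norms.
    decoder = cfg["num_decoder_layers"] * (2 * attention + feed_forward + 3 * d_model)
    # Each stack adds one relative-position bias table and a final layer norm.
    stacks = encoder + decoder + 2 * (rel_bias + d_model)

    embeddings = cfg["vocab_size"] * d_model
    if not cfg["tie_word_embeddings"]:
        embeddings *= 2  # separate input embedding and output projection
    return embeddings + stacks


total = estimate_parameters(config)
print(f"{total:,} parameters, about {total * 4 / 1e9:.2f} GB in float32")
```

With these values the estimate comes to 582,401,280 parameters, or roughly 2.33 GB at 4 bytes each, which is close to the 2,329,700,173-byte `pytorch_model.bin` added in this commit. About two thirds of that total sits in the two 250112 x 768 vocabulary matrices.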
model_args.json ADDED
@@ -0,0 +1 @@
+ {"adafactor_beta1": null, "adafactor_clip_threshold": 1.0, "adafactor_decay_rate": -0.8, "adafactor_eps": [1e-30, 0.001], "adafactor_relative_step": false, "adafactor_scale_parameter": false, "adafactor_warmup_init": false, "adam_betas": [0.9, 0.999], "adam_epsilon": 1e-08, "best_model_dir": "outputs/best_model", "cache_dir": "cache_dir/", "config": {}, "cosine_schedule_num_cycles": 0.5, "custom_layer_parameters": [], "custom_parameter_groups": [], "dataloader_num_workers": 0, "do_lower_case": false, "dynamic_quantize": false, "early_stopping_consider_epochs": false, "early_stopping_delta": 0, "early_stopping_metric": "eval_loss", "early_stopping_metric_minimize": true, "early_stopping_patience": 3, "encoding": null, "eval_batch_size": 16, "evaluate_during_training": true, "evaluate_during_training_silent": true, "evaluate_during_training_steps": 10000, "evaluate_during_training_verbose": true, "evaluate_each_epoch": true, "fp16": false, "gradient_accumulation_steps": 1, "learning_rate": 0.001, "local_rank": -1, "logging_steps": 50, "loss_type": null, "loss_args": {}, "manual_seed": null, "max_grad_norm": 1.0, "max_seq_length": 128, "model_name": "mt5-enaz-10m/checkpoint-150000", "model_type": "mt5", "multiprocessing_chunksize": -1, "n_gpu": 1, "no_cache": false, "no_save": false, "not_saved_args": [], "num_train_epochs": 10, "optimizer": "Adafactor", "output_dir": "mt5-enaz-10m/", "overwrite_output_dir": true, "polynomial_decay_schedule_lr_end": 1e-07, "polynomial_decay_schedule_power": 1.0, "process_count": 30, "quantized_model": false, "reprocess_input_data": false, "save_best_model": true, "save_eval_checkpoints": false, "save_model_every_epoch": false, "save_optimizer_and_scheduler": true, "save_steps": 10000, "scheduler": "constant_schedule_with_warmup", "silent": false, "skip_special_tokens": true, "tensorboard_dir": null, "thread_count": null, "tokenizer_name": null, "tokenizer_type": null, "train_batch_size": 16, "train_custom_parameters_only": false, 
"use_cached_eval_features": false, "use_early_stopping": true, "use_hf_datasets": false, "use_multiprocessing": false, "use_multiprocessing_for_evaluation": true, "wandb_kwargs": {}, "wandb_project": "mt5-enaz-10m", "warmup_ratio": 0.06, "warmup_steps": 372189, "weight_decay": 0.0, "model_class": "T5Model", "dataset_class": null, "do_sample": false, "early_stopping": true, "evaluate_generated_text": true, "length_penalty": 2.0, "max_length": 128, "max_steps": -1, "num_beams": 1, "num_return_sequences": 1, "preprocess_inputs": true, "repetition_penalty": 1.0, "special_tokens_list": [], "top_k": null, "top_p": null, "use_multiprocessed_decoding": true}
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:77b6bc923f6087b669eefb44cfb38386df1da9a6baf906a5c0718b31cdcf2141
+ size 2329700173
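The three lines above are a Git LFS pointer: the repository stores this small text file, while the actual weights live in LFS storage and are identified by the SHA-256 digest. Here is a minimal sketch of reading such a pointer, assuming the simple `key value` line format shown above. The helper name `parse_lfs_pointer` is illustrative.

```python
# Pointer text copied from pytorch_model.bin above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:77b6bc923f6087b669eefb44cfb38386df1da9a6baf906a5c0718b31cdcf2141
size 2329700173
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line and separate the hash algorithm from the digest."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
size_gib = pointer["size_bytes"] / 2**30
# Upper bound on the number of float32 values, ignoring serialization overhead.
max_float32_values = pointer["size_bytes"] // 4
print(f"{pointer['hash_algorithm']} digest {pointer['digest'][:12]}..., {size_gib:.2f} GiB")
print(f"at most {max_float32_values:,} float32 values")
```

The digest can be compared against a SHA-256 hash of the downloaded file to confirm that the download is complete.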
special_tokens_map.json ADDED
@@ -0,0 +1,5 @@
+ {
+   "eos_token": "</s>",
+   "pad_token": "<pad>",
+   "unk_token": "<unk>"
+ }
spiece.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ef78f86560d809067d12bac6c09f19a462cb3af3f54d2b8acbba26e1433125d6
+ size 4309802
tokenizer_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "additional_special_tokens": null,
+   "eos_token": "</s>",
+   "extra_ids": 0,
+   "name_or_path": "mt5-enaz-10m/checkpoint-150000",
+   "pad_token": "<pad>",
+   "sp_model_kwargs": {},
+   "special_tokens_map_file": "/home/patrick/.cache/torch/transformers/685ac0ca8568ec593a48b61b0a3c272beee9bc194a3c7241d15dcadb5f875e53.f76030f3ec1b96a8199b2593390c610e76ca8028ef3d24680000619ffb646276",
+   "tokenizer_class": "T5Tokenizer",
+   "tokenizer_file": null,
+   "truncate": true,
+   "unk_token": "<unk>"
+ }
training_progress_scores.csv ADDED
@@ -0,0 +1,49 @@
+ global_step,eval_loss,train_loss
+ 160000,0.6764746327248831,1.1640015840530396
+ 170000,0.6744973361492157,0.8029671907424927
+ 180000,0.6619752555612534,0.7460575103759766
+ 190000,0.6592541152522677,0.9126784205436707
+ 200000,0.6558309112276349,0.6042007207870483
+ 210000,0.6488694297888923,0.8268142342567444
+ 220000,0.6416870729317741,0.9400655627250671
+ 230000,0.6461827348149012,0.3534563481807709
+ 240000,0.641677608092626,0.6845427751541138
+ 250000,0.6348039995110224,0.8932097554206848
+ 260000,0.6311305945827848,0.7806641459465027
+ 270000,0.6312308789245666,0.901055634021759
+ 280000,0.6296208884034838,0.5771832466125488
+ 290000,0.6248124970330132,0.7746346592903137
+ 300000,0.6263038397781433,0.5960983037948608
+ 310000,0.6215958831802247,0.6160240769386292
+ 320000,0.6185214041717468,0.7403829097747803
+ 330000,0.6207759465490069,0.5579656362533569
+ 340000,0.6106786254852538,1.3372493982315063
+ 350000,0.6066207242390466,0.4821087121963501
+ 360000,0.6072869395452832,0.6442691087722778
+ 370000,0.6044867421899524,0.5447962880134583
+ 380000,0.6028476896740141,0.5464233756065369
+ 390000,0.5947137607468499,0.7430246472358704
+ 400000,0.5918849138986497,0.5517004132270813
+ 410000,0.5858956182759906,0.6384704113006592
+ 420000,0.5878932698378487,0.7062010765075684
+ 430000,0.5743407857796502,0.7804544568061829
+ 440000,0.5774480019296918,0.5412857532501221
+ 450000,0.5740079591198574,0.4822920262813568
+ 460000,0.5729498740226503,0.5131783485412598
+ 470000,0.5671722590923309,0.6843248009681702
+ 480000,0.5661228710696811,0.5737451910972595
+ 490000,0.5619805639698392,0.7755026817321777
+ 500000,0.5618907236863696,0.5030483603477478
+ 510000,0.5603532176169138,0.5195838809013367
+ 520000,0.5604016530135322,0.5096762180328369
+ 530000,0.5546500663908701,0.5507926940917969
+ 540000,0.5552068097250802,0.6131696105003357
+ 550000,0.54720481804439,0.415729820728302
+ 560000,0.543909371845306,0.5612346529960632
+ 570000,0.5432968522821154,0.5992139577865601
+ 580000,0.5377468693824041,0.5478363633155823
+ 590000,0.5370081928041246,0.6351823210716248
+ 600000,0.5390658156266288,0.6527299880981445
+ 610000,0.5390964912043678,0.5404349565505981
+ 620000,0.5318419569068484,0.640168309211731
+ 620314,0.5290475577589066,0.8783591389656067
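The log above records one evaluation every 10,000 steps plus a final row. The sketch below picks the row with the lowest `eval_loss` from a few rows copied out of the log, then relates the final step count to the `train_batch_size`, `num_train_epochs` and `warmup_ratio` values from model_args.json in this commit. The warmup formula is an assumption about how the training library derived `warmup_steps`, and is not confirmed by the commit itself.

```python
import csv
import io
import math

# A few rows copied from training_progress_scores.csv above (not the full log).
LOG = """\
global_step,eval_loss,train_loss
160000,0.6764746327248831,1.1640015840530396
300000,0.6263038397781433,0.5960983037948608
450000,0.5740079591198574,0.4822920262813568
620000,0.5318419569068484,0.640168309211731
620314,0.5290475577589066,0.8783591389656067
"""

rows = [
    {
        "global_step": int(row["global_step"]),
        "eval_loss": float(row["eval_loss"]),
        "train_loss": float(row["train_loss"]),
    }
    for row in csv.DictReader(io.StringIO(LOG))
]

# eval_loss is averaged over the evaluation set; train_loss is the much
# noisier loss of the latest batch, so eval_loss is used for selection.
best = min(rows, key=lambda row: row["eval_loss"])
print(f"best eval_loss {best['eval_loss']:.4f} at step {best['global_step']}")

# Values from model_args.json in this commit.
train_batch_size = 16
num_train_epochs = 10
warmup_ratio = 0.06

final_step = rows[-1]["global_step"]
examples_seen = final_step * train_batch_size

# Assumption: warmup_steps = ceil(steps_per_epoch * num_train_epochs * warmup_ratio),
# with the final logged step taken as the number of steps in one epoch.
implied_warmup = math.ceil(final_step * num_train_epochs * warmup_ratio)
print(f"examples seen: {examples_seen:,}, implied warmup steps: {implied_warmup:,}")
```

Under that assumption the computed warmup equals the `warmup_steps` value of 372189 stored in model_args.json, and 620,314 steps at batch size 16 cover 9,925,024 examples, which is consistent with 620,314 being the number of steps in one pass over the roughly 10 million training sentences.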