Ali Safaya committed
Commit 8a25033
1 Parent(s): 297437f

turkish bart summarization uncased

Browse files:
- README.md +96 -3
- config.json +62 -0
- pytorch_model.bin +3 -0
- special_tokens_map.json +1 -0
- tokenizer.json +0 -0
- tokenizer_config.json +1 -0
- vocab.txt +0 -0
README.md
CHANGED
@@ -1,3 +1,96 @@
----
-
-
+---
+tags:
+- generated_from_trainer
+datasets:
+- mlsum
+metrics:
+- rouge
+model-index:
+- name: eval-bart-turkish
+  results:
+  - task:
+      name: Summarization
+      type: summarization
+    dataset:
+      name: mlsum tu
+      type: mlsum
+      args: tu
+    metrics:
+    - name: Rouge1
+      type: rouge
+      value: 43.2049
+---
+
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# eval-bart-turkish
+
+This model is a fine-tuned version of [bart-turkish](https://huggingface.co/bart-turkish) on the mlsum tu dataset.
+It achieves the following results on the evaluation set:
+- Loss: 3.4903
+- Rouge1: 43.2049
+- Rouge2: 30.7082
+- Rougel: 38.1981
+- Rougelsum: 39.9453
+- Gen Len: 34.5978
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 8
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 64
+- total_eval_batch_size: 64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 15.0
+- mixed_precision_training: Native AMP
+- label_smoothing_factor: 0.1
+
+### Training results
+
+| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
+|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
+| 4.4304        | 1.0   | 3895  | 4.3749          | 33.2844 | 22.8262 | 29.9423 | 30.7953   | 19.7732 |
+| 3.65          | 2.0   | 7790  | 3.7414          | 33.8392 | 23.517  | 30.4871 | 31.3309   | 19.9031 |
+| 3.397         | 3.0   | 11685 | 3.5651          | 34.2335 | 23.9113 | 30.9237 | 31.7434   | 19.894  |
+| 3.2202        | 4.0   | 15580 | 3.5054          | 34.2535 | 23.9595 | 30.9811 | 31.7961   | 19.9212 |
+| 3.0827        | 5.0   | 19475 | 3.4547          | 34.5545 | 24.1991 | 31.2609 | 32.085    | 19.9195 |
+| 2.9801        | 6.0   | 23370 | 3.4328          | 34.6721 | 24.2537 | 31.372  | 32.1777   | 19.9331 |
+| 2.8689        | 7.0   | 27265 | 3.4377          | 34.6764 | 24.3314 | 31.4376 | 32.1981   | 19.9278 |
+| 2.7813        | 8.0   | 31160 | 3.4407          | 34.746  | 24.345  | 31.4511 | 32.2708   | 19.9468 |
+| 2.6848        | 9.0   | 35055 | 3.4539          | 34.7376 | 24.3224 | 31.4784 | 32.2817   | 19.9096 |
+| 2.5974        | 10.0  | 38950 | 3.4683          | 34.9174 | 24.4716 | 31.5641 | 32.4039   | 19.9384 |
+| 2.5228        | 11.0  | 42845 | 3.4903          | 34.9845 | 24.4972 | 31.6585 | 32.4753   | 19.93   |
+| 2.4633        | 12.0  | 46740 | 3.5105          | 34.8496 | 24.3559 | 31.5256 | 32.3635   | 19.9275 |
+| 2.4022        | 13.0  | 50635 | 3.5234          | 34.9109 | 24.4008 | 31.5449 | 32.4021   | 19.9374 |
+| 2.3605        | 14.0  | 54530 | 3.5306          | 34.9545 | 24.4365 | 31.6208 | 32.4711   | 19.9366 |
+| 2.3216        | 15.0  | 58425 | 3.5379          | 34.9079 | 24.4077 | 31.5734 | 32.4287   | 19.9365 |
+
+
+### Framework versions
+
+- Transformers 4.11.3
+- Pytorch 1.8.2+cu111
+- Datasets 1.14.0
+- Tokenizers 0.10.3
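
The hyperparameters listed in the card map onto `transformers`' `Seq2SeqTrainingArguments` in a mostly one-to-one way; the sketch below reconstructs them under stated assumptions. It is not the author's actual training script, and the `output_dir` is a guess. Note that the effective train batch size follows from 4 per device × 8 GPUs × 2 gradient-accumulation steps = 64, matching `total_train_batch_size`.

```python
# A minimal sketch (not the author's actual script) of how the hyperparameters
# above map onto Seq2SeqTrainingArguments in transformers 4.11.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="eval-bart-turkish",  # assumed; the real output dir is not shown
    learning_rate=1e-4,
    per_device_train_batch_size=4,   # 4 x 8 GPUs x 2 accumulation steps = 64 total
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,
    num_train_epochs=15,
    lr_scheduler_type="linear",
    label_smoothing_factor=0.1,
    fp16=True,                       # "Native AMP" mixed precision
    seed=42,
    predict_with_generate=True,      # needed to compute ROUGE at evaluation time
)
```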
config.json
ADDED
@@ -0,0 +1,62 @@
+{
+  "_name_or_path": "bart-base",
+  "activation_dropout": 0.1,
+  "activation_function": "gelu",
+  "add_bias_logits": false,
+  "add_final_layer_norm": false,
+  "architectures": [
+    "BartForConditionalGeneration"
+  ],
+  "attention_dropout": 0.1,
+  "bos_token_id": 0,
+  "classif_dropout": 0.1,
+  "classifier_dropout": 0.0,
+  "d_model": 768,
+  "decoder_attention_heads": 12,
+  "decoder_ffn_dim": 3072,
+  "decoder_layerdrop": 0.0,
+  "decoder_layers": 6,
+  "decoder_start_token_id": 2,
+  "dropout": 0.1,
+  "early_stopping": true,
+  "encoder_attention_heads": 12,
+  "encoder_ffn_dim": 3072,
+  "encoder_layerdrop": 0.0,
+  "encoder_layers": 6,
+  "eos_token_id": 2,
+  "forced_eos_token_id": 2,
+  "gradient_checkpointing": false,
+  "id2label": {
+    "0": "LABEL_0",
+    "1": "LABEL_1",
+    "2": "LABEL_2"
+  },
+  "init_std": 0.02,
+  "is_encoder_decoder": true,
+  "label2id": {
+    "LABEL_0": 0,
+    "LABEL_1": 1,
+    "LABEL_2": 2
+  },
+  "max_position_embeddings": 1024,
+  "model_type": "bart",
+  "no_repeat_ngram_size": 3,
+  "normalize_before": false,
+  "normalize_embedding": true,
+  "num_beams": 4,
+  "num_hidden_layers": 6,
+  "pad_token_id": 1,
+  "scale_embedding": false,
+  "task_specific_params": {
+    "summarization": {
+      "length_penalty": 1.0,
+      "max_length": 128,
+      "min_length": 12,
+      "num_beams": 4
+    }
+  },
+  "torch_dtype": "float32",
+  "transformers_version": "4.11.3",
+  "use_cache": true,
+  "vocab_size": 32000
+}
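
The `task_specific_params.summarization` block above is what the summarization pipeline applies by default. Spelled out as an explicit `generate()` call, it would look roughly like this sketch; the checkpoint path is a placeholder (the Hub repo id is not visible in this diff), and the input is lowercased on the assumption that "uncased" in the commit message means the model expects lowercase text.

```python
# Sketch of what the task_specific_params above amount to at inference time.
# "path/to/checkpoint" is a placeholder; this diff does not show the repo id.
from transformers import BartForConditionalGeneration, BartTokenizer

model = BartForConditionalGeneration.from_pretrained("path/to/checkpoint")
tokenizer = BartTokenizer.from_pretrained("path/to/checkpoint")

text = "buraya özetlenecek türkçe haber metni gelir."  # lowercased placeholder input
inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = model.generate(
    inputs["input_ids"],
    num_beams=4,             # task_specific_params.summarization
    length_penalty=1.0,
    max_length=128,
    min_length=12,
    no_repeat_ngram_size=3,  # from the top-level config
    early_stopping=True,     # from the top-level config
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```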
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7cf1f1910ea0edc1b1d1c24684524e164efa9094f29ed2d84ec15b9340a5ff7a
+size 501802579
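
The three `+` lines above are a Git LFS pointer, not the weights themselves; only the SHA-256 and the size (about 502 MB) live in git. A minimal sketch of resolving the pointer to the real file with `huggingface_hub`, with the repo id as a placeholder:

```python
# Sketch: fetch the actual ~502 MB weights that the LFS pointer above refers to.
# "<user>/<repo>" is a placeholder; this diff does not show the Hub repo id.
from huggingface_hub import hf_hub_download

weights_path = hf_hub_download(repo_id="<user>/<repo>", filename="pytorch_model.bin")
print(weights_path)  # local cache path of the downloaded weights
```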
special_tokens_map.json
ADDED
@@ -0,0 +1 @@
+{"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": false}}
tokenizer.json
ADDED
The diff for this file is too large to render; see the raw diff.
tokenizer_config.json
ADDED
@@ -0,0 +1 @@
+{"unk_token": "<unk>", "bos_token": "<s>", "eos_token": "</s>", "add_prefix_space": false, "errors": "replace", "sep_token": "</s>", "cls_token": "<s>", "pad_token": "<pad>", "mask_token": "<mask>", "model_max_length": 1024, "special_tokens_map_file": null, "name_or_path": "bart-turkish", "tokenizer_class": "BartTokenizer"}
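
Together with `special_tokens_map.json`, `tokenizer.json`, and `vocab.txt`, this config is everything `AutoTokenizer` needs. A quick sketch of loading it and checking the values above, again with a placeholder path:

```python
# Sketch: load the tokenizer from these files and confirm the settings above.
# "path/to/checkpoint" is a placeholder for a local clone of this repo.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/checkpoint")
print(type(tokenizer).__name__)      # a BART tokenizer, per "tokenizer_class"
print(tokenizer.model_max_length)    # 1024, from tokenizer_config.json
print(tokenizer.special_tokens_map)  # <s>, </s>, <unk>, <pad>, <mask>, ...
```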
vocab.txt
ADDED
The diff for this file is too large to render; see the raw diff.
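
Taken together, the seven files in this commit form a complete `transformers` checkpoint, so the model can also be driven through the high-level pipeline API. A final minimal sketch, under the same placeholder-path assumption:

```python
# End-to-end sketch; "path/to/checkpoint" stands in for a local clone of this
# repository (or its Hub repo id, which this diff does not show).
from transformers import pipeline

summarizer = pipeline("summarization", model="path/to/checkpoint")
article = "buraya özetlenecek türkçe haber metni gelir."  # placeholder input
print(summarizer(article)[0]["summary_text"])
```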