RichardErkhov/mukayese_-_transformer-turkish-summarization-4bits

RichardErkhov committed on May 9

Commit ed0f024 • 1 Parent(s): 96c9a48

uploaded readme

Files changed (1)

README.md +97 -0

README.md ADDED

	@@ -0,0 +1,97 @@

+Quantization made by Richard Erkhov.
+[Github](https://github.com/RichardErkhov)
+[Discord](https://discord.gg/pvy7H8DZMG)
+[Request more models](https://github.com/RichardErkhov/quant_request)
+transformer-turkish-summarization - bnb 4bits
+- Model creator: https://huggingface.co/mukayese/
+- Original model: https://huggingface.co/mukayese/transformer-turkish-summarization/
+Original model description:
+---
+datasets:
+- mlsum
+metrics:
+- rouge
+model-index:
+- name: mukayese/transformer-turkish-summarization
+  results:
+  - task:
+      name: Summarization
+      type: summarization
+    dataset:
+      name: mlsum tu
+      type: mlsum
+      args: tu
+    metrics:
+    - name: Rouge1
+      type: rouge
+      value: 43.2049
+license: mit
+language:
+- tr
+pipeline_tag: summarization
+---
+# [Mukayese: Turkish NLP Strikes Back](https://arxiv.org/abs/2203.01215)
+## Summarization: mukayese/transformer-turkish-summarization
+_This model is uncased_. It was initialized from scratch and trained only on the mlsum/tu dataset, with no pre-training.
+It achieves the following results on the evaluation set:
+- Rouge1: 43.2049
+- Rouge2: 30.7082
+- RougeL: 38.1981
+- RougeLsum: 39.9453
+Check [this](https://arxiv.org/abs/2203.01215) paper for more details on the model and the dataset.
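
For readers unfamiliar with the metric, the following is a minimal sketch of how a ROUGE-1 F1 score can be computed from unigram overlap. It is an illustration only, not the evaluation script used for the numbers above: the function name `rouge1_f1`, the whitespace tokenization, and the example sentences are all assumptions made for this sketch. It returns a value between 0 and 1, whereas the scores in the card appear to be on a 0 to 100 scale.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary."""
    # Lowercase and split on whitespace; a real scorer would use a proper tokenizer.
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a unigram counts only as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example sentences, not taken from mlsum.
score = rouge1_f1("hava bugün çok güzel", "bugün hava güzel")
print(f"ROUGE-1 F1: {score:.4f}")
```

Production evaluations typically rely on an established ROUGE implementation, which also handles stemming and the longest-common-subsequence variants (RougeL, RougeLsum).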
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 8
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 64
+- total_eval_batch_size: 64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 15.0
+- mixed_precision_training: Native AMP
+- label_smoothing_factor: 0.1
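
The total batch sizes listed above follow from the per-device values. As a quick sanity check, and assuming the usual convention that the effective training batch is the per-device batch times the number of devices times the gradient accumulation steps, the arithmetic can be sketched as follows (the variable names mirror the hyperparameter names in the list):

```python
# Values copied from the hyperparameter list above.
train_batch_size = 4             # per-device training batch
eval_batch_size = 8              # per-device evaluation batch
num_devices = 8
gradient_accumulation_steps = 2

# Gradients are accumulated over several steps before each optimizer update,
# so accumulation multiplies the effective training batch size.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps

# Evaluation does not accumulate gradients, so only the device count matters.
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)
```

Both results match the `total_train_batch_size` and `total_eval_batch_size` entries reported in the list.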
+### Framework versions
+- Transformers 4.11.3
+- PyTorch 1.8.2+cu111
+- Datasets 1.14.0
+- Tokenizers 0.10.3
+### Citation
+```
+@misc{safaya-etal-2022-mukayese,
+    title={Mukayese: Turkish NLP Strikes Back},
+    author={Ali Safaya and Emirhan Kurtuluş and Arda Göktoğan and Deniz Yuret},
+    year={2022},
+    eprint={2203.01215},
+    archivePrefix={arXiv},
+    primaryClass={cs.CL}
+}
+```