---
license: mit
datasets:
- uonlp/CulturaX
language:
- tr
pipeline_tag: text-generation

tags:
- Turkish
- turkish
- gpt2
---

# turkish-gpt2-medium

This is a GPT-2 medium model trained on Turkish text. GPT-2 is designed for text generation: given a text snippet, it continues the text in a coherent and contextually relevant manner.
Because the training data is diverse, drawing on websites, books, and other text sources, the model can exhibit biases. Users should be aware of these biases and use the model responsibly.

## Example Usage
```python
from transformers import AutoTokenizer, GPT2LMHeadModel, pipeline

# Load the pretrained model and its tokenizer from the Hugging Face Hub
model = GPT2LMHeadModel.from_pretrained("ytu-ce-cosmos/turkish-gpt2-medium")
tokenizer = AutoTokenizer.from_pretrained("ytu-ce-cosmos/turkish-gpt2-medium")

# Build a text-generation pipeline and continue a Turkish prompt
text_generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
r = text_generator("Teknolojinin gelişimi hayatımızı önemli ölçüde etkiledi. ", max_length=100)
print(r)
# Example output (your output may differ):
# [{'generated_text': 'Teknolojinin gelişimi hayatımızı önemli ölçüde etkiledi. "Teknoloji hayatın merkezindeyse, insan hayatında da önemli bir yere sahip demektir!" diyerek devam edelim.'}]
```
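
By default, the `generated_text` field contains the prompt followed by the continuation. The sketch below is one possible way to sample several continuations and keep only the newly generated part. The helper names `extract_continuations` and `generate_samples` are hypothetical, and the sampling values (`max_new_tokens=50`, `top_k=50`, `top_p=0.95`) are illustrative assumptions, not settings recommended by the model authors.

```python
from typing import Dict, List


def extract_continuations(prompt: str, outputs: List[Dict[str, str]]) -> List[str]:
    """Return only the newly generated text from text-generation outputs.

    `outputs` is expected to look like the pipeline result:
    a list of dicts, each with a "generated_text" key.
    """
    continuations = []
    for item in outputs:
        text = item["generated_text"]
        # Drop the echoed prompt when the output starts with it.
        if text.startswith(prompt):
            text = text[len(prompt):]
        continuations.append(text.strip())
    return continuations


def generate_samples(prompt: str, num_samples: int = 3) -> List[str]:
    """Sample several continuations for `prompt` (hypothetical helper)."""
    # Imported lazily so extract_continuations works without transformers.
    from transformers import pipeline

    generator = pipeline(
        "text-generation", model="ytu-ce-cosmos/turkish-gpt2-medium"
    )
    outputs = generator(
        prompt,
        max_new_tokens=50,   # assumed value: length of the continuation
        do_sample=True,      # sample instead of greedy decoding
        top_k=50,            # assumed value
        top_p=0.95,          # assumed value
        num_return_sequences=num_samples,
    )
    return extract_continuations(prompt, outputs)
```

Calling `generate_samples("Teknolojinin gelişimi hayatımızı önemli ölçüde etkiledi. ")` downloads the model on first use and returns a list of continuations. Passing `return_full_text=False` to the pipeline call is an alternative that avoids the manual prefix stripping.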

More details about the model are available in the [paper](https://arxiv.org/abs/2404.17336).

# Acknowledgments
- Research supported with Cloud TPUs from [Google's TensorFlow Research Cloud](https://sites.research.google/trc/about/) (TFRC). Thanks for providing access to the TFRC ❤️
- Thanks to the generous support from the Hugging Face team, it is possible to download models from their S3 storage 🤗

# Citation
```bibtex
@article{kesgin2024introducing,
  title={Introducing cosmosGPT: Monolingual Training for Turkish Language Models},
  author={Kesgin, H Toprak and Yuce, M Kaan and Dogan, Eren and Uzun, M Egemen and Uz, Atahan and Seyrek, H Emre and Zeer, Ahmed and Amasyali, M Fatih},
  journal={arXiv preprint arXiv:2404.17336},
  year={2024}
}
```

### Contact 
COSMOS AI Research Group, Yildiz Technical University Computer Engineering Department   <br>
https://cosmos.yildiz.edu.tr/ <br>
cosmos@yildiz.edu.tr <br>