Turkish GPT-2 Model (Experimental)

I've made available a GPT-2 model for Turkish that I trained on a variety of texts.

The model is intended to serve as a starting point for fine-tuning on specific texts.

Training Source

I used a Turkish corpus drawn from a variety of written and spoken sources.

Using the training corpus, I built a custom 50k-token vocabulary with the Tokenizers library.
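The vocabulary step can be sketched roughly as follows. This is a minimal illustration rather than the exact script used for this model: it assumes the Hugging Face `tokenizers` package and a byte-level BPE tokenizer (the kind the original GPT-2 uses), neither of which is confirmed above. The two-sentence `corpus` list and the `<|endoftext|>` special token are hypothetical placeholders.

```python
from tokenizers import ByteLevelBPETokenizer

# Hypothetical stand-in for the real training corpus, which is not published here.
corpus = [
    "Akşamüstü yolda ilerlerken yağmur başladı.",
    "Türkçe metinler için bir dil modeli eğitiyoruz.",
]

# Assumption: a byte-level BPE tokenizer, as in the original GPT-2.
tokenizer = ByteLevelBPETokenizer()
tokenizer.train_from_iterator(
    corpus,
    vocab_size=50_000,                 # upper bound on vocabulary size (the 50k above)
    min_frequency=2,                   # only merge pairs seen at least twice
    special_tokens=["<|endoftext|>"],  # assumed end-of-text marker
    show_progress=False,
)

encoding = tokenizer.encode("Akşamüstü yolda ilerlerken")
print(encoding.tokens)
print(tokenizer.get_vocab_size())
```

On a corpus this small the learned vocabulary stays far below 50k entries; `vocab_size` only sets the ceiling, and reaching it requires a much larger corpus.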

After building the vocabulary, I trained GPT-2 for Turkish on the entire training corpus for ten epochs.

Using the model

The model itself can be used in this way:

from transformers import AutoTokenizer, AutoModelForCausalLM
# AutoModelForCausalLM replaces the deprecated AutoModelWithLMHead for GPT-2 style models
tokenizer = AutoTokenizer.from_pretrained("ahmet1338/gpt-2-experimental")
model = AutoModelForCausalLM.from_pretrained("ahmet1338/gpt-2-experimental")

To generate text, we can use the Transformers pipeline API:

from transformers import pipeline
# Build a text-generation pipeline from the published checkpoint
pipe = pipeline('text-generation', model="ahmet1338/gpt-2-experimental",
                 tokenizer="ahmet1338/gpt-2-experimental")
# max_length caps the total length (prompt plus generated tokens)
text = pipe("Akşamüstü yolda ilerlerken, ", max_length=800)[0]["generated_text"]
print(text)
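As a lower-level alternative to the pipeline, text can also be generated by calling `generate` on the model directly, which gives finer control over sampling. The helper below is a minimal sketch: `generate_text` is a hypothetical name, and the sampling values (`top_k=50`, `top_p=0.95`, 50 new tokens) are illustrative defaults, not settings recommended for this model.

```python
import torch


def generate_text(model, tokenizer, prompt, max_new_tokens=50,
                  top_k=50, top_p=0.95, seed=None):
    """Sample a continuation of `prompt` from a causal language model."""
    if seed is not None:
        torch.manual_seed(seed)  # make the sampling repeatable
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            do_sample=True,                 # sample instead of greedy decoding
            top_k=top_k,                    # keep only the k most likely tokens
            top_p=top_p,                    # nucleus sampling threshold
            max_new_tokens=max_new_tokens,  # tokens generated beyond the prompt
            pad_token_id=tokenizer.eos_token_id,
        )
    # output_ids holds the prompt followed by the generated continuation
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the `model` and `tokenizer` objects loaded via `from_pretrained` as shown earlier, a call such as `generate_text(model, tokenizer, "Akşamüstü yolda ilerlerken, ", seed=0)` returns the prompt followed by a sampled continuation.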

How to clone the model repo?

git lfs install
git clone https://huggingface.co/ahmet1338/gpt-2-experimental