Azerbaijani GPT-2 Model
The model is based on the GPT-2 architecture and trained specifically on Azerbaijani text. It is one of the first foundational models designed to generate and understand Azerbaijani-language content. Built on the autoregressive transformer decoder architecture, the model generates text token by token, predicting the next token from the input context.
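As a concrete illustration of that loop, the sketch below performs a single greedy next-token prediction. It assumes the transformers library and the Hub id allmalab/gpt2-aze; the prompt is an arbitrary example, not from this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allmalab/gpt2-aze")
model = AutoModelForCausalLM.from_pretrained("allmalab/gpt2-aze")

# Encode an arbitrary Azerbaijani prompt ("Baku is Azerbaijan's ...").
inputs = tokenizer("Bakı Azərbaycanın", return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# Greedily pick the most likely next token at the last position.
next_token_id = int(logits[0, -1].argmax())
print(tokenizer.decode(next_token_id))
```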
- Developed by: aLLMA Lab
- Funded by: PRODATA LLC
- Model type: Decoder-only foundational LLM
- Language: Azerbaijani
Uses
The model can be used directly for text generation, sentence completion, and next-token prediction by providing an input prompt, as in the sketch below. It can also be fine-tuned on an Azerbaijani instruction dataset to build an interactive question-answering model.
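A minimal generation sketch using the transformers pipeline API; the prompt and sampling settings are illustrative choices, not recommendations from the authors:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="allmalab/gpt2-aze")

# "Azərbaycan dili" ("The Azerbaijani language") is just a sample prompt.
output = generator("Azərbaycan dili", max_new_tokens=50, do_sample=True, top_p=0.95)
print(output[0]["generated_text"])
```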
Training Details
context_window = 1024
stride = 512
lr = 1e-3
warmup_steps = 10000
weight_decay = 0.1
adam_beta1 = 0.9
adam_beta2 = 0.999
batch_size = 512
max_steps = 178000
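These values map naturally onto Hugging Face TrainingArguments, and the context_window/stride pair suggests the corpus was split into overlapping 1024-token chunks. The sketch below is one interpretation under those assumptions; the actual training script is not part of this card, so output_dir and the per-device batch split are placeholders.

```python
from transformers import AutoTokenizer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("allmalab/gpt2-aze")

# Assumed preprocessing: overlapping 1024-token windows with a 512-token stride.
chunks = tokenizer(
    "uzun Azərbaycan mətni ...",  # placeholder for a long corpus document
    max_length=1024,
    stride=512,
    truncation=True,
    return_overflowing_tokens=True,
)

# The listed optimizer and schedule settings expressed as TrainingArguments.
args = TrainingArguments(
    output_dir="gpt2-aze",            # placeholder path
    learning_rate=1e-3,               # lr
    warmup_steps=10_000,
    weight_decay=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    per_device_train_batch_size=512,  # card lists batch_size=512; device split unknown
    max_steps=178_000,
)
```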
Base model: openai-community/gpt2