---
language: ti
license: mit
library_name: transformers
tags:
- tigrinya
- gpt2
- text-generation
metrics:
- perplexity
- loss
pipeline_tag: text-generation
---

# Model Card for GPT-2 Tigrinya Medium

## Model Summary

This is a GPT-2 model trained from scratch on 20.6 million tokens of Tigrinya text, drawn primarily from news sources.

#### Model Description

- Model type: GPT-2
- Language: Tigrinya (ትግርኛ)
- Finetuned from model: None (trained from scratch)

#### Model Architecture

- Parameters: 51.9M
- Context Window: 128 tokens
- Vocabulary Size: 52,000
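
These figures can be read directly off the released checkpoint; a minimal sketch, assuming the checkpoint exposes the standard GPT-2 config fields:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Inspect the published config: vocabulary size and context window.
config = AutoConfig.from_pretrained("luel/gpt2-tigrinya-medium")
print(config.vocab_size)   # expected: 52000
print(config.n_positions)  # expected: 128 (context window)

# Count parameters; expected to be roughly 51.9M.
model = AutoModelForCausalLM.from_pretrained("luel/gpt2-tigrinya-medium")
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")
```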

#### Training Details

- Training regime: fp16 mixed precision
- Number of Epochs: 12
- Batch Size: 6 (with gradient accumulation steps of 8; effective batch size 48)
- Learning Rate: 5e-4
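
The original training script is not published in this card; as a rough orientation, the hyperparameters above map onto `transformers` `TrainingArguments` along these lines (a sketch under that assumption, not the authors' actual configuration):

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="gpt2-tigrinya-medium",  # hypothetical output path
    num_train_epochs=12,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=8,      # effective batch size: 6 * 8 = 48
    learning_rate=5e-4,
    fp16=True,                          # fp16 mixed precision
)
```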

#### Evaluation

- Training Perplexity: 28.6
- Training Loss: 3.12
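
For reference, perplexity is the exponential of the cross-entropy loss; the sketch below shows how such a figure can be computed for this model on an arbitrary text sample (illustrative only; the numbers above come from the authors' training run):

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("luel/gpt2-tigrinya-medium")
model = AutoModelForCausalLM.from_pretrained("luel/gpt2-tigrinya-medium")
model.eval()

# Cross-entropy loss on a sample, then perplexity = exp(loss).
sample = "ክልል ትግራይ"  # any Tigrinya text
inputs = tokenizer(sample, return_tensors="pt")
with torch.no_grad():
    loss = model(**inputs, labels=inputs["input_ids"]).loss
print(f"perplexity: {math.exp(loss.item()):.1f}")
```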

#### Usage

```python
from transformers import pipeline

# Load the model.
generator = pipeline('text-generation', model='luel/gpt2-tigrinya-medium')

prompt = "ክልል ትግራይ"

# Generate text.
text = generator(prompt, max_length=100)[0]['generated_text']
print(text)
```
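
The pipeline forwards standard sampling arguments to `generate`, so output diversity can be tuned as follows, reusing `generator` and `prompt` from the snippet above (values are illustrative, not tuned for this model):

```python
# Sample instead of greedy decoding; values are illustrative.
text = generator(
    prompt,
    max_length=100,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=0.8,
)[0]['generated_text']
print(text)
```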

#### Limitations

- Limited context window of 128 tokens (see the truncation sketch below).
- Best suited for medium-length Tigrinya text generation.
- Outputs should be reviewed for accuracy.
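
Given the 128-token context window, long prompts should be truncated before generation. One way to do this, as a sketch (the helper name and the reserve of 28 tokens are arbitrary choices, not part of this card):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("luel/gpt2-tigrinya-medium")

def truncate_prompt(text: str, reserve: int = 28) -> str:
    # Keep the prompt within the 128-token context window,
    # reserving space for newly generated tokens.
    max_prompt_tokens = 128 - reserve
    ids = tokenizer(text)["input_ids"][:max_prompt_tokens]
    return tokenizer.decode(ids)
```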