luel committed on
Commit e353eb5 · verified · 1 Parent(s): 85b1579
Files changed (1): README.md (+69, -0)
README.md ADDED
---
language: ti
license: mit
library_name: transformers
tags:
- tigrinya
- gpt2
- text-generation
metrics:
- perplexity
- loss
pipeline_tag: text-generation
model-index:
- name: gpt2-tigrinya-medium
  results:
  - task:
      type: text-generation
      name: Text Generation
    metrics:
    - name: Perplexity
      type: perplexity
      value: 37.35
    - name: Training Loss
      type: loss
      value: 3.03
---

# Model Card for GPT-2 Tigrinya Medium

## Model Summary
This GPT-2 model was trained from scratch on 20 million tokens of Tigrinya text, drawn primarily from news sources. It is designed for generating Tigrinya text with the Hugging Face Transformers library.

#### Model Description
- Model type: GPT-2
- Language: Tigrinya (ትግርኛ)
- Finetuned from model: None (trained from scratch)

#### Model Architecture
- Parameters: 42.6M
- Context Window: 128 tokens
- Vocabulary Size: 52,000

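These figures can be cross-checked against the published checkpoint. The snippet below is a quick sanity check using standard Transformers APIs; it assumes the checkpoint ships a stock `GPT2Config`, where `n_positions` holds the context window.

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Assumes a stock GPT2Config; n_positions is the context window.
config = AutoConfig.from_pretrained("luel/gpt2-tigrinya-medium")
print(config.vocab_size)    # expected: 52000
print(config.n_positions)   # expected: 128

model = AutoModelForCausalLM.from_pretrained("luel/gpt2-tigrinya-medium")
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")  # expected: ~42.6M
```
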
#### Training Details
- Training regime: fp16 mixed precision
- Number of Epochs: 12
- Batch Size: 4 (with gradient accumulation steps of 8)
- Learning Rate: 5e-4

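For reference, here is a minimal sketch of these hyperparameters expressed as Hugging Face `TrainingArguments`. The original training script is not part of this card, so `output_dir` and anything not listed above are illustrative assumptions.

```python
from transformers import TrainingArguments

# Illustrative reconstruction of the hyperparameters listed above; the actual
# training script is not published with this card, and output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="gpt2-tigrinya-medium",
    num_train_epochs=12,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,   # effective batch size of 32 per device
    learning_rate=5e-4,
    fp16=True,                       # fp16 mixed precision
)
```
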
#### Evaluation
- Training Perplexity: 37.35
- Training Loss: 3.03

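Perplexity is conventionally the exponential of the average cross-entropy loss over an evaluation text. The sketch below shows how to compute it on your own Tigrinya text; the sample string is a placeholder, not the corpus behind the numbers above.

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("luel/gpt2-tigrinya-medium")
tokenizer = AutoTokenizer.from_pretrained("luel/gpt2-tigrinya-medium")
model.eval()

# Any Tigrinya text; this placeholder is NOT the data used for the reported metrics.
text = "ክልል ትግራይ"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    loss = model(**inputs, labels=inputs["input_ids"]).loss  # mean cross-entropy
print(f"loss = {loss.item():.2f}, perplexity = {math.exp(loss.item()):.2f}")
```
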
#### Usage

```python
from transformers import pipeline

# Load the model
generator = pipeline('text-generation', model='luel/gpt2-tigrinya-medium')

prompt = "ክልል ትግራይ"
# Generate text
text = generator(prompt, max_length=100)[0]['generated_text']
print(text)
```

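If you need more control over decoding than the pipeline exposes, the tokenizer and model can also be loaded directly, as sketched below. The sampling settings (`do_sample`, `top_k`) are illustrative choices, not values recommended by the model author.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("luel/gpt2-tigrinya-medium")
model = AutoModelForCausalLM.from_pretrained("luel/gpt2-tigrinya-medium")

inputs = tokenizer("ክልል ትግራይ", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=100,   # stays within the 128-token context window
        do_sample=True,   # illustrative sampling settings
        top_k=50,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
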
#### Limitations
- Limited context window of 128 tokens.
- Best suited for medium-length Tigrinya text generation.
- Outputs should be reviewed for accuracy.