---
widget:
- text: አዲስ አበባ
  example_title: Example 1
- text: በኢንግሊዝ ፕሪምየር ሊግ
  example_title: Example 2
- text: ዶናልድ ትራምፕ
  example_title: Example 3
language:
- am
metrics:
- perplexity
library_name: transformers
pipeline_tag: text-generation
base_model:
- meta-llama/Llama-3.2-1B-Instruct
---

# Llama-3.2-Amharic-1B

This model is a version of Meta's [Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) decoder transformer model that was continually pretrained on an Amharic text corpus.

- 16k new Amharic tokens were added to the Llama 3.2 tokenizer, and the embedding layer of the model was resized accordingly.
- The model was then trained on **300 million tokens** of **Amharic** text.
- This is a base model. The Amharic instruction-following version is [Llama-3.2-1B-Amharic-Instruct](https://huggingface.co/rasyosef/Llama-3.2-1B-Amharic-Instruct).
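
The vocabulary-extension step described above can be sketched as follows. This is an assumed workflow, not the authors' exact script, and `sshleifer/tiny-gpt2` is used as a small public stand-in for the gated Llama 3.2 base checkpoint (only two sample tokens are added here, versus the real 16k):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Tiny stand-in checkpoint; substitute the Llama 3.2 base model in practice
model_id = "sshleifer/tiny-gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Add new Amharic tokens to the tokenizer (two placeholders shown;
# the actual model added ~16k)
num_added = tokenizer.add_tokens(["አዲስ", "አበባ"])

# Grow the embedding matrix (and tied output head) to the new vocabulary size
model.resize_token_embeddings(len(tokenizer))
```

After the resize, the new embedding rows are randomly initialized and get trained during the continual-pretraining phase.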

### How to use
First, install the latest version of the `transformers` library:
```shell
pip install -Uq transformers
```

You can use this model directly with a pipeline for text generation:

```python
from transformers import pipeline

llama_am = pipeline(
    "text-generation",
    model="rasyosef/Llama-3.2-1B-Amharic",
    device_map="auto"
)

prompt = "በኢንግሊዝ ፕሪምየር ሊግ"
llama_am(
    prompt,
    max_new_tokens=128,
    temperature=0.3,
    do_sample=True,
    top_k=8,
    top_p=0.8,
    repetition_penalty=1.05
)
```
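
Equivalently, you can load the tokenizer and model yourself and call `generate` directly. The sketch below uses the standard lower-level `transformers` API (not taken from this model card) and substitutes the tiny public `sshleifer/tiny-gpt2` checkpoint so it runs quickly; replace `model_id` with `"rasyosef/Llama-3.2-1B-Amharic"` to get real Amharic output:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in checkpoint for a lightweight demo; use
# "rasyosef/Llama-3.2-1B-Amharic" for actual Amharic generation
model_id = "sshleifer/tiny-gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Tokenize the Amharic prompt and sample a continuation with the same
# decoding parameters as the pipeline example above
inputs = tokenizer("በኢንግሊዝ ፕሪምየር ሊግ", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=16,
    do_sample=True,
    temperature=0.3,
    top_k=8,
    top_p=0.8,
    repetition_penalty=1.05,
)

# Decode prompt + continuation back to text
text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The pipeline wraps exactly this tokenize/generate/decode loop; the lower-level form is useful when you need access to token IDs or custom stopping logic.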

Output:
```python
[{'generated_text': 'በኢንግሊዝ ፕሪምየር ሊግ የ2017/18 የውድድር ዘመን ላይ ተሳታፊ የሆነው ሊቨርፑል ትናንት ምሽት 3 :45 ላይ ከዌስትሀም ዩናይትድ ጋር ባደረገው ጨዋታ በ2 ለ 1 ውጤት ተሸንፏል ።'}]
```

(Rough English translation: "In the English Premier League's 2017/18 season, Liverpool was defeated 2 to 1 in last night's 3:45 match against West Ham United.")