---
license: creativeml-openrail-m
language:
- my
tags:
- Myanmar
- Burmese
- GPT2
- MyanmarGPT
- Natural Language Processing
---
# Myanmar-GPT
A GPT that understands Myanmar (Burmese): Myanmar GPT
Myanmar GPT is a model trained on a private Myanmar language dataset made by MinSiThu.
The project aims to bring Myanmar language support to the GPT-2 model.
Fine-tuning MyanmarGPT makes it easier to build a custom Myanmar language model than starting from other language models.
Reports on training the MyanmarGPT model are visualized at [MyanmarGPT Report](https://api.wandb.ai/links/minsithu/wn8yul90).
There is also a 1.42-billion-parameter MyanmarGPT-Big model with multilingual support.
You can find [MyanmarGPT-Big here](https://huggingface.co/jojo-ai-mst/MyanmarGPT-Big).
## How to use in your project
```
!pip install transformers
```
```python
from transformers import pipeline

# Load MyanmarGPT from the Hugging Face Hub.
generator = pipeline(model="jojo-ai-mst/MyanmarGPT")
# do_sample=False turns off random sampling.
outputs = generator("အီတလီ", do_sample=False)
print(outputs)
# [{'generated_text': 'အီတလီနိုင်ငံသည် ဥရောပတိုက်၏ အမျိုးသားရေးရာ ကိစ္စများကို ရပ်ဖက်အာဏာရှိသော စီချလျက်ရှိနေခဲ့ရာ မှတ်တမ်းများပါဝင်ကြသည်။ ထိုခေတ် အခါက ရောမနိုင်ငံတော်၏ အမွေအနှစ်နေရာများတွင် ဥရောပတိုက်တွင် ဥရောပတိုက်တွင် ဥပဒေစနစ်နှစ်ခု အဖြစ် စေလွှတ်သော ပြဋ္ဌာန်းသတ်ရန် ဥပဒေစနစ်ကို ပြန်လည်ပြုစုခြင်းကို '}]
```
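As the printed output above shows, the pipeline returns a list of dictionaries, each with a `generated_text` key, and the text begins with the prompt. The following is a minimal sketch of unpacking that result and optionally dropping the echoed prompt. The helper name `extract_texts` and its `strip_prompt` flag are hypothetical and not part of `transformers`; the sample data is a stand-in so the sketch runs without downloading the model.

```python
def extract_texts(outputs, prompt="", strip_prompt=False):
    """Return the generated strings from a text-generation pipeline result.

    `outputs` is expected to look like [{'generated_text': '...'}, ...].
    """
    texts = []
    for item in outputs:
        text = item["generated_text"]
        # Remove the echoed prompt from the front of the text when requested.
        if strip_prompt and prompt and text.startswith(prompt):
            text = text[len(prompt):]
        texts.append(text)
    return texts


# Stand-in for a real pipeline result (hypothetical, shortened sample).
sample_outputs = [{"generated_text": "အီတလီနိုင်ငံသည် ဥရောပတိုက်"}]
print(extract_texts(sample_outputs))
print(extract_texts(sample_outputs, prompt="အီတလီ", strip_prompt=True))
```

With the real model, pass the result of `generator("အီတလီ", do_sample=False)` in place of `sample_outputs`.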
### Alternative ways
```python
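import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = GPT2LMHeadModel.from_pretrained("jojo-ai-mst/MyanmarGPT").to(device)
tokenizer = GPT2Tokenizer.from_pretrained("jojo-ai-mst/MyanmarGPT")


def generate_text(prompt, max_length=300, temperature=0.8, top_k=50):
    # Encode the prompt and place it on the same device as the model.
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)
    output = model.generate(
        input_ids,
        max_length=max_length,
        temperature=temperature,
        top_k=top_k,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True,
    )
    # Decode and print each generated sequence.
    for result in output:
        generated_text = tokenizer.decode(result, skip_special_tokens=True)
        print(generated_text)


generate_text("အီတလီ ")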
```
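The `temperature` and `top_k` arguments above control how the next token is sampled. The following is a rough pure-Python sketch of the idea, using made-up toy logits. It is an illustration only, not the code that `generate` runs internally; the function name `top_k_sample` and the numbers are assumptions for the example.

```python
import math
import random


def top_k_sample(logits, temperature=0.8, top_k=50, rng=random):
    """Sample one token index from `logits` using temperature and top-k filtering."""
    # Keep only the indices of the top_k highest-scoring tokens.
    k = min(top_k, len(logits))
    kept = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Divide by the temperature: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = [logits[i] / temperature for i in kept]
    # Softmax weights over the kept tokens (subtract the max for stability).
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # Draw one index in proportion to its weight.
    return rng.choices(kept, weights=weights, k=1)[0]


# Toy logits for a five-token vocabulary (made-up numbers).
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
rng = random.Random(0)
samples = [top_k_sample(toy_logits, temperature=0.8, top_k=2, rng=rng) for _ in range(1000)]
# With top_k=2, only tokens 0 and 1 can ever be chosen.
print(sorted(set(samples)))
```

A lower `top_k` restricts generation to the most likely tokens, while a higher `temperature` gives the less likely tokens among them a larger share of the probability.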
## Guidelines for using the MyanmarGPT license
- MyanmarGPT is free for everyone to use.
- **Must Do**
- Any project that is derived or fine-tuned from MyanmarGPT, uses MyanmarGPT internally, modifies MyanmarGPT, or is otherwise related to MyanmarGPT **must mention the citation below** on the corresponding project's page.
- The citation:
```latex
@software{MyanmarGPT,
  author = {{MinSiThu}},
  title = {MyanmarGPT},
  version = {1.1-SweptWood},
  url = {https://huggingface.co/jojo-ai-mst/MyanmarGPT},
  urldate = {2023-12-14},
  date = {2023-12-14},
}
```
To get in touch, reach me via [https://www.linkedin.com/in/min-si-thu/](https://www.linkedin.com/in/min-si-thu/)