---
license: mit
language:
- my
pipeline_tag: text-generation
---
|

Simbolo's Myanmar SAR GPT is pre-trained on a dataset of 1 million Burmese text samples using the GPT-2 architecture. It is intended as a foundational pre-trained model for the Burmese language, facilitating fine-tuning for downstream applications such as creative writing, chatbots, and machine translation.

### How to use

```python
# Install dependencies (in a notebook environment)
!pip install transformers

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("Simbolo-Servicio/myanmar-sar-gpt")
model = AutoModelForCausalLM.from_pretrained("Simbolo-Servicio/myanmar-sar-gpt")

# Encode a Burmese prompt and generate up to 100 tokens of continuation
input_text = ""
input_ids = tokenizer.encode(input_text, return_tensors='pt')
output = model.generate(input_ids, max_length=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

### Limitations and bias