---
license: unknown
language:
- si
widget:
- text: "ඒ දවස්වල හරියට පෑවිල්ල නිතර නිතර වතුර <mask>"
  example_title: "උදාහරණ 1"
---

## Model Description

The Sinhala Story Generation Model is an XLM-RoBERTa base model fine-tuned on a dataset of Sinhala stories. It is designed to generate coherent, contextually relevant Sinhala text from story beginnings.

## Intended Use

The model is intended for generating creative Sinhala stories or text based on initial prompts. It can be used in applications requiring automated generation of Sinhala text, such as chatbots, content generation, or educational tools.

## Example Use Cases

- Creative Writing: Generate new story ideas or expand on existing story prompts.
- Language Learning: Create exercises or content in Sinhala for language learners.
- Content Generation: Automatically generate text for social media posts, blogs, or websites.

## Limitations and Ethical Considerations

- The model's output is based on patterns in the training data and may not always generate accurate or contextually appropriate text.
- Users are advised to review and refine generated text for accuracy and appropriateness before use in sensitive or critical applications.
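
Part of the review step above can be automated by discarding low-confidence candidates before a human looks at them. The following is a minimal sketch, assuming predictions arrive as a list of dicts with `score` and `sequence` keys (the shape returned by the Transformers `fill-mask` pipeline). The helper name `filter_predictions` and the `0.2` threshold are hypothetical and would need tuning for real use.

```python
from typing import Dict, List


def filter_predictions(predictions: List[Dict], min_score: float = 0.2) -> List[str]:
    """Return candidate sentences scoring at least min_score, best first."""
    kept = [p for p in predictions if p["score"] >= min_score]
    kept.sort(key=lambda p: p["score"], reverse=True)
    return [p["sequence"] for p in kept]


# Hand-written sample data for illustration, not real model output.
sample = [
    {"score": 0.05, "sequence": "candidate C"},
    {"score": 0.61, "sequence": "candidate A"},
    {"score": 0.27, "sequence": "candidate B"},
]
print(filter_predictions(sample))
```

A score filter only removes unlikely candidates; it does not check factual accuracy or appropriateness, so manual review is still needed.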

## Model Details

- Model Architecture: XLM-RoBERTa base
- Training Data: Sinhala language stories dataset, compiled from various sources such as social media and web content.
- Tokenization: AutoTokenizer from the Hugging Face Transformers library
- Fine-tuning: Fine-tuned on the Sinhala story dataset for the text generation task
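
Because XLM-RoBERTa is a masked language model, one possible way to produce longer text from a story beginning is to append a `<mask>` token, keep the best prediction, and repeat. The sketch below illustrates that loop under stated assumptions; it is not confirmed to be how this model was trained or is meant to be used. The names `extend_prompt` and `stub_fill_mask` are hypothetical, and the stub stands in for a real model so the example runs without downloading anything.

```python
from typing import Callable, Dict, List

MASK = "<mask>"  # mask token as written in this card's examples


def extend_prompt(
    prompt: str,
    fill_mask: Callable[[str], List[Dict]],
    steps: int = 5,
) -> str:
    """Extend a prompt by repeatedly filling a trailing mask token."""
    text = prompt.replace(MASK, "").strip()
    for _ in range(steps):
        candidates = fill_mask(f"{text} {MASK}")
        if not candidates:
            break  # nothing predicted, stop early
        best = max(candidates, key=lambda c: c["score"])
        text = f"{text} {best['token_str'].strip()}"
    return text


def stub_fill_mask(text: str) -> List[Dict]:
    """Stand-in predictor (assumption): always proposes the same tokens."""
    return [
        {"score": 0.9, "token_str": "word"},
        {"score": 0.1, "token_str": "other"},
    ]


print(extend_prompt("Once upon a time <mask>", stub_fill_mask, steps=2))
```

In practice, a Transformers `fill-mask` pipeline object could be passed as `fill_mask`, since it accepts a string containing one mask token and returns a list of dicts with `score` and `token_str` keys. Greedy one-token-at-a-time filling tends to give repetitive output, so treat this as an illustration rather than a recommended decoding strategy.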

## Example Inference

To run the model locally with the Hugging Face Transformers `pipeline` API, consider the following example Python code:

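```python
from transformers import pipeline

# Placeholder: replace with the actual model repository ID.
model_name = "your-username/model-name"

# XLM-RoBERTa is a masked language model, so the <mask> token in the prompt
# is filled with the "fill-mask" pipeline rather than "text-generation".
fill_mask = pipeline("fill-mask", model=model_name, tokenizer=model_name)

input_text = "අද සුන්දර දවසක්. හෙට ගැන සිතමින් මම පාර <mask>"
predictions = fill_mask(input_text, top_k=1)

# Each prediction holds the completed sentence and its score.
print(predictions[0]["sequence"])
```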