---
library_name: transformers
license: mit
datasets:
- emhaihsan/quran-indonesia-tafseer-translation
language:
- id
base_model:
- Qwen/Qwen2.5-3B-Instruct
---
# Model Card for Fine-Tuned Qwen2.5-3B-Instruct
This is a fine-tuned version of the Qwen2.5-3B-Instruct model. The fine-tuning process utilized the Quran Indonesia Tafseer Translation dataset, which provides translations and tafsir in Bahasa Indonesia for the Quran.
## Model Details

### Model Description
- Base Model: Qwen2.5-3B-Instruct
- Fine-Tuned By: Ellbendl Satria
- Dataset: emhaihsan/quran-indonesia-tafseer-translation
- Language: Bahasa Indonesia
- License: MIT
This model is designed for NLP tasks involving Quranic text in Bahasa Indonesia, including understanding translations and tafsir.
## Uses

### Direct Use
This model can be used for applications requiring the understanding, summarization, or retrieval of Quranic translations and tafsir in Bahasa Indonesia.
### Downstream Use

It is suitable for further fine-tuning on tasks such as the following (see the data-preparation sketch after this list):
- Quranic text summarization
- Question answering systems related to Islamic knowledge
- Educational tools for learning Quranic content in Indonesian
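As a starting point for such downstream work, the sketch below shows one way to load this checkpoint together with the training dataset and format it into chat-style text. The column names (`ayah_text`, `tafsir`) and the prompt wording are illustrative assumptions; check the dataset card for the actual schema before training.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the dataset used for this fine-tune from the Hugging Face Hub
dataset = load_dataset("emhaihsan/quran-indonesia-tafseer-translation", split="train")

# Load this checkpoint as the starting point for further fine-tuning
tokenizer = AutoTokenizer.from_pretrained("Ellbendls/Qwen-2.5-3b-Quran")
model = AutoModelForCausalLM.from_pretrained("Ellbendls/Qwen-2.5-3b-Quran")

# Hypothetical column names -- inspect dataset.column_names for the real schema
def to_chat_text(example):
    messages = [
        {"role": "user", "content": f"Tafsirkan ayat ini: {example['ayah_text']}"},
        {"role": "assistant", "content": example["tafsir"]},
    ]
    example["text"] = tokenizer.apply_chat_template(messages, tokenize=False)
    return example

dataset = dataset.map(to_chat_text)
print(dataset[0]["text"][:200])
```

From here, the formatted `text` column can be fed to any standard causal-language-modeling trainer (for example, the `Trainer` API from `transformers` or `SFTTrainer` from TRL).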
## Biases

- The model inherits any biases present in the dataset, which consists of Islamic translations and tafsir in Bahasa Indonesia.
### Recommendations
- Users should ensure that applications using this model respect cultural and religious sensitivities.
- Results should be verified by domain experts for critical applications.
## How to Get Started with the Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Ellbendls/Qwen-2.5-3b-Quran")
model = AutoModelForCausalLM.from_pretrained("Ellbendls/Qwen-2.5-3b-Quran")

# Move the model to GPU
model.to("cuda")

# Define the input message
messages = [
    {
        "role": "user",
        "content": "Tafsirkan ayat ini اِهْدِنَا الصِّرَاطَ الْمُسْتَقِيْمَۙ"
    }
]

# Build the chat prompt from the message list
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Tokenize the prompt and move the inputs to GPU
inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True).to("cuda")

# Generate a response (max_new_tokens bounds the generated text, not the prompt)
outputs = model.generate(**inputs, max_new_tokens=150, num_return_sequences=1)

# Decode the output and keep only the assistant's reply
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text.split("assistant")[1])
```
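
Alternatively, the same checkpoint can be driven through the higher-level `text-generation` pipeline, which applies the chat template internally. This is a minimal sketch assuming a recent `transformers` release that accepts chat messages directly and, if `device_map="auto"` is used, that `accelerate` is installed:

```python
from transformers import pipeline

# device_map="auto" assumes accelerate is installed; remove it to run on CPU
generator = pipeline(
    "text-generation",
    model="Ellbendls/Qwen-2.5-3b-Quran",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Tafsirkan ayat ini اِهْدِنَا الصِّرَاطَ الْمُسْتَقِيْمَۙ"}
]

result = generator(messages, max_new_tokens=150)
# The pipeline returns the full chat history; the last message is the model's reply
print(result[0]["generated_text"][-1]["content"])
```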