
Armenian_WIKI Model

Model Type: LLaMA-based model trained on the Armenian Wikipedia dataset using Unsloth for efficient training and fine-tuning.

Overview

This repository contains the Armenian_WIKI model, fine-tuned specifically for the Armenian language. It is designed for various natural language processing (NLP) tasks, such as text generation, summarization, and dialogue systems.

The training process was optimized with:

  • QLoRA: For memory-efficient fine-tuning (see the configuration sketch after this list).
  • Unsloth: For enhanced 4-bit quantized inference.
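
As a rough illustration of how a QLoRA setup is typically configured with transformers and peft: the base model identifier below is an assumption taken from the fine-tuning example later in this card, and the rank, alpha, and target modules are illustrative values, not the exact settings used for this model.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization, the core of QLoRA (values are illustrative)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Assumed base checkpoint (see the fine-tuning command below)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach trainable low-rank adapters to the attention projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()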

Features

  • Dataset: The model was trained on the Armenian Wikipedia subset (wikimedia/wikipedia:20231101.hy).
  • Quantization: Supports multiple quantization levels:
    • F16: Unquantized 16-bit (float16) precision for the highest accuracy.
    • Q4_K_M, Q5_K_M, Q8_0: Quantized formats for efficient inference.
  • Tokenizer: Includes a customized tokenizer for Armenian text processing.

Model Details

| File | Description |
|------|-------------|
| unsloth.F16.gguf | Unquantized float16 model. |
| unsloth.Q4_K_M.gguf | 4-bit quantized model for memory-efficient use. |
| unsloth.Q5_K_M.gguf | 5-bit quantized model for balanced performance. |
| unsloth.Q8_0.gguf | 8-bit quantized model for higher accuracy. |
| tokenizer.json | Tokenizer for text preprocessing. |
| special_tokens_map.json | Mapping for special tokens. |
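
Because the weights are distributed as GGUF files, they can also be run outside the transformers stack, for example with llama-cpp-python. A minimal sketch, assuming the Q4_K_M file has been downloaded locally; the context size and token limit are illustrative.

from llama_cpp import Llama

# Load a locally downloaded quantized GGUF file
llm = Llama(model_path="unsloth.Q4_K_M.gguf", n_ctx=2048)

# Complete an Armenian prompt ("The history of Armenia begins with...")
result = llm("Հայաստանի պատմությունը սկսվում է", max_tokens=100)
print(result["choices"][0]["text"])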

Installation

Clone the Repository

git clone https://huggingface.co/YanSysAI/Armenian_WIKI
cd Armenian_WIKI

Install Dependencies

Ensure you have the required libraries installed:

pip install torch transformers peft unsloth accelerate bitsandbytes datasets

Usage

Load the Model

You can load the model with Unsloth's FastLanguageModel, which builds on the transformers library:

import torch
from unsloth import FastLanguageModel

# Load model and tokenizer (Unsloth returns both)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="YanSysAI/Armenian_WIKI",
    load_in_4bit=True,        # memory-efficient 4-bit quantization
    dtype=torch.float16,      # compute dtype for the quantized weights
)

# Switch Unsloth to its optimized inference mode
FastLanguageModel.for_inference(model)

# Generate text
input_text = "Հայաստանի պատմությունը սկսվում է"  # "The history of Armenia begins with..."
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
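
By default, generate performs greedy decoding; for more varied output you can pass sampling arguments such as do_sample=True, temperature, and top_p to model.generate.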

Model Performance

| Metric | Value |
|--------|-------|
| Language | Armenian |
| Dataset | Armenian Wikipedia (20231101.hy) |
| Training framework | Transformers, Unsloth, and PEFT |
| Quantization | Supported: Q4, Q5, Q8, F16 |

Fine-tuning

To fine-tune the model on your own dataset, follow the steps below:

  1. Prepare your dataset in text format.
  2. Tokenize the dataset using the provided tokenizer (a sketch of steps 1 and 2 follows the command below).
  3. Use the provided qlora_finetune.py script with appropriate arguments:
    torchrun --nproc_per_node=N qlora_finetune.py \
        --base_model ./llama-3.2-3b-instruct \
        --output_dir ./my_finetuned_model \
        --epochs 3 \
        --batch_size 2 \
        --accum_steps 4 \
        --learning_rate 2e-4
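
A minimal sketch of steps 1 and 2, assuming the same wikimedia/wikipedia Armenian split used for the original training run and the tokenizer shipped in this repository; the column name and maximum sequence length are illustrative.

from datasets import load_dataset
from transformers import AutoTokenizer

# Step 1: load the Armenian Wikipedia split
dataset = load_dataset("wikimedia/wikipedia", "20231101.hy", split="train")

# Step 2: tokenize with the repository tokenizer
tokenizer = AutoTokenizer.from_pretrained("YanSysAI/Armenian_WIKI")

def tokenize(batch):
    # Truncate long articles; 2048 tokens is an illustrative cap
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)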
    

License

This model is released under the Apache 2.0 License. You are free to use, modify, and distribute this model, provided that you include proper attribution.


Citation

If you use this model in your research or application, please cite:

@misc{armenian_wiki_model,
  author = {YanSysAI},
  title = {Armenian_WIKI Model},
  year = {2025},
  url = {https://huggingface.co/YanSysAI/Armenian_WIKI}
}

Contact

For issues, questions, or contributions, feel free to reach out to the YanSysAI team via the Hugging Face community or GitHub.
