
Armenian_WIKI Model

Model Type: LLaMA-based model trained on the Armenian Wikipedia dataset using Unsloth for efficient training and fine-tuning.

Overview

This repository contains the Armenian_WIKI model, fine-tuned specifically for the Armenian language. It is designed for various natural language processing (NLP) tasks, such as text generation, summarization, and dialogue systems.

The training process was optimized with:

  • QLoRA: For memory-efficient fine-tuning (see the configuration sketch after this list).
  • Unsloth: For enhanced 4-bit quantized inference.
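
As a rough illustration of how a QLoRA setup is typically configured with transformers and peft: the base model identifier below is an assumption taken from the fine-tuning example later in this card, and the rank, alpha, and target modules are illustrative values, not the exact settings used for this model.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization, the core of QLoRA (values are illustrative)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Assumed base checkpoint (see the fine-tuning command below)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach trainable low-rank adapters to the attention projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()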

Features

  • Dataset: The model was trained on the Armenian Wikipedia subset (wikimedia/wikipedia:20231101.hy).
  • Quantization: Supports multiple quantization levels:
    • F16: Unquantized 16-bit (float16) precision for the highest accuracy.
    • Q4_K_M, Q5_K_M, Q8_0: Quantized formats for efficient inference.
  • Tokenizer: Includes a customized tokenizer for Armenian text processing.

Model Details

| File | Description |
|------|-------------|
| unsloth.F16.gguf | Unquantized float16 model. |
| unsloth.Q4_K_M.gguf | 4-bit quantized model for memory-efficient use. |
| unsloth.Q5_K_M.gguf | 5-bit quantized model for balanced performance. |
| unsloth.Q8_0.gguf | 8-bit quantized model for higher accuracy. |
| tokenizer.json | Tokenizer for text preprocessing. |
| special_tokens_map.json | Mapping for special tokens. |
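
Because the weights are distributed as GGUF files, they can also be run outside the transformers stack, for example with llama-cpp-python. A minimal sketch, assuming the Q4_K_M file has been downloaded locally; the context size and token limit are illustrative.

from llama_cpp import Llama

# Load a locally downloaded quantized GGUF file
llm = Llama(model_path="unsloth.Q4_K_M.gguf", n_ctx=2048)

# Complete an Armenian prompt ("The history of Armenia begins with...")
result = llm("Հայաստանի պատմությունը սկսվում է", max_tokens=100)
print(result["choices"][0]["text"])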

Installation

Clone the Repository

git clone https://huggingface.co/YanSysAI/Armenian_WIKI
cd Armenian_WIKI

Install Dependencies

Ensure you have the required libraries installed:

pip install torch transformers peft unsloth accelerate bitsandbytes datasets

Usage

Load the Model

You can load the model with Unsloth's FastLanguageModel, which builds on the transformers library:

import torch
from unsloth import FastLanguageModel

# Load model and tokenizer (Unsloth returns both)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="YanSysAI/Armenian_WIKI",
    load_in_4bit=True,        # memory-efficient 4-bit quantization
    dtype=torch.float16,      # compute dtype for the quantized weights
)

# Switch Unsloth to its optimized inference mode
FastLanguageModel.for_inference(model)

# Generate text
input_text = "Հայաստանի պատմությունը սկսվում է"  # "The history of Armenia begins with..."
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
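
By default, generate performs greedy decoding; for more varied output you can pass sampling arguments such as do_sample=True, temperature, and top_p to model.generate.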

Model Performance

| Metric | Value |
|--------|-------|
| Language | Armenian |
| Dataset | Armenian Wikipedia (20231101.hy) |
| Training framework | Transformers, Unsloth, and PEFT |
| Quantization | Supported: Q4, Q5, Q8, F16 |

Fine-tuning

To fine-tune the model on your own dataset, follow the steps below:

  1. Prepare your dataset in text format.
  2. Tokenize the dataset using the provided tokenizer (a sketch of steps 1 and 2 follows the command below).
  3. Use the provided qlora_finetune.py script with appropriate arguments:
    torchrun --nproc_per_node=N qlora_finetune.py \
        --base_model ./llama-3.2-3b-instruct \
        --output_dir ./my_finetuned_model \
        --epochs 3 \
        --batch_size 2 \
        --accum_steps 4 \
        --learning_rate 2e-4
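
A minimal sketch of steps 1 and 2, assuming the same wikimedia/wikipedia Armenian split used for the original training run and the tokenizer shipped in this repository; the column name and maximum sequence length are illustrative.

from datasets import load_dataset
from transformers import AutoTokenizer

# Step 1: load the Armenian Wikipedia split
dataset = load_dataset("wikimedia/wikipedia", "20231101.hy", split="train")

# Step 2: tokenize with the repository tokenizer
tokenizer = AutoTokenizer.from_pretrained("YanSysAI/Armenian_WIKI")

def tokenize(batch):
    # Truncate long articles; 2048 tokens is an illustrative cap
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)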
    

License

This model is released under the Apache 2.0 License. You are free to use, modify, and distribute this model, provided that you include proper attribution.


Citation

If you use this model in your research or application, please cite:

@misc{armenian_wiki_model,
  author = {YanSysAI},
  title = {Armenian_WIKI Model},
  year = {2025},
  url = {https://huggingface.co/YanSysAI/Armenian_WIKI}
}

Contact

For issues, questions, or contributions, feel free to reach out to the YanSysAI team via the Hugging Face community or GitHub.
