ara-extract-v1 — LoRA adapter for Qwen2.5-0.5B-Instruct
LoRA adapter trained on trilingual (Uzbek · Russian · English) government-document extraction and classification tasks. Designed to teach a small base model to emit structured JSON for contract fields, document type, language ID, named entities, and one-sentence summaries.
This adapter is the small-model half of the ARA fine-tuning pair. The QLoRA-on-7B counterpart lives at bilalsaidumarov/ara-extract-7b-qlora.
TL;DR
| Metric (vs. base Qwen2.5-0.5B-Instruct, no adapter) | Base | + LoRA | Δ |
|---|---|---|---|
| JSON validity (output parses as JSON when JSON was the target) | 0.0% | 83.3% | +83.3 pp |
| Char similarity (difflib.SequenceMatcher mean) | 0.062 | 0.374 | ×6.0 |
| Exact match | 0.0% | 10.5% | +10.5 pp |
The headline number is JSON validity, 0 → 83% (5 of the 6 JSON-target rows in the held-out set): the adapter taught the base model the structured output format, which is the practical reason to fine-tune a 0.5B model for this workload at all. Exact match stays low because the held-out tail is heavy on language-ID examples. See Limitations below.
Intended use
Direct use — inside the ARA document-intelligence platform as the LLM backend for structured extraction over Uzbek / Russian / English government documents (contracts, invoices, memos, reports, letters). The 0.5B variant is the fast / cheap path for low-VRAM environments and CPU-only demos.
Prompt format (matches training):
### Instruction:
{instruction}
### Input:
{input}
### Response:
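A minimal sketch of a helper that renders this template. It assumes the blank-line layout used in the usage example (`\n\n` between sections and a single trailing newline after `### Response:`). The function name `build_prompt` and the sample instruction are illustrative, not part of the ARA codebase.

```python
def build_prompt(instruction: str, input_text: str) -> str:
    """Render one example in the Instruction / Input / Response template."""
    return (
        f"### Instruction:\n{instruction.strip()}\n\n"
        f"### Input:\n{input_text.strip()}\n\n"
        "### Response:\n"
    )


# Hypothetical language-ID request, for illustration only.
prompt = build_prompt(
    "Identify the language of the text. Answer with uz, ru, or en.",
    "Shartnoma 2026-yil 14-martda imzolandi.",
)
print(prompt)
```

Keeping the template in one function makes it harder for the inference-time prompt to drift from the format the adapter saw during training.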
How to use
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
BASE = "Qwen/Qwen2.5-0.5B-Instruct"
ADAPTER = "bilalsaidumarov/ara-extract-v1"
tokenizer = AutoTokenizer.from_pretrained(BASE)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
base = AutoModelForCausalLM.from_pretrained(BASE)
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()
prompt = (
    "### Instruction:\nExtract amount, signing date, and counterparty from the "
    "contract excerpt. Return a JSON object with keys amount, date, counterparty.\n\n"
    "### Input:\nAGREEMENT №ARA-2026-014 dated 14.03.2026 between Ministry of "
    "Economy and Finance and Acme Logistics LLC. Total contract value: 1,200,000 UZS.\n\n"
    "### Response:\n"
)
enc = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**enc, max_new_tokens=128, do_sample=False,
                         pad_token_id=tokenizer.pad_token_id)
print(tokenizer.decode(out[0, enc["input_ids"].shape[1]:], skip_special_tokens=True))
Serving with vLLM
vllm serve Qwen/Qwen2.5-0.5B-Instruct \
--enable-lora \
--lora-modules ara-extract-v1=bilalsaidumarov/ara-extract-v1 \
--max-loras 4
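Once the server is up, the adapter can be queried through vLLM's OpenAI-compatible completions endpoint by passing the adapter name as the model. The sketch below assumes the server listens on vLLM's default `localhost:8000`; the helper names `build_request` and `complete` are illustrative.

```python
import json
import urllib.request

# Assumption: default vLLM address. Adjust if you passed --host or --port.
VLLM_URL = "http://localhost:8000/v1/completions"


def build_request(prompt: str, adapter_name: str = "ara-extract-v1") -> dict:
    """Build a completions payload that routes to the LoRA adapter by name."""
    return {
        "model": adapter_name,  # the name on the left side of --lora-modules
        "prompt": prompt,
        "max_tokens": 128,
        "temperature": 0.0,  # greedy decoding, as in the evaluation
    }


def complete(prompt: str, url: str = VLLM_URL) -> str:
    """POST the payload and return the generated text (needs a running server)."""
    body = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["choices"][0]["text"]
```

Calling `complete(prompt)` with a prompt in the training format returns only the generated continuation. Requests that name `Qwen/Qwen2.5-0.5B-Instruct` as the model go to the base model without the adapter.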
Training
| Setting | Value |
|---|---|
| Base model | Qwen/Qwen2.5-0.5B-Instruct (Apache-2.0) |
| Adapter type | LoRA (PEFT) |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| Rank r / alpha / dropout | 16 / 32 / 0.05 |
| Optimizer | AdamW, lr 2e-4 |
| Precision | fp16 |
| Epochs | 3 |
| Train / eval split | 76 / 19 (deterministic, holdout-frac 0.2) |
| Hardware | NVIDIA RTX 4060 8 GB |
| Wall time | ~20 s |
| Loss curve | 2.30 → 1.91 → 1.45 |
| Adapter size on disk | ~10 MB |
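As a rough sanity check on the adapter size, the number of LoRA parameters can be estimated from the rank and the shapes of the four targeted projections. The model dimensions below are assumptions taken from the published Qwen2.5-0.5B configuration (hidden size 896, 24 layers, 2 key/value heads of dimension 64) and should be verified against `config.json`.

```python
# Assumed Qwen2.5-0.5B-Instruct dimensions; verify against config.json.
HIDDEN = 896   # hidden_size
KV_DIM = 128   # num_key_value_heads (2) * head_dim (64)
LAYERS = 24    # num_hidden_layers
RANK = 16      # LoRA rank r from the table above

# (in_features, out_features) for each targeted projection.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(rank: int = RANK) -> int:
    """Each adapted projection adds A (rank x in) and B (out x rank)."""
    per_layer = sum(
        rank * (fan_in + fan_out) for fan_in, fan_out in TARGETS.values()
    )
    return per_layer * LAYERS


n = lora_params()
print(f"{n:,} LoRA parameters")
print(f"{n * 4 / 1e6:.2f} MB at fp32, {n * 2 / 1e6:.2f} MB at fp16")
```

Under these assumptions the estimate is about 2.16M parameters, or roughly 8.7 MB if the weights are stored in fp32, which is in the same range as the ~10 MB reported above.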
Dataset
95 supervised examples across 15 task families, trilingual (uz · ru · en):
| Family | Count |
|---|---|
| Contract field extraction (amount, date, counterparty → JSON) | 25 |
| Document classification (contract / invoice / memo / report / letter) | 13 |
| Deadline date extraction | 12 |
| Summarization (one-sentence) | 12 |
| Language identification | 10 |
| Named-entity extraction | 10 |
| Contract clause translation | 7 |
| Monetary amount listing | 6 |
Format — one JSON object per line: {"instruction": ..., "input": ..., "output": ...}.
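A minimal sketch of reading this format and producing a deterministic 76 / 19 split. The card does not say how the split is made. Holding out the last 20% of rows is an assumption, suggested by the references to a held-out "tail". The function names are illustrative.

```python
import json


def load_jsonl(text: str) -> list[dict]:
    """Parse one JSON object per non-empty line and check the three fields."""
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        row = json.loads(line)
        missing = {"instruction", "input", "output"} - row.keys()
        if missing:
            raise ValueError(f"line {line_no}: missing {sorted(missing)}")
        rows.append(row)
    return rows


def split_tail(rows: list[dict], holdout_frac: float = 0.2):
    """Hold out the last `holdout_frac` of rows (assumed scheme, no shuffling)."""
    n_eval = round(len(rows) * holdout_frac)
    cut = len(rows) - n_eval
    return rows[:cut], rows[cut:]


# Made-up row, for illustration only.
sample = '{"instruction": "Identify the language.", "input": "Hello", "output": "en"}\n'
train, held_out = split_tail(load_jsonl(sample * 95))
print(len(train), len(held_out))
```

With 95 rows and a holdout fraction of 0.2 this yields 76 training and 19 held-out examples, matching the split reported in the Training table.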
Evaluation
Held-out 20% (19 examples), greedy decoding, max_new_tokens=128.
| Metric | Base 0.5B | + LoRA |
|---|---|---|
| exact_match | 0.0% (0/19) | 10.5% (2/19) |
| char_similarity | 0.062 | 0.374 |
| json_valid (on 6 JSON-target rows) | 0.0% (0/6) | 83.3% (5/6) |
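The three metrics can be sketched as below. Character similarity uses `difflib.SequenceMatcher`, as stated in the TL;DR. The whitespace stripping and the rule for deciding which rows count as JSON targets (the reference output parses to an object or array) are assumptions, since the evaluation harness itself is not shown here.

```python
import json
from difflib import SequenceMatcher


def exact_match(pred: str, target: str) -> bool:
    return pred.strip() == target.strip()


def char_similarity(pred: str, target: str) -> float:
    return SequenceMatcher(None, pred.strip(), target.strip()).ratio()


def is_json(text: str) -> bool:
    """True if `text` parses as a JSON object or array."""
    try:
        return isinstance(json.loads(text), (dict, list))
    except json.JSONDecodeError:
        return False


def score(pairs: list[tuple[str, str]]) -> dict:
    """pairs: (prediction, target) strings for the held-out rows."""
    json_preds = [p for p, t in pairs if is_json(t)]  # JSON-target rows only
    return {
        "exact_match": sum(exact_match(p, t) for p, t in pairs) / len(pairs),
        "char_similarity": sum(char_similarity(p, t) for p, t in pairs) / len(pairs),
        "json_valid": (
            sum(is_json(p) for p in json_preds) / len(json_preds)
            if json_preds else None
        ),
    }
```

`json_valid` is computed only over rows whose reference output is JSON, which is why its denominator in the table is 6 rather than 19.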
Same eval harness on the 7B QLoRA sibling:
| Metric | LoRA on 0.5B | QLoRA on 7B |
|---|---|---|
| exact_match | 10.5% | 26.3% |
| char_similarity | 0.374 | 0.610 |
| json_valid | 83.3% | 100.0% |
| Adapter size | ~10 MB | ~40 MB |
| Train wall time (RTX 4060) | 20 s | 5 min |
Limitations
- Small dataset (95 examples). Enough to demonstrate the technique and shift the JSON-validity rate decisively, not enough for production quality across all 15 task families. Language ID and translation in particular need more examples.
- Exact-match is the weakest metric because the held-out tail is heavy on language-ID rows, which the 0.5B base struggles with after only 3 epochs.
- Multilingual coverage is uneven. Uzbek has the least training data of the three; outputs in Uzbek are noisier than in Russian or English.
- No hyperparameter sweep. r=16, alpha=32, lr=2e-4, and 3 epochs are commonly used LoRA defaults, not tuned values.
- Inherits base-model risks (bias, hallucination). Use temperature ≤ 0.2 and validate JSON before downstream use.
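One way to validate output before downstream use is sketched below for the contract-extraction task. The required keys come from the usage example's instruction. The function name and the choice to accept the first JSON object found in the output are illustrative assumptions.

```python
import json

REQUIRED_KEYS = {"amount", "date", "counterparty"}  # keys from the usage prompt


def parse_extraction(raw: str) -> dict:
    """Return the first JSON object in `raw`, or raise ValueError."""
    start = raw.find("{")
    if start == -1:
        raise ValueError("no JSON object in model output")
    try:
        # raw_decode parses one value and ignores any trailing text.
        obj, _ = json.JSONDecoder().raw_decode(raw[start:])
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON: {exc}") from exc
    missing = REQUIRED_KEYS - obj.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return obj


fields = parse_extraction(
    'Answer: {"amount": "1,200,000 UZS", "date": "14.03.2026", '
    '"counterparty": "Acme Logistics LLC"}'
)
print(fields["counterparty"])
```

Raising on malformed or incomplete output lets the caller retry, fall back to the 7B sibling, or route the document for manual review instead of passing partial fields downstream.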
License
Apache-2.0 (matches base model Qwen/Qwen2.5-0.5B-Instruct).
Citation
If this adapter is useful in your work, cite the ARA project:
@misc{ara2026,
title = {ARA — AI Resource Assistant (document-intelligence platform)},
author = {Bilol Saidumarov},
year = {2026},
url = {https://github.com/sb-bilal-dev-2/ara}
}
Framework versions
- PEFT 0.19.1
- Transformers ≥ 4.46
- PyTorch ≥ 2.5