ara-extract-v1 — LoRA adapter for Qwen2.5-0.5B-Instruct

LoRA adapter trained on trilingual (Uzbek · Russian · English) government-document extraction and classification tasks. Designed to teach a small base model to emit structured JSON for contract fields, document type, language ID, named entities, and one-sentence summaries.

This adapter is the small-model half of the ARA fine-tuning pair. The QLoRA-on-7B counterpart lives at bilalsaidumarov/ara-extract-7b-qlora.

TL;DR

Metric (vs. base Qwen2.5-0.5B-Instruct, no adapter)	Base	+ LoRA	Δ
JSON validity (output parses as JSON when JSON was the target)	0.0%	83.3%	+83.3 pp
Char similarity (`difflib.SequenceMatcher` mean)	0.062	0.374	×6.0
Exact match	0.0%	10.5%	+10.5 pp

The headline number is JSON validity, 0 → 83.3% (5 of the 6 JSON-target rows in the held-out set): the adapter taught the base model the structured output format, which is the practical reason to fine-tune a 0.5B model for this workload at all. Exact match is held back because the held-out tail is heavy on language-ID examples; see Limitations below.

Intended use

Direct use — inside the ARA document-intelligence platform as the LLM backend for structured extraction over Uzbek / Russian / English government documents (contracts, invoices, memos, reports, letters). The 0.5B variant is the fast / cheap path for low-VRAM environments and CPU-only demos.

Prompt format (matches training):

### Instruction:
{instruction}

### Input:
{input}

### Response:
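
A minimal sketch of a helper that fills this template. The name `build_prompt` is hypothetical and not part of the adapter repo; the single newline after `### Response:` follows the usage example in this card.

```python
# Template copied from the prompt format above.
PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str, input_text: str) -> str:
    """Fill the training-time prompt template (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(
        instruction=instruction.strip(),
        input=input_text.strip(),
    )


prompt = build_prompt(
    "Extract amount, signing date, and counterparty. Return a JSON object.",
    "AGREEMENT №ARA-2026-014 dated 14.03.2026 between Ministry of Economy "
    "and Finance and Acme Logistics LLC.",
)
print(prompt)
```

Keeping the template in one place reduces the chance of a stray space or missing blank line, which can matter because the adapter only saw this exact layout during training.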

How to use

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

BASE = "Qwen/Qwen2.5-0.5B-Instruct"
ADAPTER = "bilalsaidumarov/ara-extract-v1"

tokenizer = AutoTokenizer.from_pretrained(BASE)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
base = AutoModelForCausalLM.from_pretrained(BASE)
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()

prompt = (
    "### Instruction:\nExtract amount, signing date, and counterparty from the "
    "contract excerpt. Return a JSON object with keys amount, date, counterparty.\n\n"
    "### Input:\nAGREEMENT №ARA-2026-014 dated 14.03.2026 between Ministry of "
    "Economy and Finance and Acme Logistics LLC. Total contract value: 1,200,000 UZS.\n\n"
    "### Response:\n"
)
enc = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**enc, max_new_tokens=128, do_sample=False,
                         pad_token_id=tokenizer.pad_token_id)
print(tokenizer.decode(out[0, enc["input_ids"].shape[1]:], skip_special_tokens=True))

Serving with vLLM

vllm serve Qwen/Qwen2.5-0.5B-Instruct \
    --enable-lora \
    --lora-modules ara-extract-v1=bilalsaidumarov/ara-extract-v1 \
    --max-loras 4
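
Once the server is running, the adapter can be queried through vLLM's OpenAI-compatible completions route by passing the LoRA name as the model. The sketch below assumes the default port (8000) on localhost; `build_payload` and `complete` are hypothetical helper names, and `complete` only works against a live server.

```python
import json
import urllib.request

# Assumption: the server started above listens on vLLM's default host and port.
VLLM_URL = "http://localhost:8000/v1/completions"


def build_payload(prompt: str, max_tokens: int = 128) -> dict:
    """Request body for the OpenAI-compatible completions route."""
    return {
        "model": "ara-extract-v1",  # LoRA name registered via --lora-modules
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # greedy decoding, as in the evaluation
    }


def complete(prompt: str, url: str = VLLM_URL) -> str:
    """POST the prompt and return the generated text (needs a running server)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```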

Training


Setting	Value
Base model	`Qwen/Qwen2.5-0.5B-Instruct` (Apache-2.0)
Adapter type	LoRA (PEFT)
Target modules	`q_proj`, `k_proj`, `v_proj`, `o_proj`
Rank `r` / `alpha` / dropout	16 / 32 / 0.05
Optimizer	AdamW, lr 2e-4
Precision	fp16
Epochs	3
Train / eval split	76 / 19 (deterministic, holdout-frac 0.2)
Hardware	NVIDIA RTX 4060 8 GB
Wall time	~20 s
Loss curve	2.30 → 1.91 → 1.45
Adapter size on disk	~10 MB
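
As a rough sanity check on the adapter size, the trainable parameter count can be estimated from the rank and the shapes of the four targeted projections. The model dimensions below are assumptions taken from the base model's published configuration, not from this card, and the storage precision of the saved adapter is also assumed.

```python
# Assumed Qwen2.5-0.5B dimensions (from the base model's config, not this card).
HIDDEN_SIZE = 896
NUM_LAYERS = 24
NUM_ATTENTION_HEADS = 14
NUM_KV_HEADS = 2
HEAD_DIM = HIDDEN_SIZE // NUM_ATTENTION_HEADS  # 64
KV_DIM = NUM_KV_HEADS * HEAD_DIM               # 128 (grouped-query attention)

RANK = 16  # from the training table

# (in_features, out_features) of each targeted projection.
TARGET_SHAPES = {
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "k_proj": (HIDDEN_SIZE, KV_DIM),
    "v_proj": (HIDDEN_SIZE, KV_DIM),
    "o_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
}


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_param_count(i, o, RANK) for i, o in TARGET_SHAPES.values())
total_params = per_layer * NUM_LAYERS
size_mb_fp32 = total_params * 4 / 1e6  # assumption: weights saved as 32-bit floats

print(f"{total_params:,} trainable parameters, about {size_mb_fp32:.1f} MB in fp32")
```

Under these assumptions the estimate is about 2.2 million parameters, or roughly 8.7 MB in 32-bit floats, which is in the same range as the reported ~10 MB on disk.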

Dataset

95 supervised examples across the 8 task families listed below, trilingual (uz · ru · en):

Family	Count
Contract field extraction (amount, date, counterparty → JSON)	25
Document classification (contract / invoice / memo / report / letter)	13
Deadline date extraction	12
Summarization (one-sentence)	12
Language identification	10
Named-entity extraction	10
Contract clause translation	7
Monetary amount listing	6

Format — one JSON object per line: {"instruction": ..., "input": ..., "output": ...}.
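
The split script itself is not shown in this card. The sketch below is one plausible reading of "deterministic, holdout-frac 0.2" and "held-out tail": load the JSONL file and hold out the final 20% of rows in file order. The function names and the tail-based rule are assumptions.

```python
import json


def load_jsonl(path: str) -> list:
    """Read one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def split_tail(rows: list, holdout_frac: float = 0.2) -> tuple:
    """Hold out the last `holdout_frac` of rows; no randomness involved."""
    n_eval = round(len(rows) * holdout_frac)
    cut = len(rows) - n_eval
    return rows[:cut], rows[cut:]


# Placeholder rows standing in for the real dataset.
rows = [{"instruction": "...", "input": str(i), "output": "..."} for i in range(95)]
train_rows, eval_rows = split_tail(rows)
print(len(train_rows), len(eval_rows))  # 76 19
```

A tail split is reproducible without a seed, but it makes the held-out set sensitive to how the file is ordered, which fits the note that the held-out rows over-represent one task family.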

Evaluation

Held-out 20% (19 examples), greedy decoding, max_new_tokens=128.

Metric	Base 0.5B	+ LoRA
exact_match	0.0% (0/19)	10.5% (2/19)
char_similarity	0.062	0.374
json_valid (on 6 JSON-target rows)	0.0% (0/6)	83.3% (5/6)
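
The three metrics can be reproduced along the following lines. This is a sketch rather than the project's actual harness: whitespace stripping before comparison and the rule that a row counts as a JSON target when its reference parses to an object or array are both assumptions.

```python
import json
from difflib import SequenceMatcher


def parses_as_json(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


def is_json_target(reference: str) -> bool:
    # Assumption: a JSON-target row has a reference that is a JSON object or array.
    try:
        return isinstance(json.loads(reference), (dict, list))
    except ValueError:
        return False


def score(predictions: list, references: list) -> dict:
    """Mean exact match, mean SequenceMatcher ratio, and JSON validity rate."""
    pairs = [(p.strip(), r.strip()) for p, r in zip(predictions, references)]
    if not pairs:
        raise ValueError("need at least one prediction/reference pair")
    json_preds = [p for p, r in pairs if is_json_target(r)]
    return {
        "exact_match": sum(p == r for p, r in pairs) / len(pairs),
        "char_similarity": sum(
            SequenceMatcher(None, p, r).ratio() for p, r in pairs
        ) / len(pairs),
        # Only rows whose target was JSON count toward validity.
        "json_valid": (
            sum(parses_as_json(p) for p in json_preds) / len(json_preds)
            if json_preds else None
        ),
    }


# Toy rows for illustration only, not taken from the evaluation set.
result = score(
    ['{"amount": 1}', "ru", "not json"],
    ['{"amount": 1}', "en", '{"amount": 2}'],
)
print(result)
```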

Same eval harness on the 7B QLoRA sibling:

Metric	LoRA on 0.5B	QLoRA on 7B
exact_match	10.5%	26.3%
char_similarity	0.374	0.610
json_valid	83.3%	100.0%
Adapter size	~10 MB	~40 MB
Train wall time (RTX 4060)	20 s	5 min

Limitations

Small dataset (95 examples). Enough to demonstrate the technique and shift the JSON-validity rate decisively, not enough for production quality across all task families. Language ID and translation in particular need more examples.
Exact-match is the weakest metric because the held-out tail is heavy on language-ID rows, which the 0.5B base struggles with after only 3 epochs.
Multilingual coverage is uneven. Uzbek has the least training data of the three; outputs in Uzbek are noisier than in Russian or English.
No hyperparameter sweep. r=16, alpha=32, lr=2e-4, 3 epochs are common starting values for LoRA fine-tuning and were not tuned.
Inherits base-model risks (bias, hallucination) — use temperature ≤ 0.2 and validate JSON before downstream use.
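
Since roughly one in six JSON-target outputs failed to parse in evaluation, a guard before downstream use is worth having. A minimal sketch, assuming the three keys from the usage example; `parse_extraction` is a hypothetical helper name.

```python
import json

# Keys requested in the usage example's instruction.
REQUIRED_KEYS = ("amount", "date", "counterparty")


def parse_extraction(text: str, required_keys=REQUIRED_KEYS):
    """Return the parsed object, or None if the output is not usable."""
    try:
        parsed = json.loads(text.strip())
    except ValueError:
        return None  # not JSON at all
    if not isinstance(parsed, dict):
        return None  # valid JSON, but not an object
    if any(key not in parsed for key in required_keys):
        return None  # object is missing an expected field
    return parsed


good = parse_extraction(
    '{"amount": "1,200,000 UZS", "date": "14.03.2026", '
    '"counterparty": "Acme Logistics LLC"}'
)
bad = parse_extraction("Sure! Here is the JSON: {")
print(good is not None, bad is None)
```

Returning `None` instead of raising lets the caller decide whether to retry, fall back to the 7B sibling, or flag the document for manual review.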

License

Apache-2.0 (matches base model Qwen/Qwen2.5-0.5B-Instruct).

Citation

If this adapter is useful in your work, cite the ARA project:

@misc{ara2026,
  title  = {ARA — AI Resource Assistant (document-intelligence platform)},
  author = {Bilol Saidumarov},
  year   = {2026},
  url    = {https://github.com/sb-bilal-dev-2/ara}
}

Framework versions

PEFT 0.19.1
Transformers ≥ 4.46
PyTorch ≥ 2.5
