Qwen3.5-2B-Libra-YTD
Qwen3.5-2B fine-tuned to read Romanian fișe analitice (analytical accounting ledgers) and emit Year-to-Date extraction recipes as structured JSON. LoRA SFT, then merged back into a single 2B checkpoint for drop-in inference.
Trained on surogate/ytd-dataset.
Business use case
Romanian SMBs report VAT and trial balances monthly. Accounting software (Saga, Mentor, Smartbill, ContaPlus, custom Excel exports) emits the same trial-balance data in a dozen incompatible layouts. A bookkeeper opening any of them needs the YTD figure per partner: how much was invoiced or paid cumulatively since January.
That figure is rarely a single column. Depending on the report:
- It is a direct read from `Rulaje perioada`, `Rulaj curent`, or `Total rulaje`, but only when the period covers the full year.
- It is a subtraction `Sume totale - Sume precedente` when the only column shown bundles in the opening balance.
- It is a sum `Perioada precedentă + Perioada` when the report breaks the year into chunks.
- The side depends on the account root: 411x goes on Debit, 401x goes on Credit, anything else returns `not_applicable`.
- A small list of trap columns (`Sume totale`, `Total sume`, `Rulaj+SI`, `Sume precedente`) looks correct but includes the opening balance and silently produces wrong numbers.
This model reads the raw extracted text (PDF copy, Excel paste, OCR, CSV, headerless, fixed-width, markdown table, multi-line header, mixed-language) and emits a JSON recipe naming exactly which column to read and which formula to apply. A deterministic post-processor consumes the recipe and produces the actual YTD figure. The model handles layout variance, the post-processor handles arithmetic.
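The post-processor itself is not published with this card, so the following is only a minimal sketch of what "consuming the recipe" could look like. The function names `cell`, `apply_side`, and `apply_recipe`, the list-of-strings row representation, and the amounts in `row` are all assumptions made for illustration; only the three formula strings and the `{header, index}` shape come from the card.

```python
from decimal import Decimal

def cell(row, ref):
    """Read one cell by its column index; `header` is informational in this sketch."""
    return Decimal(str(row[ref["index"]]))

def apply_side(row, side_recipe):
    """Evaluate one side (debit or credit) of a recipe against a single data row."""
    formula = side_recipe["formula"]
    if formula == "ytd_source":
        return cell(row, side_recipe["ytd_source"])
    if formula == "ytd_source - sold_initial":
        return cell(row, side_recipe["ytd_source"]) - cell(row, side_recipe["sold_initial"])
    if formula == "addend1 + addend2":
        return cell(row, side_recipe["addend1"]) + cell(row, side_recipe["addend2"])
    raise ValueError(f"unknown formula: {formula!r}")

def apply_recipe(row, recipe):
    """Return both YTD figures, or None when the model answered not_applicable."""
    if recipe.get("not_applicable"):
        return None
    return {side: apply_side(row, recipe[side]) for side in ("ytd_debit", "ytd_credit")}

# Hypothetical row, already split into cells with normalized amounts.
row = ["4111", "CLIENT X", "0.00", "0.00", "1200.50", "800.00", "1500.50", "800.00"]
recipe = {
    "ytd_debit": {"formula": "ytd_source", "ytd_source": {"header": "Rulaje perioada Dt", "index": 4}},
    "ytd_credit": {"formula": "ytd_source", "ytd_source": {"header": "Rulaje perioada Cr", "index": 5}},
}
result = apply_recipe(row, recipe)
```

Using `Decimal` rather than `float` is a design choice of the sketch: accounting figures are compared to the cent, and binary floats can introduce rounding noise in sums and differences.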
Why we built a dedicated SLM for this
Production users running GPT-4-class general models on Romanian fișe were hitting the same failure modes over and over:
| where big models fail | what they output | why it matters |
|---|---|---|
| Picks `Sume totale` as YTD source | wrong number including opening balance | bookkeeper reconciles to a value the partner does not recognize |
| Returns named column headers when the source has empty cells | `"header": "Rulaj curent"` for a blank header | downstream column matching breaks |
| Misreads Caz 1 (single month) as Caz 2 (full year) | direct read instead of subtraction | YTD off by the opening balance amount |
| Fabricates header names on headerless layouts | invented column names | indexed column lookup returns nothing |
| Switches Debit and Credit on 401x accounts | side flipped | sign of every value flips downstream |
| Treats `Sume precedente` as a YTD column | uses prior-period total as current YTD | values out of step with the actual report period |
Bigger general-purpose models did not solve this because the problem is not a knowledge problem. The information to disambiguate is sitting in the input every time. The problem is that those models were never trained to attend to it. A 2B model that has seen 3620 rule-teaching examples in 12 surface formats outperforms general 70B+ models on this task at a fraction of the inference cost.
How the training data makes it generalize
The same scenario is rendered through 12 surface formats per sample. 50% of training inputs have helper section markers (PERIOADĂ DETECTATĂ:, CONT RĂDĂCINĂ DETECTAT:, COLOANE) stripped so the model has to infer period and account root from raw text. Pattern F samples render the same underlying YTD value through both the F4 (subtractive) and F5 (composed) recipes, teaching the model that the two are interchangeable when both column families exist. Trap-column samples explicitly teach avoidance: even when Sume totale is the most visible column, the model picks the right one. 14 real production fișe are replicated ×30 as anchors against tokenization drift.
Eval results
| eval set | accuracy |
|---|---|
| 14 real production fișe | 100 % (14 / 14) |
| 320 validation samples | 99 % |
| 360 held-out samples | 97 % |
Per-pattern accuracy on the 320-sample validation set:
| pattern | accuracy |
|---|---|
| A: full year, paired suffix headers | 97 % |
| B: full year, named + empty headers | 100 % |
| C: headerless | 100 % |
| D: `Total rulaje` column present | 100 % |
| E: with `Tip` partner-type column | 100 % |
| F: single month (Caz 1) | 100 % |
Worked examples (real production fișe)
Pattern A: Uzina Brașov, full year, paired suffix headers
Input (extracted from PDF):
PERIOADĂ DETECTATĂ: 01.01.2025 -- 31.10.2025
CONT RĂDĂCINĂ DETECTAT: 4111
COLOANE (10 total, 4 numerice):
[4] Rulaje perioada Dt
[5] Rulaje perioada Cr
[6] Sume totale Dt <- trap
[7] Sume totale Cr <- trap
{
"ytd_debit": { "formula": "ytd_source", "ytd_source": { "header": "Rulaje perioada Dt", "index": 4 } },
"ytd_credit": { "formula": "ytd_source", "ytd_source": { "header": "Rulaje perioada Cr", "index": 5 } }
}
Period starts 01.01, so this is Caz 2: direct read. The model picks Rulaje perioada (4/5) and avoids the visually similar Sume totale trap (6/7).
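The "period starts 01.01" check can be illustrated with a small deterministic helper. This is a sketch for intuition, not how the model decides internally: `classify_period` is a hypothetical name, and the two regular expressions only cover period formats that appear in this card (a `dd.mm.yyyy` range, a month-name range opening with "Ianuarie", and the single-month "Perioada N" form, which falls through to Caz 1).

```python
import re

# A range that opens on 1 January, e.g. "01.01.2025 -- 31.10.2025".
_STARTS_JAN_1 = re.compile(r"^\s*01\.01\.\d{4}\b")
# A month-name range that opens in January, e.g. "Ianuarie 2025 - August 2025".
_STARTS_IANUARIE = re.compile(r"^\s*ianuarie\s+\d{4}\s*-", re.IGNORECASE)

def classify_period(period: str) -> str:
    """Return "caz2" when the period starts on 1 January, otherwise "caz1"."""
    if _STARTS_JAN_1.match(period) or _STARTS_IANUARIE.match(period):
        return "caz2"
    return "caz1"

labels = [classify_period(p) for p in (
    "01.01.2025 -- 31.10.2025",
    "Ianuarie 2025 - August 2025",
    "Perioada 9",
)]
```

Anything the patterns do not recognize is treated as Caz 1, which is the more conservative default of the sketch: it avoids a direct read when the period is unclear.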
Pattern B: Andisol, full year, named-and-empty header pair
Input:
PERIOADĂ DETECTATĂ: 01.01.2025 -- 31.05.2025
CONT RĂDĂCINĂ DETECTAT: 4111
COLOANE:
[2] Sume precedente <- trap (prior period total)
[4] Rulaje perioada
[5] (empty)
[6] Sume totale <- trap
{
"ytd_debit": { "formula": "ytd_source", "ytd_source": { "header": "", "index": 4 } },
"ytd_credit": { "formula": "ytd_source", "ytd_source": { "header": "", "index": 5 } }
}
The Debit column has a header Rulaje perioada but the Credit column it spans has an empty header (the software collapses paired columns). The model correctly emits "header": "" for the empty cell rather than fabricating a name. It also skips Sume precedente (the prior-period trap) and Sume totale.
Pattern C: Moise, headerless Caz 2
Input:
PERIOADĂ DETECTATĂ: 01.01.2024 -- 31.12.2024
CONT RĂDĂCINĂ DETECTAT: 4111
COLOANE (12 total, 0 numerice):
[0] Cont
[1] Denumirea contului
[2..11] (all empty)
RÂNDURI: 4111 CLIENTI 0.00 0.00 0.00 0.00 5 850 344.52 5 650 175.52 ...
{
"ytd_debit": { "formula": "ytd_source", "ytd_source": { "header": "", "index": 6 } },
"ytd_credit": { "formula": "ytd_source", "ytd_source": { "header": "", "index": 7 } }
}
All headers empty; only the row data reveals column positions. The model identifies columns 6 and 7 by counting numeric cells in the data rows, not by header text. This is the layout where header-matching general models tend to fabricate plausible-looking but invented column names.
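To make the index arithmetic concrete, here is a rough sketch that maps the numeric cells of the visible part of the row above to column indices. It assumes the first two columns are `Cont` and `Denumirea contului` (so numeric cells start at index 2) and that amounts use a space as thousands separator with two decimals; `AMOUNT` and `numeric_cells` are hypothetical names, and this regex approach is an illustration, not the model's mechanism.

```python
import re

# Amounts like "0.00" or "5 850 344.52": 1-3 digits, optional space-separated
# thousands groups, then exactly two decimals. Ambiguous if a bare integer
# sits directly before an amount, which does not happen in this row.
AMOUNT = re.compile(r"(?<!\d)\d{1,3}(?: \d{3})*\.\d{2}")

def numeric_cells(row_text: str, first_numeric_index: int) -> dict[int, float]:
    """Map column index -> parsed amount for every amount found in the row."""
    amounts = AMOUNT.findall(row_text)
    return {
        first_numeric_index + i: float(a.replace(" ", ""))
        for i, a in enumerate(amounts)
    }

# Visible prefix of the data row from the example (the elided tail is left out).
row = "4111 CLIENTI 0.00 0.00 0.00 0.00 5 850 344.52 5 650 175.52"
cells = numeric_cells(row, first_numeric_index=2)
```

Under these assumptions the two non-zero amounts land on indices 6 and 7, which matches the indices in the recipe above.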
Pattern D: Bughero, layout with Total rulaje instead of Rulaje perioada
Input:
PERIOADĂ DETECTATĂ: 01.01.2025 -- 31.12.2025
CONT RĂDĂCINĂ DETECTAT: 4111
COLOANE:
[4] Rulaje perioada Debitoare <- single period, NOT YTD here
[5] Rulaje perioada Creditoare
[6] Total rulaje an Debitoare <- this is YTD
[7] Total rulaje an Creditoare
[8] Sume totale Debit <- trap
{
"ytd_debit": { "formula": "ytd_source", "ytd_source": { "header": "Total rulaje an Debitoare", "index": 6 } },
"ytd_credit": { "formula": "ytd_source", "ytd_source": { "header": "Total rulaje an Creditoare", "index": 7 } }
}
When Total rulaje is present in the same fișă as Rulaje perioada, Total rulaje is the YTD column. The model picks 6/7 over the more visually familiar 4/5.
Pattern E: Skyline, partner-type column present
Input:
PERIOADĂ DETECTATĂ: Ianuarie 2025 - August 2025
CONT RĂDĂCINĂ DETECTAT: 411
COLOANE (7 total):
[2] Tip <- partner type (Persoană fizică / juridică)
[4] Rulaj curent | debit
[5] Rulaj curent | credit
{
"ytd_debit": { "formula": "ytd_source", "ytd_source": { "header": "Rulaj curent debit", "index": 4 } },
"ytd_credit": { "formula": "ytd_source", "ytd_source": { "header": "Rulaj curent credit", "index": 5 } },
"tip_column": { "header": "Tip", "index": 2 }
}
When a partner-type column exists, the model emits tip_column so the post-processor can join on partner type for VAT-split reporting.
Pattern F: Algorithm, single month (Caz 1)
Input:
PERIOADĂ DETECTATĂ: Perioada 9 <- single month, not 01.01-...
CONT RĂDĂCINĂ DETECTAT: 4111
COLOANE (15 total):
[4] Sold initial | Debit
[5] Sold initial | Credit
[6] Perioada precedente | Debit <- F5 addend1
[7] Perioada precedente | Credit
[8] Perioada | Debit <- F5 addend2
[9] Perioada | Credit
[10] Total sume | Debit <- F4 source
[11] Total sume | Credit
{
"ytd_debit": {
"formula": "ytd_source - sold_initial",
"ytd_source": { "header": "Total sume Debit", "index": 10 },
"sold_initial": { "header": "Sold initial Debit", "index": 4 }
},
"ytd_credit": {
"formula": "ytd_source - sold_initial",
"ytd_source": { "header": "Total sume Credit", "index": 11 },
"sold_initial": { "header": "Sold initial Credit", "index": 5 }
}
}
Perioada 9 means "month 9", not "from Jan to month 9". This is Caz 1: the YTD cannot be read directly. The model emits the subtractive recipe (F4): Total sume - Sold initial. The composed recipe (F5) would yield the same number: Perioada precedente + Perioada.
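The equivalence of the two recipes follows if the cumulative column is the opening balance plus all movements, which is the relationship this example implies. A short check with made-up amounts (the figures are hypothetical, not taken from a real fișă):

```python
from decimal import Decimal

# Hypothetical amounts for one partner row.
sold_initial = Decimal("1000.00")
perioada_precedente = Decimal("7500.00")
perioada = Decimal("900.00")

# Assumed relationship: cumulative total = opening balance + prior periods + current period.
total_sume = sold_initial + perioada_precedente + perioada

f4 = total_sume - sold_initial        # subtractive recipe (F4)
f5 = perioada_precedente + perioada   # composed recipe (F5)
```

Both expressions cancel the opening balance and leave only the movements since January, so they agree whenever both column families are present and consistent.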
Not applicable: non-receivable/payable account
Input:
PERIOADĂ DETECTATĂ: 01.01.2025 -- 31.12.2025
CONT RĂDĂCINĂ DETECTAT: 5121 <- bank account, not 411/401
{ "not_applicable": true, "reason": "cont rădăcină 5121 nu este 411x sau 401x" }
When the account root is not a receivable (411x) or payable (401x), the model refuses with a structured reason rather than fabricating a YTD recipe.
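The account-root rule is simple enough to mirror in a downstream guard. A minimal sketch, assuming the root arrives as a string; `side_for_account_root` is a hypothetical helper name:

```python
from typing import Optional

def side_for_account_root(root: str) -> Optional[str]:
    """Return "debit" for 411x, "credit" for 401x, and None for any other root."""
    root = root.strip()
    if root.startswith("411"):
        return "debit"
    if root.startswith("401"):
        return "credit"
    return None  # corresponds to the model's not_applicable answer

sides = {r: side_for_account_root(r) for r in ("4111", "411", "401", "5121")}
```

A check like this could be run next to the model output to catch a recipe emitted for an account that should have been refused.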
Decision rules (what the model learned)
| input signal | output |
|---|---|
| Period starts 01.01 (full year or Jan to month N) | Caz 2: direct `ytd_source` from YTD column |
| Single month / "Perioada N", with `Total rulaje` column | Caz 1: direct `ytd_source` from `Total rulaje` |
| Single month, with `Perioada precedentă` + `Perioada` | Caz 1: `addend1 + addend2` |
| Single month, with `Sume totale` / `Rulaj+SI` | Caz 1: `ytd_source - sold_initial` |
| Cont rădăcină starts 411 | side = Debit |
| Cont rădăcină starts 401 | side = Credit |
| Cont rădăcină 404 / 461 / 5121 / 4426 / etc. | not_applicable |
Trap columns that the model never picks as ytd_source directly: Sume totale, Total sume, Rulaj+SI, Sume precedente (all include opening balance).
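Because the trap list is fixed, a post-hoc guard can flag any recipe that reads a trap column directly. The sketch below assumes header matching by case-insensitive prefix, which is a simplification; `TRAP_PREFIXES` and `direct_read_uses_trap` are hypothetical names.

```python
# Trap column names from the list above, lower-cased for prefix matching.
TRAP_PREFIXES = ("sume totale", "total sume", "rulaj+si", "sume precedente")

def direct_read_uses_trap(side_recipe: dict) -> bool:
    """True when a direct-read recipe points at a trap column."""
    if side_recipe.get("formula") != "ytd_source":
        # Subtractive and composed recipes may legitimately reference these columns.
        return False
    header = side_recipe["ytd_source"]["header"].strip().lower()
    return header.startswith(TRAP_PREFIXES)

good = {"formula": "ytd_source", "ytd_source": {"header": "Rulaje perioada Dt", "index": 4}}
bad = {"formula": "ytd_source", "ytd_source": {"header": "Sume totale Dt", "index": 6}}
subtractive = {
    "formula": "ytd_source - sold_initial",
    "ytd_source": {"header": "Total sume Debit", "index": 10},
    "sold_initial": {"header": "Sold initial Debit", "index": 4},
}
flags = [direct_read_uses_trap(r) for r in (good, bad, subtractive)]
```

Headerless recipes (`"header": ""`) pass this check trivially, so it complements rather than replaces an index-based review.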
Output schema
| field | type | content |
|---|---|---|
| `ytd_debit.formula` | str | one of `ytd_source`, `ytd_source - sold_initial`, `addend1 + addend2` |
| `ytd_debit.ytd_source` | `{header, index}` | always present |
| `ytd_debit.sold_initial` | `{header, index}` | only for subtractive formula |
| `ytd_debit.addend1`, `addend2` | `{header, index}` | only for composed formula |
| `ytd_credit` | same shape as `ytd_debit` | always present (or `not_applicable`) |
| `tip_column` | `{header, index}` | optional, only when partner-type column exists |
| `not_applicable` | `true` | when account root is not 411x or 401x |
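A consumer may want to check the shape of a recipe before applying it. The following sketch reads the table above literally (including `ytd_source` being required for every formula); if real composed recipes omit `ytd_source`, that requirement would need relaxing. `validate_recipe` and its helpers are hypothetical names, not part of any published tooling.

```python
# Extra {header, index} references each formula needs besides ytd_source.
EXTRA_REFS = {
    "ytd_source": (),
    "ytd_source - sold_initial": ("sold_initial",),
    "addend1 + addend2": ("addend1", "addend2"),
}

def _is_ref(value) -> bool:
    """A column reference is a dict with a string header and an integer index."""
    return (
        isinstance(value, dict)
        and isinstance(value.get("header"), str)
        and isinstance(value.get("index"), int)
    )

def validate_recipe(recipe: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape looks valid."""
    if recipe.get("not_applicable") is True:
        return []
    errors = []
    for side in ("ytd_debit", "ytd_credit"):
        part = recipe.get(side)
        if not isinstance(part, dict):
            errors.append(f"{side}: missing")
            continue
        formula = part.get("formula")
        if formula not in EXTRA_REFS:
            errors.append(f"{side}: unknown formula {formula!r}")
            continue
        for key in ("ytd_source",) + EXTRA_REFS[formula]:
            if not _is_ref(part.get(key)):
                errors.append(f"{side}.{key}: expected {{header, index}}")
    if "tip_column" in recipe and not _is_ref(recipe["tip_column"]):
        errors.append("tip_column: expected {header, index}")
    return errors

ok = validate_recipe({
    "ytd_debit": {"formula": "ytd_source", "ytd_source": {"header": "", "index": 6}},
    "ytd_credit": {"formula": "ytd_source", "ytd_source": {"header": "", "index": 7}},
})
broken = validate_recipe({
    "ytd_debit": {"formula": "ytd_source - sold_initial", "ytd_source": {"header": "", "index": 10}},
})
```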
Quick start
transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset
import torch

tok = AutoTokenizer.from_pretrained("surogate/Qwen3.5-2B-Libra-YTD", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "surogate/Qwen3.5-2B-Libra-YTD",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# System prompt comes from the dataset's `instruction` field; load any sample to use it.
SYSTEM = load_dataset("surogate/ytd-dataset", split="train[:1]")[0]["instruction"]

# Raw extracted fișă text (PDF copy, Excel paste, OCR output, ...).
with open("my_fisa.txt", encoding="utf-8") as f:
    user_text = f.read()

messages = [{"role": "user", "content": f"{SYSTEM}\n{user_text}"}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=False)
inputs = tok(prompt, return_tensors="pt").to(model.device)

# Greedy decoding keeps the JSON recipe deterministic.
with torch.inference_mode():
    out = model.generate(**inputs, max_new_tokens=1024, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
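The decoded text is a string, so a consumer still has to turn it into a Python object. A minimal sketch, assuming the output contains exactly one JSON object and possibly some surrounding whitespace or stray text; `extract_recipe` is a hypothetical helper name.

```python
import json

def extract_recipe(generated: str) -> dict:
    """Parse the outermost JSON object found in the generated text."""
    start, end = generated.find("{"), generated.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    return json.loads(generated[start:end + 1])

sample = '\n{ "not_applicable": true, "reason": "cont rădăcină 5121 nu este 411x sau 401x" }\n'
recipe = extract_recipe(sample)
```

`json.loads` raises `json.JSONDecodeError` on malformed output, which is a reasonable signal to retry or route the fișă to manual review.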
vLLM
vllm serve surogate/Qwen3.5-2B-Libra-YTD \
  --max-model-len 4096 \
  --gpu-memory-utilization 0.85 \
  --language-model-only

curl -s http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "surogate/Qwen3.5-2B-Libra-YTD",
    "messages": [{"role": "user", "content": "<system prompt>\n<fișă text>"}],
    "temperature": 0,
    "max_tokens": 1024
  }'
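The same request can be issued from Python. This sketch uses only the standard library and mirrors the curl payload above; `build_payload` and `request_recipe` are hypothetical helper names, and the default `base_url` assumes the server started by the `vllm serve` command is listening on localhost port 8000.

```python
import json
import urllib.request

MODEL = "surogate/Qwen3.5-2B-Libra-YTD"

def build_payload(system_prompt: str, fisa_text: str, model: str = MODEL) -> dict:
    """Build the chat-completions body: system prompt and fișă in one user turn."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": f"{system_prompt}\n{fisa_text}"}],
        "temperature": 0,
        "max_tokens": 1024,
    }

def request_recipe(system_prompt: str, fisa_text: str,
                   base_url: str = "http://localhost:8000") -> str:
    """POST to the OpenAI-compatible endpoint and return the raw model text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(system_prompt, fisa_text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

payload = build_payload("<system prompt>", "<fișă text>")
```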
Training details
| field | value |
|---|---|
| base model | Qwen/Qwen3.5-2B-Base |
| method | LoRA SFT, merged into base for shipping |
| steps | 300 |
| LR | 5e-5 cosine, warmup ratio 0.05 |
| recipe | fp8-hybrid |
| batch | 4 × grad-accum 2, sequence_len 2048 |
| LoRA rank / alpha / dropout | 32 / 64 / 0.15 |
| LoRA target modules | q_proj, k_proj, v_proj, o_proj, in_proj_qkv, in_proj_a, in_proj_b, in_proj_z, out_proj, gate_proj, up_proj, down_proj |
| dataset | surogate/ytd-dataset (3620 train + 320 val) |
| framework | surogate sft |
The Qwen3.5-2B base uses a hybrid Gated DeltaNet + Gated Attention architecture, which has 12 distinct LoRA-targetable projection modules (not the usual 7). Training without the extra in_proj_* and out_proj targets leaves most of the model's adaptive capacity untouched.
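For readers who use Hugging Face PEFT, the hyperparameters in the table would map onto a `LoraConfig` roughly as below. This is an assumption-laden translation: the card states that training ran through `surogate sft`, not PEFT, so the `task_type` value and the idea that these module names resolve the same way under PEFT are not confirmed by the source.

```python
# Values copied from the training table; the PEFT mapping itself is an assumption.
LORA_KWARGS = dict(
    r=32,
    lora_alpha=64,
    lora_dropout=0.15,
    target_modules=[
        # attention projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        # extra projections listed for the hybrid architecture
        "in_proj_qkv", "in_proj_a", "in_proj_b", "in_proj_z", "out_proj",
        # MLP projections
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

try:
    from peft import LoraConfig  # optional dependency
    lora_config = LoraConfig(**LORA_KWARGS)
except ImportError:
    lora_config = None  # PEFT not installed; the kwargs above still document the setup
```

Listing the twelve targets explicitly, rather than relying on a library default, is what keeps the `in_proj_*` and `out_proj` modules from being silently skipped.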
Limitations
- Romanian only. The `mixed_language` training format introduces some English headers but the model is not robust to fully English fișe.
- Receivables (411x) and payables (401x) are the main focus. Other account roots (404, 461, 5121, 4426, ...) return `not_applicable` rather than a YTD recipe.
- Inputs longer than 2048 tokens were truncated during training. Very long fișe should be truncated to their first N rows at inference, or served with `max_model_len` 4096 if the fișă fits.
- The 14 real production anchors are replicated rather than augmented. Generalization to truly novel layouts relies on the synthetic 12-format mix; layouts unlike any of A through F are not guaranteed.
- The post-processor that consumes the recipe is not part of this checkpoint. Without it, the model output is a recipe, not a number.
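The first-N-rows truncation mentioned in the list above could be done as follows. This is a sketch under assumptions: `truncate_rows` is a hypothetical helper, the token counter is passed in as a callable (for example one built on the model's tokenizer), and the budget should be set below the context limit to leave room for the system prompt and the generated recipe.

```python
from typing import Callable

def truncate_rows(text: str, count_tokens: Callable[[str], int], budget: int = 2048) -> str:
    """Drop trailing lines until the text fits the token budget.

    Header lines sit at the top of a fișă, so trimming from the end keeps them.
    """
    lines = text.splitlines()
    while len(lines) > 1 and count_tokens("\n".join(lines)) > budget:
        lines.pop()
    return "\n".join(lines)

# Stand-in counter for the example: whitespace-separated words, not real tokens.
word_count = lambda s: len(s.split())
short = truncate_rows("a b c\na b c\na b c\na b c\na b c", word_count, budget=7)
```

Since the recipe names columns rather than values, dropping later data rows should not change it as long as the header block and a few representative rows survive; that expectation is an inference from the card, not a tested guarantee.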
License
Apache 2.0. Inherits from Qwen/Qwen3.5-2B-Base. Synthetic training data plus 14 anonymized real-fișă layout anchors.
Citation
@misc{qwen35-2b-libra-ytd,
  title  = {Qwen3.5-2B-Libra-YTD: Romanian fișă analitică YTD recipe extractor},
  author = {Surogate},
  year   = {2026},
  url    = {https://huggingface.co/surogate/Qwen3.5-2B-Libra-YTD}
}