linkd-dsl-qwen3-4b-lora

LoRA adapter (r=32, all-linear) for Qwen/Qwen3-4B-Instruct-2507 that turns free-form people-search queries into MongoDB find filters for the Berkeley.profilematch collection (the linkd-search DSL).

Post-trained with SFT (3,307 execution-validated gpt-5.5-high outputs) followed by GRPO against a result-grounded gpt-oss-120b judge with hard penalties for invalid DSL, zero-result filters, and match-everyone breadth.

Held-out eval (200 queries, same judge for all systems):

system	reward	valid DSL	DSL latency p50
this model	0.1507	99.0%	0.76 s (RTX 4090)
gpt-5.5-medium	0.1043	95.5%	13.1 s
gpt-4o-mini (prev. prod)	0.0556	74.0%	2.2 s

Usage

Serve with vLLM (OpenAI-compatible):

vllm serve Qwen/Qwen3-4B-Instruct-2507 \
  --enable-lora --lora-modules linkd-dsl=ericmao/linkd-dsl-qwen3-4b-lora \
  --max-model-len 2048 \
  --speculative-config '{"method":"ngram","num_speculative_tokens":8,"prompt_lookup_max":4,"prompt_lookup_min":2}'

Then call it with the exact production prompt (see the linkd-search repo, slm/common.py:SYSTEM_PROMPT), model="linkd-dsl", temperature=0. The response is a raw JSON Mongo filter; run it as collection.find(filter).limit(20).

A merged full-weights variant (no LoRA runtime needed) is published at ericmao/linkd-dsl-qwen3-4b.

Downloads last month: -

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ericmao/linkd-dsl-qwen3-4b-lora

Base model

Qwen/Qwen3-4B-Instruct-2507

Adapter

(5527)

this model