Instructions to use ericmao/linkd-dsl-qwen3-4b-lora with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use ericmao/linkd-dsl-qwen3-4b-lora with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Instruct-2507") model = PeftModel.from_pretrained(base_model, "ericmao/linkd-dsl-qwen3-4b-lora") - Notebooks
- Google Colab
- Kaggle
linkd-dsl-qwen3-4b-lora
LoRA adapter (r=32, all-linear) for Qwen/Qwen3-4B-Instruct-2507 that turns
free-form people-search queries into MongoDB find filters for the
Berkeley.profilematch collection (the linkd-search DSL).
Post-trained with SFT (3,307 execution-validated gpt-5.5-high outputs) followed by GRPO against a result-grounded gpt-oss-120b judge with hard penalties for invalid DSL, zero-result filters, and match-everyone breadth.
Held-out eval (200 queries, same judge for all systems):
| system | reward | valid DSL | DSL latency p50 |
|---|---|---|---|
| this model | 0.1507 | 99.0% | 0.76 s (RTX 4090) |
| gpt-5.5-medium | 0.1043 | 95.5% | 13.1 s |
| gpt-4o-mini (prev. prod) | 0.0556 | 74.0% | 2.2 s |
Usage
Serve with vLLM (OpenAI-compatible):
vllm serve Qwen/Qwen3-4B-Instruct-2507 \
--enable-lora --lora-modules linkd-dsl=ericmao/linkd-dsl-qwen3-4b-lora \
--max-model-len 2048 \
--speculative-config '{"method":"ngram","num_speculative_tokens":8,"prompt_lookup_max":4,"prompt_lookup_min":2}'
Then call it with the exact production prompt (see the linkd-search repo,
slm/common.py:SYSTEM_PROMPT), model="linkd-dsl", temperature=0. The
response is a raw JSON Mongo filter; run it as
collection.find(filter).limit(20).
A merged full-weights variant (no LoRA runtime needed) is published at
ericmao/linkd-dsl-qwen3-4b.
- Downloads last month
- -
Model tree for ericmao/linkd-dsl-qwen3-4b-lora
Base model
Qwen/Qwen3-4B-Instruct-2507