FastContext-1.0-4B-RL in production agent integration

#1
by ghostwithahat - opened

I tested FastContext-1.0-4B-RL as an explore_repository subagent in my Go coding agent (ahle). Served via llama.cpp on an RTX 3090, unquantized, temp=0.0, 104K ctx. Compared head-to-head against FastContext-1.0-4B-SFT on the same codebase (~450 files, ~90K LOC Go).

RL improvements over SFT:

  • 2–3× faster per loop (8–14s vs 19–26s)
  • Better citation precision overall
  • Main agent re-reads cited files ~60% less often
  • Fewer hallucinated file paths

Remaining issues (both variants):

  • <final_answer> XML tags are unreliable — the model often produces correct citation text but omits the tags. I had to add a regex fallback for lines matching /path/file.go:line-range.
  • Path hallucination reappears even at temp=0.0 (e.g. /home/ss/ahle/ vs the correct /home/ss/ai/ahle/), likely triggered by long directory listings in the system prompt.
  • Line ranges remain broad (file.go:1-280 when the answer is on lines 5–20).
  • Performance varies run-to-run: 11 tool calls in the best test, 13 in the worst. SFT averaged 29.

Bottom line: FastContext-1.0-4B-RL is a clear step up from SFT and usable with a fallback parser. The tag consistency and citation precision would benefit from more RL iterations. Thanks for releasing both variants — looking forward to future iterations.
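
A fallback parser of the kind mentioned above might look roughly like the following. It is a sketch under assumptions, not the exact code from my agent: the Citation type, the function name ParseCitations, and the precise regex are illustrative. It prefers the content of a <final_answer> block when the tags are present and otherwise scans the whole output for /path/file.go:line-range matches.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Citation is a hypothetical type for this sketch: one cited file and line range.
type Citation struct {
	Path       string
	Start, End int
}

var (
	// (?s) lets . match newlines so a multi-line answer body is captured.
	finalAnswerRe = regexp.MustCompile(`(?s)<final_answer>(.*?)</final_answer>`)
	// Absolute path ending in .go, followed by :start or :start-end.
	citationRe = regexp.MustCompile(`(/[^\s:]+\.go):(\d+)(?:-(\d+))?`)
)

// ParseCitations uses the tagged block if the model emitted one and
// falls back to scanning the entire output when the tags are missing.
func ParseCitations(output string) []Citation {
	body := output
	if m := finalAnswerRe.FindStringSubmatch(output); m != nil {
		body = m[1]
	}
	var cites []Citation
	for _, m := range citationRe.FindAllStringSubmatch(body, -1) {
		start, err := strconv.Atoi(m[2])
		if err != nil {
			continue
		}
		end := start // a bare ":42" is treated as a single-line range
		if m[3] != "" {
			if end, err = strconv.Atoi(m[3]); err != nil {
				continue
			}
		}
		if end < start {
			continue // skip inverted ranges
		}
		cites = append(cites, Citation{Path: m[1], Start: start, End: end})
	}
	return cites
}

func main() {
	// Untagged output, as the model often produces; the file name is made up.
	out := "Relevant code: /home/ss/ai/ahle/agent/loop.go:5-20"
	for _, c := range ParseCitations(out) {
		fmt.Printf("%s lines %d-%d\n", c.Path, c.Start, c.End)
	}
}
```

The regex only matches absolute paths ending in .go, which fits this codebase. Other file types would need a wider pattern.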

I'll keep the FastContext subagent active in my setup and will report back if I gather more systematic data over time.

Thanks a lot and best wishes from Lower Franconia.
