FastContext-1.0-4B-SFT in production agent integration

#1
by ghostwithahat - opened

Great idea! Thank you!

I integrated FastContext-1.0-4B-SFT as an explore_repository subagent into a production Go-based coding agent (ahle). The model is served unquantized via llama.cpp on an RTX 3090 (104K context, 80 GPU layers, temperature 0.0). Findings after ~30 real-world runs:

What works:

  • Tool selection is good — the model prefers grep first, then targeted reads
  • With a directory listing in the system prompt (as in system.md), path hallucination drops to near zero
  • The model finds the right files roughly 60% of the time

What doesn't:

  • <final_answer> tags are inconsistent — the model often writes correct citation text but omits the XML wrapper. I had to add a regex fallback to extract bare /path/file.go:42-58 (reason) lines.
  • Line ranges are too broad (file.go:1-500) even when the answer spans 20 lines. The main agent (DeepSeek v4) re-reads cited files manually because it cannot trust coarse ranges. Net token savings: ~0%.
  • A "last turn" reminder system message helped, but only partially.
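
For anyone who hits the same two problems, here is a rough sketch of the kind of fallback described above. It is not the actual ahle code. It assumes citations look like /path/file.go:42-58 (reason), one per line, and that a <final_answer> block should win when it is present. The names Citation, ParseCitations and Span are made up for illustration, and the sample paths in main are invented.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Citation is one "path:start-end (reason)" reference (hypothetical type).
type Citation struct {
	Path   string
	Start  int
	End    int
	Reason string
}

// Span returns the number of cited lines, so the caller can reject
// coarse ranges such as file.go:1-500 against its own threshold.
func (c Citation) Span() int { return c.End - c.Start + 1 }

// finalAnswerRe captures the body of a <final_answer> block, if any.
var finalAnswerRe = regexp.MustCompile(`(?s)<final_answer>(.*?)</final_answer>`)

// citationRe matches one bare citation line. Assumed shape: optional bullet,
// absolute path, start line, optional "-end", optional "(reason)".
var citationRe = regexp.MustCompile(
	`(?m)^[ \t]*(?:[-*•][ \t]*)?(/[^\s:]+):(\d+)(?:-(\d+))?(?:[ \t]*\((.*)\))?[ \t]*$`)

// ParseCitations prefers the tagged block and falls back to scanning
// the whole output when the XML wrapper is missing.
func ParseCitations(output string) []Citation {
	body := output
	if m := finalAnswerRe.FindStringSubmatch(output); m != nil {
		body = m[1]
	}
	var out []Citation
	for _, m := range citationRe.FindAllStringSubmatch(body, -1) {
		start, err := strconv.Atoi(m[2])
		if err != nil {
			continue
		}
		end := start // a single line number means a one-line range
		if m[3] != "" {
			if end, err = strconv.Atoi(m[3]); err != nil {
				continue
			}
		}
		if end < start {
			continue // skip inverted ranges instead of guessing
		}
		out = append(out, Citation{
			Path: m[1], Start: start, End: end,
			Reason: strings.TrimSpace(m[4]),
		})
	}
	return out
}

func main() {
	// Untagged output, as the model often produces it (sample paths are invented).
	raw := "Relevant code:\n" +
		"/srv/app/retry.go:42-58 (backoff loop)\n" +
		"/srv/app/client.go:1-500 (HTTP client)\n"
	for _, c := range ParseCitations(raw) {
		fmt.Printf("%s:%d-%d span=%d reason=%q\n",
			c.Path, c.Start, c.End, c.Span(), c.Reason)
	}
}
```

Checking Span() against a threshold of your choosing is one way to let the main agent decide whether to trust a citation or re-read the file. The right cutoff depends on the repository, so none is hard-coded here.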

I'm testing the 4B-RL variant next, hoping the format penalties and line-level F1 reward produce tighter citations. Happy to share comparison results.

Microsoft org

Thanks for the feedback.
Since this basic version of the model (4B) was only fine-tuned (SFT) on a dataset of 3k trajectories, file-path hallucinations are unfortunately expected. A much more robust version will be released soon.
If you can share your specific bad cases (repos) here, it would really help us improve.
