Run diff-apply on Synthetic!

diff-apply is a small, fast model that applies search-and-replace style diffs to code. Many coding agents use search-and-replace as their default, first-line edit format; however, for larger diffs, even fairly good coding models often produce slightly inaccurate search strings, e.g. with spacing slightly off or other minor differences.
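
For illustration, here's a hypothetical example of that failure mode: the search string is off by only whitespace, so an exact-match apply finds nothing.

```python
# Hypothetical illustration of the failure mode: the search string is off by
# only whitespace, so an exact-match apply finds nothing.
file_contents = "def add(a, b):\n    return a + b\n"

# The model indented the return line with a tab instead of four spaces.
search = "def add(a, b):\n\treturn a + b"
replace = "def add(a, b):\n    return int(a) + int(b)"

print(search in file_contents)  # False: the edit can't be applied verbatim
```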

Typically, coding agents will do the following to handle those errors:

  1. Hardcode heuristics that attempt to apply certain classes of incorrect diffs (a minimal sketch of one such heuristic follows this list).
  2. If that fails, retry the original model, which is expensive and slow, and can also confuse the original model about the current file state.
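
The kind of heuristic in step 1 might look like the following sketch (a generic whitespace-insensitive matcher, not any particular agent's actual implementation):

```python
# A minimal sketch of a hardcoded fallback heuristic (step 1 above): match the
# search block line by line, ignoring leading/trailing whitespace, then splice
# in the replacement lines.
def apply_fuzzy(file_contents: str, search: str, replace: str) -> str | None:
    file_lines = file_contents.splitlines()
    target = [line.strip() for line in search.splitlines()]
    for start in range(len(file_lines) - len(target) + 1):
        window = file_lines[start:start + len(target)]
        if [line.strip() for line in window] == target:
            patched = (file_lines[:start]
                       + replace.splitlines()
                       + file_lines[start + len(target):])
            return "\n".join(patched) + "\n"
    return None  # give up and fall back to step 2 (retrying the model)
```

A heuristic like this only covers whitespace drift; anything subtler (a paraphrased line, a stale context line) falls through to the retry path, which is the gap diff-apply is meant to close.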

diff-apply is trained to fix those diffs far more robustly than hardcoded heuristics, letting coding agents typically skip the (expensive, slow, model-confusing) retries. It's a Llama 3.1 8B LoRA, which means you can run it cheaply on inference providers like Synthetic.
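
If you'd rather run it yourself, a minimal local-loading sketch with PEFT might look like this (assuming a standard LoRA adapter layout on the Hub; the tokenizer detail below is an assumption, not verified):

```python
# A minimal sketch of loading the adapter locally with PEFT, as an alternative
# to a hosted inference provider. Assumes the Llama 3.1 8B base weights are
# available to you.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained(
    "syntheticlab/diff-apply", torch_dtype=torch.bfloat16, device_map="auto"
)
# Assumption: the adapter repo ships tokenizer files; if not, load the
# tokenizer from the Llama 3.1 8B base model instead.
tokenizer = AutoTokenizer.from_pretrained("syntheticlab/diff-apply")
```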

The model expects prompts in the following format:

```
This edit is invalid; please fix it. The search string does not match perfectly with the file contents.
Respond only with JSON, and only with the edit JSON, not the original file.
If the edit is ambiguous, respond with null.
{
  "file": "the-underlying-file",
  "edit": {
    "type": "diff",
    "search": "the-search-string",
    "replace": "the-replace-string"
  }
}
```

It will respond with the fixed edit in JSON.
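
Putting it together, here's a sketch of the round trip through an OpenAI-compatible chat endpoint. The base URL, model name, and exact shape of the returned JSON are assumptions here, not documented values:

```python
# A round-trip sketch, assuming the model is served behind an OpenAI-compatible
# chat endpoint. The base URL and model id are placeholders.
import json
from openai import OpenAI

PROMPT_TEMPLATE = (
    "This edit is invalid; please fix it. The search string does not match "
    "perfectly with the file contents.\n"
    "Respond only with JSON, and only with the edit JSON, not the original file.\n"
    "If the edit is ambiguous, respond with null.\n"
    "{payload}"
)

def fix_edit(file_contents: str, search: str, replace: str) -> dict | None:
    payload = json.dumps(
        {"file": file_contents,
         "edit": {"type": "diff", "search": search, "replace": replace}},
        indent=2,
    )
    client = OpenAI(base_url="https://example-provider/v1", api_key="...")
    response = client.chat.completions.create(
        model="diff-apply",  # placeholder model id
        messages=[{"role": "user",
                   "content": PROMPT_TEMPLATE.format(payload=payload)}],
        temperature=0,
    )
    # Assumption: the reply is the bare edit object, or the JSON literal null.
    return json.loads(response.choices[0].message.content)

# Applying the corrected edit is then an exact string replacement:
# new_contents = file_contents.replace(fixed["search"], fixed["replace"], 1)
```

If the reply is null, the edit was ambiguous, and the agent should fall back to its normal retry path.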

Training details

Trained on a single H100 NVL for 9 hours using Axolotl.
