BFCL v2 strict-mix k160 r32 LoRA adapter

LoRA adapter for Qwen/Qwen3-8B, trained for the circuit-shotting BFCL single-call experiments.

Run details:

  • dataset: data/bfcl_strict_10k_mix/train.jsonl
  • mask attribution: runs/issue49_bfcl_single_call/relp_full.npz
  • mask size: 160000 MLP channels
  • LoRA rank: 32
  • LoRA alpha: 64
  • max sequence length: 1024
  • training output: runs/issue49_bfcl_single_call/lora_bfcl10k_topk160000_r32_balanced_len1024_ckpt
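As a quick sanity check on the rank and alpha values above, the effective LoRA scaling can be derived from the adapter config. This is a minimal sketch, assuming the standard PEFT keys `r`, `lora_alpha`, and `use_rslora` appear in adapter_config.json; the helper name `lora_scaling` is hypothetical and not part of this repository.

```python
import json
import math


def lora_scaling(config: dict) -> float:
    """Return the multiplier applied to the LoRA update B @ A.

    Standard LoRA scales by lora_alpha / r. With rank-stabilized LoRA
    (use_rslora=True) the scale is lora_alpha / sqrt(r) instead.
    """
    r = config["r"]
    alpha = config["lora_alpha"]
    if config.get("use_rslora", False):
        return alpha / math.sqrt(r)
    return alpha / r


def lora_scaling_from_file(path: str) -> float:
    """Read a PEFT adapter_config.json and compute its scaling factor."""
    with open(path, "r", encoding="utf-8") as f:
        return lora_scaling(json.load(f))


# Values listed in the run details: rank 32, alpha 64.
print(lora_scaling({"r": 32, "lora_alpha": 64}))
```

With rank 32 and alpha 64, standard scaling gives a factor of 2.0, assuming rank-stabilized LoRA was not enabled for this run.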

Evaluation on the 1007-row BFCL single-turn/single-call slice, using a canonicalization prompt and normalized exact matching:

| condition | score |
| --- | --- |
| unmasked merged model | 683/1007 = 67.83% |
| k160 masked merged model | 418/1007 = 41.51% |
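The percentages can be reproduced from the raw counts, along with the gap between the two conditions. The sketch below only restates the arithmetic from the reported numbers; the function name `accuracy_pct` is illustrative.

```python
def accuracy_pct(correct: int, total: int) -> float:
    """Exact-match accuracy as a percentage, rounded to two decimals."""
    return round(100.0 * correct / total, 2)


TOTAL = 1007            # rows in the single-turn/single-call slice
UNMASKED_CORRECT = 683  # unmasked merged model
MASKED_CORRECT = 418    # k160 masked merged model

unmasked = accuracy_pct(UNMASKED_CORRECT, TOTAL)
masked = accuracy_pct(MASKED_CORRECT, TOTAL)
# Gap in percentage points, computed from the counts to avoid rounding drift.
gap = accuracy_pct(UNMASKED_CORRECT - MASKED_CORRECT, TOTAL)

print(unmasked, masked, gap)  # 67.83 41.51 26.32
```

Masking therefore costs 265 of the 1007 rows, about 26.32 percentage points, on this slice.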

Included files:

  • adapter_model.safetensors: PEFT LoRA adapter weights
  • adapter_config.json: PEFT adapter config
  • tokenizer.json, tokenizer_config.json, chat_template.jinja: tokenizer artifacts
  • train_summary.json: training summary
  • eval_masked_summary.json: k160 masked eval summary
  • eval_unmasked_summary.json: unmasked eval summary
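The files above follow the usual PEFT adapter layout, so the adapter can presumably be loaded on top of the base model and merged, which is what the "merged model" conditions in the eval suggest. The following is a hedged sketch, not the author's evaluation code: it assumes `transformers`, `peft`, and `torch` are installed, and that the tokenizer artifacts in this repository load through `AutoTokenizer`. It does not apply the k160 channel mask.

```python
BASE_MODEL = "Qwen/Qwen3-8B"
ADAPTER_REPO = "Occupying-Mars/issue49-k160-r32-len1024-adapter"


def load_merged_model(base_model: str = BASE_MODEL, adapter: str = ADAPTER_REPO):
    """Load the base model, attach the LoRA adapter, and merge the weights.

    Imports are kept inside the function so that this snippet can be
    imported without pulling in the heavy dependencies.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(adapter)
    base = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto")
    model = PeftModel.from_pretrained(base, adapter)
    # Fold the LoRA deltas into the base weights and drop the PEFT wrapper.
    merged = model.merge_and_unload()
    merged.eval()
    return merged, tokenizer
```

Calling `model, tokenizer = load_merged_model()` downloads the 8B base model, so it needs enough disk space and memory for the full checkpoint.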