Commit 279d788 by md896 · 1 parent: 3d1b780

README: add model card highlights section and metadata snapshot.

  1. README.md +13 -0
README.md CHANGED
@@ -30,6 +30,19 @@ Deterministic OpenEnv benchmark for real SQL debugging workflows. This project e
  - GitHub: [https://github.com/mdayan8/sql-debug-env](https://github.com/mdayan8/sql-debug-env)
  - W&B dashboard: [https://wandb.ai/mdayanbag-pesitm/sql-debug-grpo-best-budget/workspace?nw=nwusermdayanbag](https://wandb.ai/mdayanbag-pesitm/sql-debug-grpo-best-budget/workspace?nw=nwusermdayanbag)
 
+ ## Model Card Highlights
+
+ Model: [md896/sql-debug-agent-qwen25-05b-grpo-wandb-continue-v2](https://huggingface.co/md896/sql-debug-agent-qwen25-05b-grpo-wandb-continue-v2)
+
+ | Field | Value |
+ |---|---|
+ | Task | Text generation (SQL repair style prompts) |
+ | Libraries | Transformers, TRL (GRPO), Safetensors, TGI-compatible |
+ | Family tags | qwen2, grpo, conversational, text-generation-inference |
+ | Base tracks used in workflow | Qwen2.5-Coder 0.5B bridge + Qwen2.5-Coder 7B benchmark/eval track |
+ | Training signal | Execution-grounded reward from OpenEnv SQL tasks |
+ | Reference | arXiv:1910.09700 (as listed in model metadata) |
+
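The "Training signal" row mentions an execution-grounded reward. As a rough illustration of that idea only, and not the repository's actual reward function, the sketch below scores a candidate query by running it against an in-memory SQLite database and comparing its rows with those of a reference query. The function name `execution_reward`, the `orders` schema, and the binary 0/1 scoring are all hypothetical choices made for this example.

```python
import sqlite3


def execution_reward(candidate_sql: str, reference_sql: str, setup_sql: str) -> float:
    """Hypothetical reward: 1.0 if the candidate returns the same rows as the
    reference query (ignoring row order), 0.0 if it errors or differs."""
    conn = sqlite3.connect(":memory:")
    try:
        # Build the fixture schema and data for this task.
        conn.executescript(setup_sql)
        expected = conn.execute(reference_sql).fetchall()
        try:
            actual = conn.execute(candidate_sql).fetchall()
        except (sqlite3.Error, sqlite3.Warning):
            # Queries that fail to parse or execute earn no reward.
            return 0.0
        # Sort by repr so rows containing NULLs or mixed types stay comparable.
        same = sorted(actual, key=repr) == sorted(expected, key=repr)
        return 1.0 if same else 0.0
    finally:
        conn.close()


# Assumed toy fixture, not taken from the benchmark.
SETUP = """
CREATE TABLE orders (id INTEGER, amount REAL);
INSERT INTO orders VALUES (1, 10.0), (2, 25.5);
"""
REFERENCE = "SELECT id FROM orders WHERE amount > 20"

if __name__ == "__main__":
    # A broken query (wrong table name) and a repaired one.
    print(execution_reward("SELECT id FROM order WHERE amount > 20", REFERENCE, SETUP))
    print(execution_reward("SELECT id FROM orders WHERE amount > 20.0", REFERENCE, SETUP))
```

A reward of this shape depends only on what the query does when executed, not on how closely its text matches a reference, which is presumably what "execution-grounded" refers to here.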
  ## Problem and Motivation
 
  SQL debugging is expensive, repetitive, and operationally risky: