anonymousOwl committed
Commit 84cdb6f · verified · 1 Parent(s): 2819217

Add model card

Files changed (1)
  1. README.md +181 -0
README.md ADDED
@@ -0,0 +1,181 @@
1
+ ---
2
+ license: mit
3
+ base_model: Qwen/Qwen3-4B-Instruct-2507
4
+ language:
5
+ - en
6
+ library_name: transformers
7
+ pipeline_tag: text-generation
8
+ tags:
9
+ - hydrology
10
+ - agent
11
+ - tool-use
12
+ - grpo
13
+ - reinforcement-learning
14
+ - qwen3
15
+ - ef5
16
+ - crest
17
+ - function-calling
18
+ datasets:
19
+ - chrimerss/hydro_cali_agent_example
20
+ ---
21
+
22
+ # HydroAgent: Qwen3-4B-Instruct fine-tuned for hydrologic model calibration
23
+
24
+ **HydroAgent** is a tool-using language model that calibrates the
25
+ [EF5/CREST](https://github.com/HyDROSLab/EF5) distributed hydrologic model.
26
+ Given a USGS streamflow gage and a precipitation-driven simulation, the agent
27
+ iteratively proposes physically plausible parameter sets, runs the simulator,
28
+ inspects the resulting NSE / peak / volume metrics, and revises until the
29
+ model fits the observations.
30
+
31
+ This release is the **GRPO step-100 checkpoint** of the SFT + RL pipeline
32
+ described in [chrimerss/HydroLLM](https://github.com/chrimerss/HydroLLM).
33
+
34
+ - **Base model:** [`Qwen/Qwen3-4B-Instruct-2507`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)
35
+ - **Training:** full fine-tuning, BF16, FSDP, no LoRA
36
+ - **RL framework:** [verl 0.5](https://github.com/volcengine/verl) GRPO with [SGLang](https://github.com/sgl-project/sglang) rollouts
37
+ - **Tool format:** Hermes-style `<tool_call>` JSON (Qwen3-Instruct native)
38
+ - **Hardware:** 4× H100, ~30 min/step, K=6 rollouts per prompt, each capped at 50 assistant turns
39
+
40
+ ## How the agent works
41
+
42
+ The model has access to three tools and runs a multi-turn calibration loop:
43
+
44
+ | Tool | Purpose |
45
+ |---|---|
46
+ | `set_parameters` | Set 11 tunable CREST multipliers: `wm`, `b`, `im`, `ke`, `fc`, `under`, `leaki`, `alpha`, `beta`, `alpha0`, `iwu` |
47
+ | `run_simulation` | Execute EF5 with the current parameters and produce a hydrograph |
48
+ | `evaluate` | Score the latest run vs. observations: NSE, CC, KGE, peak ratio, lag |
49
+
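For orientation, NSE (Nash-Sutcliffe efficiency), the headline score reported by `evaluate`, is conventionally defined as one minus the ratio of squared simulation error to the variance of the observations about their mean. The sketch below is a minimal pure-Python version of that textbook definition. It is an illustration only, not the scoring code used by the EF5 tooling, and the sample series is made up.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations of obs from their mean."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observations are constant; NSE is undefined")
    return 1.0 - sse / spread


# Hypothetical hourly discharge values, for illustration only.
obs = [1.0, 3.0, 8.0, 4.0, 2.0]
perfect = nse(obs, obs)                                   # simulation equals observations
baseline = nse(obs, [sum(obs) / len(obs)] * len(obs))     # simulation equals the observed mean
print(perfect, baseline)
```

A perfect simulation scores 1, and a simulation that only reproduces the observed mean scores 0, which is why negative NSE values indicate a fit worse than the mean.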
50
+ Each rollout typically follows `set_parameters → run_simulation → evaluate → set_parameters → …`
+ until NSE plateaus or the agent runs out of turns. Inputs to the agent are a
+ short system prompt describing the calibration task and a per-gage user
+ message with watershed metadata (basin area, lat/lon, time window).
54
+
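The loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the rollout code from the HydroLLM repository: `propose`, `simulate`, and `score` are hypothetical callables standing in for the policy, the EF5 run, and the `evaluate` tool; each tool call is assumed to cost one turn; and the plateau rule (`patience`, `tol`) is invented for the example.

```python
def calibrate(propose, simulate, score, max_turns=50, patience=3, tol=1e-3):
    """Repeat set_parameters -> run_simulation -> evaluate until NSE plateaus or turns run out."""
    best_nse, best_params = float("-inf"), None
    history, stale, turns = [], 0, 0
    # Assumption: one iteration uses three turns, one per tool call.
    while turns + 3 <= max_turns and stale < patience:
        params = propose(history)        # stands in for set_parameters
        hydrograph = simulate(params)    # stands in for run_simulation
        value = score(hydrograph)        # stands in for evaluate (returns NSE)
        turns += 3
        history.append((params, value))
        if value > best_nse + tol:
            best_nse, best_params, stale = value, params, 0
        else:
            stale += 1                   # no meaningful improvement this iteration
    return best_params, best_nse, turns


# Toy stand-ins: a single multiplier `wm` whose ideal value is 1.3.
def toy_propose(history):
    return {"wm": 0.5 + 0.2 * len(history)}   # walk a fixed grid

def toy_simulate(params):
    return params["wm"]

def toy_score(hydrograph):
    return 1.0 - (hydrograph - 1.3) ** 2

best_params, best_nse, turns = calibrate(toy_propose, toy_simulate, toy_score)
print(best_params, round(best_nse, 3), turns)
```

In the real agent the proposal step is the language model itself, conditioned on the full tool-call history rather than a fixed grid.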
55
+ ## Training data
56
+
57
+ The agent is trained on **10 CONUS USGS gages** (basin areas
+ 539–2401 km²), each driven by **MRMS 1 km hourly precipitation** and
+ **hourly USGS streamflow observations** over a 60-day window selected to
+ contain a clear flood event (rising and receding limbs, edge-buffered).
61
+
62
+ | Gage ID | Basin (km²) | Lat | Lon | Window (UTC) |
+ |---|---:|---:|---:|---|
+ | 11383500 | 539 | 40.0140 | -121.9483 | 2018-05-19 → 2018-07-17 |
+ | 11043000 | 575 | 33.4798 | -117.1439 | 2019-03-15 → 2019-05-13 |
+ | 11152000 | 632 | 36.2805 | -121.3227 | 2018-05-29 → 2018-07-27 |
+ | 02294781 | 1064 | 27.8245 | -81.8017 | 2018-04-29 → 2018-06-27 |
+ | 02312000 | 1476 | 28.4800 | -82.1776 | 2018-11-15 → 2019-01-13 |
+ | 07195430 | 1489 | 36.1086 | -94.5333 | 2018-01-04 → 2018-03-04 |
+ | 11179000 | 1639 | 37.5871 | -121.9608 | 2018-06-03 → 2018-08-01 |
+ | 14301000 | 1727 | 45.7040 | -123.7554 | 2018-09-11 → 2018-11-09 |
+ | 14207500 | 1828 | 45.3507 | -122.6762 | 2018-04-09 → 2018-06-07 |
+ | 11376000 | 2401 | 40.3871 | -122.2386 | 2018-09-21 → 2018-11-19 |
74
+
75
+ **Held-out evaluation gages** (never seen during training):
76
+
77
+ | Gage ID | Basin (km²) | Lat | Lon | Window (UTC) |
+ |---|---:|---:|---:|---|
+ | 02338660 | 329 | 33.2357 | -84.9876 | 2018-07-01 → 2018-08-31 |
+ | 01403060 | 2033 | 40.5511 | -74.5483 | 2018-11-11 → 2019-01-09 |
+ | 06279500 | 40792 | 44.7585 | -108.1816 | 2018-06-13 → 2018-08-11 |
+ | 07144100 | 3209 | 37.8831 | -97.4245 | 2019-03-30 → 2019-05-28 |
83
+
84
+ The full training dataset (MRMS clips, USGS observations, basin metadata,
85
+ EF5 control template) is published as
86
+ [**chrimerss/hydro_cali_agent_example**](https://huggingface.co/datasets/chrimerss/hydro_cali_agent_example).
87
+
88
+ ## Reward
89
+
90
+ Two reward layers shape the policy:
91
+
92
+ **Per-turn (returned by tools):**
93
+
94
+ | Tool call | Reward |
+ |---|---|
+ | `set_parameters` (valid) | `+0.02` |
+ | `run_simulation` (valid) | `+0.05` |
+ | `evaluate` (valid) | `ΔNSE` (this turn − previous best) |
+ | Any tool (invalid) | `−0.5` |
100
+
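As a rough illustration, the per-turn table can be written as a small lookup function. This is a sketch, not the reward code from the training pipeline. The function name, its signature, and the sample trajectory are hypothetical. The table does not say what the first `evaluate` earns when there is no previous best, so the sketch assumes 0.0 in that case.

```python
INVALID_PENALTY = -0.5
FLAT_REWARDS = {"set_parameters": 0.02, "run_simulation": 0.05}


def per_turn_reward(tool, valid, nse=None, previous_best=None):
    """Reward for a single tool call, following the per-turn table."""
    if not valid:
        return INVALID_PENALTY
    if tool in FLAT_REWARDS:
        return FLAT_REWARDS[tool]
    if tool == "evaluate":
        # Assumption: with no previous best there is no delta, so return 0.0.
        if previous_best is None:
            return 0.0
        return nse - previous_best   # can be negative if this run is worse
    raise ValueError(f"unknown tool: {tool}")


# Hypothetical two-iteration trajectory: (tool, valid, NSE reported by evaluate).
trajectory = [
    ("set_parameters", True, None), ("run_simulation", True, None), ("evaluate", True, 0.40),
    ("set_parameters", True, None), ("run_simulation", True, None), ("evaluate", True, 0.55),
]
best, total = None, 0.0
for tool, valid, value in trajectory:
    total += per_turn_reward(tool, valid, value, best)
    if tool == "evaluate" and valid:
        best = value if best is None else max(best, value)
print(round(total, 2))
```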
101
+ **Terminal (returned at end of trajectory):**
102
+
103
+ | Component | Value |
+ |---|---|
+ | Best NSE (clipped) | `[−1, 1]` |
+ | Target-met bonus | `+0.5` if best NSE > gage target |
+ | Iteration bonus | `+0.02 × n_evaluates` |
+ | Improvement bonus | `+0.10 × max(0, n_improvements − 1)` |
+ | Empty-trajectory penalty | `−1.0` |
110
+
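The terminal table lists components but not how they combine, so the following sketch rests on several assumptions that may not match the actual pipeline: the components are summed; an "empty" trajectory is one with no successful `evaluate` call, and it receives only the penalty; and `n_improvements` counts evaluations that set a new best NSE, including the first. The function name and inputs are hypothetical.

```python
def terminal_reward(nse_history, target):
    """Terminal reward from the list of NSE values returned by evaluate, in order."""
    if not nse_history:
        return -1.0                                   # assumed: penalty replaces all other terms
    best = max(nse_history)
    reward = max(-1.0, min(1.0, best))                # best NSE clipped to [-1, 1]
    if best > target:
        reward += 0.5                                 # target-met bonus
    reward += 0.02 * len(nse_history)                 # iteration bonus
    # Assumed definition: an improvement is any evaluate that beats the running best.
    n_improvements, running = 0, float("-inf")
    for value in nse_history:
        if value > running:
            n_improvements += 1
            running = value
    reward += 0.10 * max(0, n_improvements - 1)       # improvement bonus
    return reward


# Hypothetical trajectory with four evaluations and a gage target of 0.6.
example = terminal_reward([0.2, 0.5, 0.4, 0.7], target=0.6)
print(round(example, 2))
```

For this example the terms are 0.7 (best NSE), 0.5 (target met), 0.08 (four evaluations), and 0.2 (three improvements), for a total of 1.48.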
111
+ ## GRPO settings
112
+
113
+ | Setting | Value |
114
+ |---|---|
115
+ | Algorithm | GRPO (group-relative advantages) |
116
+ | K (rollouts per prompt) | 6 |
117
+ | Train batch size | 4 prompts (24 trajectories per step) |
118
+ | Max assistant turns | 50 |
119
+ | Learning rate | 1e-6 with 5% warmup |
120
+ | Entropy coefficient | 0.01 |
121
+ | KL loss coefficient | 0.05 (anchored to base policy) |
122
+ | Sampling | `temperature=1.0`, `top_p=0.95` |
123
+ | Steps in this checkpoint | **100** |
124
+
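"Group-relative advantages" means each trajectory is scored against the other rollouts for the same prompt instead of against a learned value function. A minimal sketch of the usual normalization follows. The exact estimator in verl may differ (for example in the choice of standard deviation or epsilon), and the six rewards are invented to match K=6.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward by the mean and standard deviation of its rollout group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)          # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical total rewards for the K=6 rollouts of one gage prompt.
rewards = [1.48, 0.90, -1.00, 0.35, 1.10, 0.62]
advantages = group_relative_advantages(rewards)
print([round(a, 2) for a in advantages])
```

If every rollout in a group earns the same reward, all advantages are zero and that prompt contributes no policy gradient, which is one reason diverse sampling (`temperature=1.0`) matters here.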
125
+ ## Quick start
126
+
127
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ repo = "anonymousOwl/HydroAgent"
+ # Load the tokenizer and the BF16 weights; device_map="auto" places layers on the available devices.
+ tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="bfloat16", device_map="auto")
+ ```
134
+
135
+ The model emits Hermes-style tool calls, e.g.:
136
+
137
+ ```
+ <tool_call>
+ {"name": "set_parameters", "arguments": {"wm": 1.0, "b": 1.0, "im": 0.5, ...}}
+ </tool_call>
+ ```
142
+
143
+ Build the prompt with `tokenizer.apply_chat_template(..., tools=HYDRO_TOOLS)` so the
+ tool schemas are included, then parse the `<tool_call>` blocks from the generated text and
+ dispatch each call to your EF5 sandbox. See
+ [`modal_app/eval.py`](https://github.com/chrimerss/HydroLLM/blob/main/modal_app/eval.py)
+ for a reference SGLang loop with retry-on-parse-failure logic.
147
+
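Extracting the calls from generated text can be done with a regular expression and `json`. The sketch below is one possible parser, not the logic in `modal_app/eval.py`. The function name and sample text are hypothetical, and malformed blocks are simply skipped here, where the reference loop retries.

```python
import json
import re

# Match everything between <tool_call> and </tool_call>, across newlines.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return (name, arguments) pairs for every well-formed tool call in the text."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue                      # skip malformed JSON; a real loop might retry
        if isinstance(payload, dict) and "name" in payload:
            calls.append((payload["name"], payload.get("arguments", {})))
    return calls


sample = (
    "I will start near the default multipliers.\n"
    "<tool_call>\n"
    '{"name": "set_parameters", "arguments": {"wm": 1.0, "b": 1.0, "im": 0.5}}\n'
    "</tool_call>"
)
calls = parse_tool_calls(sample)
print(calls)
```

Each parsed pair can then be routed to whichever handler wraps the matching EF5 operation.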
148
+ For full reproduction (image, EF5 binary, multi-turn rollout, reward
149
+ computation), use the
150
+ [HydroLLM repository](https://github.com/chrimerss/HydroLLM).
151
+
152
+ ## Limitations
153
+
154
+ - Trained on **10 small/medium CONUS basins** (≤ 2401 km²) over short flood
+ windows. Generalization to large basins (> 3000 km²), arid catchments, or
+ out-of-CONUS regions is unverified.
+ - Calibrates **CREST parameter multipliers only**; it does not modify routing,
+ initial conditions, or sub-basin structure.
159
+ - The agent depends on a working EF5 toolchain; the weights alone do not
160
+ perform calibration without the simulation environment in the loop.
161
+ - This is a research checkpoint, not a production tool. NSE on held-out
162
+ gages varies substantially with basin and event.
163
+
164
+ ## License
165
+
166
+ MIT, matching the upstream [HydroLLM repository](https://github.com/chrimerss/HydroLLM).
+ The base [Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) is released under Apache 2.0.
168
+
169
+ ## Citation
170
+
171
+ ```bibtex
+ @software{hydrollm2026,
+ title = {HydroLLM: Reinforcement Learning Fine-Tuning of LLMs with Hydrologic Simulation Feedback},
+ year = {2026},
+ url = {https://github.com/chrimerss/HydroLLM}
+ }
+ ```
178
+
179
+ ## Acknowledgement
180
+
181
+ Compute for this research was sponsored by [Modal](https://modal.com).