anonymousOwl committed 8 days ago

Commit 84cdb6f (verified) · 1 parent: 2819217

Add model card

Files changed (1)

README.md +181 -0

README.md ADDED

	@@ -0,0 +1,181 @@

+---
+license: mit
+base_model: Qwen/Qwen3-4B-Instruct-2507
+language:
+- en
+library_name: transformers
+pipeline_tag: text-generation
+tags:
+- hydrology
+- agent
+- tool-use
+- grpo
+- reinforcement-learning
+- qwen3
+- ef5
+- crest
+- function-calling
+datasets:
+- chrimerss/hydro_cali_agent_example
+---
+# HydroAgent — Qwen3-4B-Instruct fine-tuned for hydrologic model calibration
+**HydroAgent** is a tool-using language model that calibrates the
+[EF5/CREST](https://github.com/HyDROSLab/EF5) distributed hydrologic model.
+Given a USGS streamflow gage and a precipitation-driven simulation, the agent
+iteratively proposes physically plausible parameter sets, runs the simulator,
+inspects the resulting NSE / peak / volume metrics, and revises until the
+model fits the observations.
+This release is the **GRPO step-100 checkpoint** of the SFT + RL pipeline
+described in [chrimerss/HydroLLM](https://github.com/chrimerss/HydroLLM).
+- **Base model:** [`Qwen/Qwen3-4B-Instruct-2507`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)
+- **Training:** full fine-tuning, BF16, FSDP, no LoRA
+- **RL framework:** [verl 0.5](https://github.com/volcengine/verl) GRPO with [SGLang](https://github.com/sgl-project/sglang) rollouts
+- **Tool format:** Hermes-style `<tool_call>` JSON (Qwen3-Instruct native)
+- **Hardware:** 4× H100, ~30 min/step, K=6 rollouts × max 50 multi-turn calls
+## How the agent works
+The model has access to three tools and runs a multi-turn calibration loop:
+| Tool | Purpose |
+|---|---|
+| `set_parameters` | Set 11 tunable CREST multipliers: `wm`, `b`, `im`, `ke`, `fc`, `under`, `leaki`, `alpha`, `beta`, `alpha0`, `iwu` |
+| `run_simulation` | Execute EF5 with the current parameters and produce a hydrograph |
+| `evaluate` | Score the latest run vs. observations: NSE, CC, KGE, peak ratio, lag |
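The scores listed for `evaluate` have standard textbook definitions. The sketch below computes NSE, CC (Pearson correlation), and the 2009 form of KGE from paired observed and simulated series using only the standard library. The function names and sample values are illustrative assumptions; the EF5 tool may compute or report these metrics differently.

```python
import math
from statistics import fmean, pstdev


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations of obs."""
    mean_obs = fmean(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def pearson_cc(obs, sim):
    """Pearson correlation coefficient between obs and sim."""
    mean_obs, mean_sim = fmean(obs), fmean(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / math.sqrt(var_obs * var_sim)


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    r = pearson_cc(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)  # variability ratio
    beta = fmean(sim) / fmean(obs)     # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Made-up hourly discharge values, for illustration only.
obs = [1.0, 2.0, 4.0, 8.0, 5.0, 3.0]
sim = [1.5, 2.5, 3.5, 7.0, 5.5, 2.5]
print(f"NSE={nse(obs, sim):.3f} CC={pearson_cc(obs, sim):.3f} KGE={kge(obs, sim):.3f}")
```

A perfect simulation scores 1.0 on all three metrics, and a simulation equal to the observed mean scores an NSE of 0.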
+Each rollout typically follows: `set_parameters → run_simulation → evaluate → set_parameters → …`
+until NSE plateaus or the agent runs out of turns. Inputs to the agent are a
+short system prompt describing the calibration task and a per-gage user
+message with watershed metadata (basin area, lat/lon, time window).
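The loop described above can be sketched without the model or EF5. Everything in this example is a hypothetical stand-in: `ToyEnvironment` replaces the EF5 sandbox with a made-up score, `random_perturbation` replaces the language model's proposals, and the patience-based plateau test and the choice to count each tool call as one turn are assumptions rather than details taken from the HydroLLM code.

```python
import random

PARAM_NAMES = ["wm", "b", "im", "ke", "fc", "under",
               "leaki", "alpha", "beta", "alpha0", "iwu"]


class ToyEnvironment:
    """Hypothetical stand-in for the EF5 sandbox exposing the three tool names."""

    def __init__(self):
        self.params = None
        self.has_run = False

    def set_parameters(self, **params):
        missing = set(PARAM_NAMES) - set(params)
        if missing:
            raise ValueError(f"missing multipliers: {sorted(missing)}")
        self.params = dict(params)
        self.has_run = False

    def run_simulation(self):
        if self.params is None:
            raise RuntimeError("set_parameters must be called first")
        self.has_run = True

    def evaluate(self):
        if not self.has_run:
            raise RuntimeError("run_simulation must be called first")
        # Toy score (not a real NSE): reaches 1.0 when every multiplier is 1.0.
        error = sum((v - 1.0) ** 2 for v in self.params.values()) / len(self.params)
        return {"nse": 1.0 - error}


def calibrate(env, propose, max_turns=50, patience=5):
    """Repeat set -> run -> evaluate until the score plateaus or turns run out."""
    best_nse, best_params, stale, turns = float("-inf"), None, 0, 0
    while turns + 3 <= max_turns and stale < patience:
        params = propose(best_params)
        env.set_parameters(**params)
        env.run_simulation()
        score = env.evaluate()["nse"]
        turns += 3  # assumption: each tool call consumes one turn
        if score > best_nse:
            best_nse, best_params, stale = score, params, 0
        else:
            stale += 1
    return best_nse, best_params, turns


_rng = random.Random(0)  # fixed seed so the sketch is repeatable


def random_perturbation(best_params):
    """Placeholder policy: jitter the best-known multipliers."""
    base = best_params or {name: 0.5 for name in PARAM_NAMES}
    return {name: max(0.01, value + _rng.uniform(-0.2, 0.2))
            for name, value in base.items()}


best_nse, best_params, turns = calibrate(ToyEnvironment(), random_perturbation)
print(f"best toy score {best_nse:.3f} after {turns} tool calls")
```

In the real system the proposal step is the model's next `<tool_call>`, conditioned on the full conversation history rather than only on the best parameters so far.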
+## Training data
+The agent is trained on **10 CONUS USGS gages** (basin areas
+539 – 2401 km²). Each gage is driven by **MRMS 1 km hourly precipitation** and
+paired with **hourly USGS streamflow observations** over a 60-day window chosen to
+contain a clear flood event (rising and receding limbs, buffered from the window edges).
+| Gage ID | Basin (km²) | Lat | Lon | Window (UTC) |
+|---|---:|---:|---:|---|
+| 11383500 |  539 | 40.0140 | -121.9483 | 2018-05-19 → 2018-07-17 |
+| 11043000 |  575 | 33.4798 | -117.1439 | 2019-03-15 → 2019-05-13 |
+| 11152000 |  632 | 36.2805 | -121.3227 | 2018-05-29 → 2018-07-27 |
+| 02294781 | 1064 | 27.8245 |  -81.8017 | 2018-04-29 → 2018-06-27 |
+| 02312000 | 1476 | 28.4800 |  -82.1776 | 2018-11-15 → 2019-01-13 |
+| 07195430 | 1489 | 36.1086 |  -94.5333 | 2018-01-04 → 2018-03-04 |
+| 11179000 | 1639 | 37.5871 | -121.9608 | 2018-06-03 → 2018-08-01 |
+| 14301000 | 1727 | 45.7040 | -123.7554 | 2018-09-11 → 2018-11-09 |
+| 14207500 | 1828 | 45.3507 | -122.6762 | 2018-04-09 → 2018-06-07 |
+| 11376000 | 2401 | 40.3871 | -122.2386 | 2018-09-21 → 2018-11-19 |
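The window lengths in the table can be checked mechanically. The sketch below assumes both endpoint dates count as full days, an interpretation the card does not state explicitly; under that reading every training window spans 60 days.

```python
from datetime import date

# Start and end dates copied from the training table above.
TRAIN_WINDOWS = {
    "11383500": ("2018-05-19", "2018-07-17"),
    "11043000": ("2019-03-15", "2019-05-13"),
    "11152000": ("2018-05-29", "2018-07-27"),
    "02294781": ("2018-04-29", "2018-06-27"),
    "02312000": ("2018-11-15", "2019-01-13"),
    "07195430": ("2018-01-04", "2018-03-04"),
    "11179000": ("2018-06-03", "2018-08-01"),
    "14301000": ("2018-09-11", "2018-11-09"),
    "14207500": ("2018-04-09", "2018-06-07"),
    "11376000": ("2018-09-21", "2018-11-19"),
}


def window_days(start, end):
    """Number of calendar days in the window, counting both endpoints."""
    return (date.fromisoformat(end) - date.fromisoformat(start)).days + 1


lengths = {gage: window_days(*window) for gage, window in TRAIN_WINDOWS.items()}
for gage, days in lengths.items():
    print(gage, days)
```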
+**Held-out evaluation gages** (never seen during training):
+| Gage ID | Basin (km²) | Lat | Lon | Window (UTC) |
+|---|---:|---:|---:|---|
+| 02338660 |   329 | 33.2357 |  -84.9876 | 2018-07-01 → 2018-08-31 |
+| 01403060 |  2033 | 40.5511 |  -74.5483 | 2018-11-11 → 2019-01-09 |
+| 06279500 | 40792 | 44.7585 | -108.1816 | 2018-06-13 → 2018-08-11 |
+| 07144100 |  3209 | 37.8831 |  -97.4245 | 2019-03-30 → 2019-05-28 |
+The full training dataset (MRMS clips, USGS observations, basin metadata,
+EF5 control template) is published as
+[**chrimerss/hydro_cali_agent_example**](https://huggingface.co/datasets/chrimerss/hydro_cali_agent_example).
+## Reward
+Two reward layers shape the policy:
+**Per-turn (returned by tools):**
+| Tool call | Reward |
+|---|---|
+| `set_parameters` (valid) | `+0.02` |
+| `run_simulation` (valid) | `+0.05` |
+| `evaluate` (valid) | `ΔNSE` (this turn − previous best) |
+| Any tool (invalid) | `−0.5` |
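As a rough illustration, the per-turn table maps onto a small function. The name `per_turn_reward` and its signature are hypothetical, and returning 0 for the first `evaluate` call (when no previous best exists) is an assumption, since the card does not specify that case.

```python
def per_turn_reward(tool_name, valid, nse=None, previous_best_nse=None):
    """Reward for one tool call, following the per-turn table."""
    if not valid:
        return -0.5  # any invalid tool call
    if tool_name == "set_parameters":
        return 0.02
    if tool_name == "run_simulation":
        return 0.05
    if tool_name == "evaluate":
        if previous_best_nse is None:
            return 0.0  # assumption: no baseline yet, so no delta is credited
        return nse - previous_best_nse  # delta NSE versus the previous best
    raise ValueError(f"unknown tool: {tool_name}")


print(per_turn_reward("set_parameters", valid=True))
print(per_turn_reward("evaluate", valid=True, nse=0.6, previous_best_nse=0.4))
print(per_turn_reward("run_simulation", valid=False))
```

Because the `evaluate` reward is a difference against the previous best, a run that scores worse than an earlier one earns a negative per-turn reward.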
+**Terminal (returned at end of trajectory):**
+| Component | Value |
+|---|---|
+| Best NSE (clipped) | `[−1, 1]` |
+| Target-met bonus | `+0.5` if best NSE > gage target |
+| Iteration bonus | `+0.02 × n_evaluates` |
+| Improvement bonus | `+0.10 × max(0, n_improvements − 1)` |
+| Empty-trajectory penalty | `−1.0` |
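One plausible way to combine the terminal components is sketched below. Several details are assumptions not confirmed by the card: that the components are summed, that an "empty" trajectory means one with no `evaluate` calls and receives only the penalty, and that an "improvement" is any evaluation that sets a new running best (including the first). The function name is hypothetical.

```python
def terminal_reward(nse_history, target_nse):
    """Terminal reward from the NSE values returned by each evaluate call."""
    if not nse_history:
        return -1.0  # empty-trajectory penalty (assumed to replace all other terms)

    best = max(nse_history)
    reward = max(-1.0, min(1.0, best))  # best NSE clipped to [-1, 1]

    if best > target_nse:
        reward += 0.5  # target-met bonus

    reward += 0.02 * len(nse_history)  # iteration bonus per evaluate call

    # Count evaluations that set a new running best.
    n_improvements, running_best = 0, float("-inf")
    for value in nse_history:
        if value > running_best:
            n_improvements += 1
            running_best = value
    reward += 0.10 * max(0, n_improvements - 1)  # improvement bonus
    return reward


print(terminal_reward([0.2, 0.1, 0.5, 0.7], target_nse=0.6))
print(terminal_reward([], target_nse=0.6))
```

For the first call, the terms are 0.7 (best NSE), 0.5 (target met), 0.08 (four evaluations), and 0.2 (three improvements, one discounted), which sum to about 1.48.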
+## GRPO settings
+| Setting | Value |
+|---|---|
+| Algorithm | GRPO (group-relative advantages) |
+| K (rollouts per prompt) | 6 |
+| Train batch size | 4 prompts (24 trajectories per step) |
+| Max assistant turns | 50 |
+| Learning rate | 1e-6 with 5% warmup |
+| Entropy coefficient | 0.01 |
+| KL loss coefficient | 0.05 (anchored to base policy) |
+| Sampling | `temperature=1.0`, `top_p=0.95` |
+| Steps in this checkpoint | **100** |
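"Group-relative advantages" means each trajectory's reward is normalized against the other rollouts for the same prompt. A minimal sketch of that normalization follows, using the population standard deviation and a small epsilon; both choices are assumptions, the reward values are made up, and verl's implementation may differ in these details.

```python
from statistics import fmean, pstdev

K = 6                 # rollouts per prompt, from the settings table
PROMPTS_PER_STEP = 4  # train batch size, from the settings table


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one prompt's group of rollouts."""
    mean = fmean(rewards)
    std = pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Made-up total rewards for the K rollouts of a single prompt.
group_rewards = [1.48, 0.35, -0.98, 0.90, 0.10, -1.00]
advantages = group_relative_advantages(group_rewards)

print([round(a, 3) for a in advantages])
print("trajectories per step:", K * PROMPTS_PER_STEP)
```

Rollouts that beat their group's mean get positive advantages and the rest get negative ones, so no separate value network is needed to form a baseline.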
+## Quick start
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+repo = "anonymousOwl/HydroAgent"
+tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="bfloat16", device_map="auto")
+```
+The model emits Hermes-style tool calls, e.g.:
+```
+<tool_call>
+{"name": "set_parameters", "arguments": {"wm": 1.0, "b": 1.0, "im": 0.5, ...}}
+</tool_call>
+```
+Pass the tool schemas when building the prompt with
+`tokenizer.apply_chat_template(..., tools=HYDRO_TOOLS)`, then parse the
+`<tool_call>` blocks from each completion and dispatch every call to your EF5 sandbox. See
+[`modal_app/eval.py`](https://github.com/chrimerss/HydroLLM/blob/main/modal_app/eval.py)
+for a reference SGLang loop with retry-on-parse-failure logic.
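For orientation, extracting Hermes-style calls from a completion can be done with a regular expression and `json.loads`. This is a minimal sketch, not the logic in `modal_app/eval.py`: the helper names, the handler dictionary, and the choice to silently skip malformed blocks (instead of retrying) are all assumptions.

```python
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return (name, arguments) pairs for every well-formed <tool_call> block."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            call = json.loads(payload)
        except json.JSONDecodeError:
            continue  # malformed JSON: skipped here; a real loop might retry
        if isinstance(call, dict) and "name" in call:
            calls.append((call["name"], call.get("arguments", {})))
    return calls


def dispatch(calls, handlers):
    """Invoke the handler registered for each parsed call."""
    return [handlers[name](**arguments) for name, arguments in calls]


completion = """I will start from neutral multipliers.
<tool_call>
{"name": "set_parameters", "arguments": {"wm": 1.0, "b": 1.0}}
</tool_call>
<tool_call>
{"name": "run_simulation", "arguments": {}}
</tool_call>
<tool_call>
{not valid json}
</tool_call>"""

# Hypothetical handlers standing in for the EF5 sandbox.
handlers = {
    "set_parameters": lambda **kw: f"set {sorted(kw)}",
    "run_simulation": lambda: "simulation finished",
}

calls = parse_tool_calls(completion)
results = dispatch(calls, handlers)
print(calls)
print(results)
```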
+For full reproduction (image, EF5 binary, multi-turn rollout, reward
+computation), use the
+[HydroLLM repository](https://github.com/chrimerss/HydroLLM).
+## Limitations
+- Trained on **10 small/medium CONUS basins** (≤ 2401 km²) over short flood
+  windows. Generalization to large basins (> 3000 km²), arid catchments, or
+  out-of-CONUS regions is unverified.
+- Calibrates **CREST parameter multipliers only** — does not modify routing,
+  initial conditions, or sub-basin structure.
+- The agent depends on a working EF5 toolchain; the weights alone do not
+  perform calibration without the simulation environment in the loop.
+- This is a research checkpoint, not a production tool. NSE on held-out
+  gages varies substantially with basin and event.
+## License
+MIT, the same license as the upstream [HydroLLM repository](https://github.com/chrimerss/HydroLLM).
+The base [Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) is released under Apache 2.0.
+## Citation
+```bibtex
+@software{hydrollm2026,
+  title  = {HydroLLM: Reinforcement Learning Fine-Tuning of LLMs with Hydrologic Simulation Feedback},
+  year   = {2026},
+  url    = {https://github.com/chrimerss/HydroLLM}
+}
+```
+## Acknowledgement
+Compute for this research was sponsored by [Modal](https://modal.com).