KoHRM-Text-1.4B FullSFT LFM25 Terminal ToolBench Epoch3
This repository is the third full-SFT epoch of KoHRM-Text-1.4B on the LFM25/ToolBench terminal dataset. It is a fine-tuned version of LLM-OS-Models/KoHRM-Text-1.4B and continues from the Epoch2 checkpoint.
Training
- Base model: LLM-OS-Models/KoHRM-Text-1.4B
- Parent checkpoint: LLM-OS-Models/KoHRM-Text-1.4B-FullSFT-LFM25-Terminal-ToolBench-Epoch2
- Dataset: kohrm_sft_lfm25_terminal_toolbench_full_v1
- Training type: full-parameter SFT, not LoRA
- Total SFT epochs on this dataset: 3
- Epoch3 hardware: 8 x NVIDIA H200
- Global batch size: 180224 tokens
- Learning rate: 2e-5
- Epoch3 training wall time: about 3h 16m 30s
- Final train loss: 0.5271
TB2-lite Result
Evaluation file:
tb2_lite/results/20260606T_kohrm_lfm25_epoch3_eval_sdpa8_b16/KoHRM-Text-1.4B-fullsft-lfm25-terminal-toolbench-epoch3-sdpa8-b16-nocompile-merged.json
| Checkpoint | Steps | Score | Cmd F1 | Precision | Recall | First Cmd | Valid JSON | Avg Pred Cmds | Sec/Step |
|---|---|---|---|---|---|---|---|---|---|
| Epoch1 | 303/303 | 38.56 | 0.3856 | 0.4262 | 0.4341 | 37.0% | 55.1% | 27.33 | 8.314 |
| Epoch2 | 303/303 | 45.90 | 0.4590 | 0.5031 | 0.5098 | 44.9% | 68.3% | 25.16 | 10.842 |
| Epoch3 | 303/303 | 43.57 | 0.4357 | 0.4703 | 0.5003 | 45.5% | 61.7% | 25.82 | 11.156 |
Score = 100 * avg_command_f1.
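As an illustration of how a score of this form can be computed, the following is a minimal sketch. It assumes that command F1 is calculated per step from exact-string matches between predicted and reference commands and then averaged over steps; the actual TB2-lite matching rule is not stated here, and the names `command_f1` and `score` are hypothetical.

```python
from collections import Counter


def command_f1(predicted, reference):
    """Return (precision, recall, f1) for one step.

    Assumption: commands match only on exact string equality, and
    duplicates are counted as a multiset. The real evaluator may
    normalize or match commands differently.
    """
    overlap = sum((Counter(predicted) & Counter(reference)).values())
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def score(steps):
    """Score = 100 * average per-step command F1."""
    f1_values = [command_f1(pred, ref)[2] for pred, ref in steps]
    return 100 * sum(f1_values) / len(f1_values)


# Two toy steps (invented data, not taken from the benchmark).
example_steps = [
    (["ls", "cat a.txt"], ["ls"]),  # precision 0.5, recall 1.0
    (["pwd"], ["pwd"]),             # exact match
]
example_score = score(example_steps)
```

Averaging F1 per step, rather than deriving it from the averaged Precision and Recall columns, would be consistent with the table, where Cmd F1 is lower than the harmonic mean of the two reported averages.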
Interpretation
Epoch3 is not the current best KoHRM terminal checkpoint. It scored 43.57, which is 2.33 points below Epoch2 (45.90). The representative KoHRM terminal checkpoint remains Epoch2.
What improved:
- First Cmd increased slightly from 44.9% to 45.5%.
- model_training reached F1 0.4910.
What regressed:
- Cmd F1 fell from 0.4590 to 0.4357.
- Precision fell from 0.5031 to 0.4703.
- Valid JSON fell from 68.3% to 61.7%.
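The Epoch2-to-Epoch3 differences can be recomputed directly from the table rows. The sketch below copies the values from the TB2-lite table; the dictionary keys are illustrative names, not fields of the evaluation file.

```python
# Values copied from the Epoch2 and Epoch3 rows of the TB2-lite table.
epoch2 = {
    "score": 45.90,
    "cmd_f1": 0.4590,
    "precision": 0.5031,
    "recall": 0.5098,
    "first_cmd_pct": 44.9,
    "valid_json_pct": 68.3,
}
epoch3 = {
    "score": 43.57,
    "cmd_f1": 0.4357,
    "precision": 0.4703,
    "recall": 0.5003,
    "first_cmd_pct": 45.5,
    "valid_json_pct": 61.7,
}

# Positive delta = Epoch3 is higher than Epoch2.
deltas = {name: round(epoch3[name] - epoch2[name], 4) for name in epoch2}

for name, delta in deltas.items():
    print(f"{name:>15}: {delta:+.4f}")
```

Only First Cmd moves in the positive direction; every other metric in this comparison is lower for Epoch3.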
Strong source groups:
- data_querying: 0.6150 (15 steps, First Cmd 53.3%)
- data_science: 0.5940 (22 steps, First Cmd 63.6%)
- model_training: 0.4910 (17 steps, First Cmd 17.6%)
- debugging: 0.4727 (38 steps, First Cmd 52.6%)
- software_engineering: 0.4591 (36 steps, First Cmd 55.6%)
- scientific_computing: 0.4491 (20 steps, First Cmd 55.0%)
Weak source groups:
- math: 0.3150 (16 steps, First Cmd 25.0%)
- dependency_management: 0.3580 (15 steps, First Cmd 33.3%)
- security: 0.3580 (23 steps, First Cmd 34.8%)
- data_processing: 0.3621 (23 steps, First Cmd 43.5%)
- swe: 0.3815 (23 steps, First Cmd 39.1%)
- file_operations: 0.3875 (20 steps, First Cmd 45.0%)
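To see how the listed groups relate to the overall result, the per-group F1 values can be combined into a step-weighted mean. This is a sketch that assumes each group F1 is the mean per-step F1 within that group. The twelve groups listed here cover 268 of the 303 steps, so the result is not expected to match the overall Cmd F1 of 0.4357 exactly.

```python
# (F1, steps) per source group, copied from the strong and weak lists.
groups = {
    "data_querying": (0.6150, 15),
    "data_science": (0.5940, 22),
    "model_training": (0.4910, 17),
    "debugging": (0.4727, 38),
    "software_engineering": (0.4591, 36),
    "scientific_computing": (0.4491, 20),
    "math": (0.3150, 16),
    "dependency_management": (0.3580, 15),
    "security": (0.3580, 23),
    "data_processing": (0.3621, 23),
    "swe": (0.3815, 23),
    "file_operations": (0.3875, 20),
}

listed_steps = sum(steps for _, steps in groups.values())

# Weight each group's F1 by its step count (assumes per-step averaging).
weighted_f1 = sum(f1 * steps for f1, steps in groups.values()) / listed_steps

print(f"listed steps: {listed_steps} of 303")
print(f"step-weighted F1 over listed groups: {weighted_f1:.4f}")
```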
Usage Note
KoHRM-Text uses the HRM-Text PrefixLM runtime, not a standard vLLM chat-model path. For this evaluation the local HF export path was evaluated with KOHRM_FORCE_SDPA_KVCACHE=1 and KOHRM_DISABLE_INFERENCE_COMPILE=1 because the local flash-attention build does not support append-KV cache for this run.
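A minimal sketch of setting the two flags from Python before the runtime is loaded. It assumes the HRM-Text runtime reads these environment variables at import or load time, which is not confirmed here; the model-loading call itself is omitted because the entry point is not specified in this card.

```python
import os

# Flags used for this evaluation. Assumption: they must be set before
# the KoHRM / HRM-Text runtime is imported so that it can pick them up.
os.environ["KOHRM_FORCE_SDPA_KVCACHE"] = "1"
os.environ["KOHRM_DISABLE_INFERENCE_COMPILE"] = "1"

# Load the local HF export with the HRM-Text PrefixLM runtime after this
# point (loading code intentionally not shown).
```

The same effect can be obtained by exporting both variables in the shell that launches the evaluation process.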