Update README.md for checkpoint 12k-v2
README.md
CHANGED
|
@@ -33,13 +33,16 @@ library_name: nv-medtech
|
|
| 33 |
# Cosmos-H-Surgical-Simulator
|
| 34 |
|
| 35 |
## Description
|
| 36 |
-
Cosmos-H-Surgical-Simulator is a surgical world foundation model fine-tuned on the Open-H embodiment
|
| 37 |
-
This model assists in evaluating surgical robotics policies in simulation, primarily for CMR Surgical Versius clinical procedures (cholecystectomy, prostatectomy, inguinal hernia, and hysterectomy), as well as dVRK, MITIC, and other surgical platforms across tasks such as suturing, tissue manipulation, and peg transfer, before transitioning to a physical system.
|
| 38 |
|
| 39 |
-
The
|
| 40 |
|
| 41 |
This model is for commercial/non-commercial use.
|
| 42 |
|
| 43 |
## License/Terms of Use
|
| 44 |
Use of this model is governed by the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).
|
| 45 |
|
|
@@ -146,26 +149,24 @@ Dataset: Open-H-Embodiment community generated dataset.
|
|
| 146 |
Evaluated on 4 CMR Versius clinical surgery procedures (prostatectomy, inguinal hernia, hysterectomy, cholecystectomy) at 360p resolution, 2 episodes per procedure, 2 seeds each, 72-frame autoregressive generation (6 chunks × 12 frames).
|
| 147 |
|
| 148 |
### Aggregate Metrics
|
| 149 |
-
| Checkpoint
|
| 150 |
-
|:---
|
| 151 |
-
|
|
| 152 |
-
|
|
| 153 |
-
|
|
| 154 |
-
| **16k** | **0.2227** | **0.4167** | **83.68** |
|
| 155 |
-
| 20k | **0.2219** | 0.4058 | 124.96 |
|
| 156 |
|
| 157 |
**Metrics:**
|
| 158 |
- **FDS (L1)**: Frame Decay Score - mean L1 distance between generated and ground-truth frames normalized to [-1, 1], averaged across all generated frames (lower is better)
|
| 159 |
- **GATC**: Ground-truth Anchored Tool Consistency - median zero-mean normalized cross-correlation (ZNCC) of grayscale pixels within SAM3-segmented tool regions between generated and ground-truth frames, weighted by a gradient-based tool presence penalty (higher is better)
|
| 160 |
- **TCD**: Tool Centroid Distance - median per-frame average Euclidean distance (in pixels) between Hungarian-matched tool instance centroids in generated vs ground-truth frames, with a half-diagonal penalty for unmatched tools (lower is better)
|
| 161 |
|
| 162 |
-
### Per-Procedure Metrics (
|
| 163 |
| Procedure | FDS (L1) ↓ | GATC ↑ | TCD (px) ↓ |
|
| 164 |
|:---|:---:|:---:|:---:|
|
| 165 |
-
| Prostatectomy | 0.
|
| 166 |
-
| Inguinal Hernia | 0.
|
| 167 |
-
| Hysterectomy | 0.
|
| 168 |
-
| Cholecystectomy | 0.
|
| 169 |
|
| 170 |
## Inference
|
| 171 |
**Acceleration Engine:** [PyTorch](https://pytorch.org/), [Transformer Engine](https://github.com/NVIDIA/TransformerEngine)
|
|
|
|
| 33 |
# Cosmos-H-Surgical-Simulator
|
| 34 |
|
| 35 |
## Description
|
| 36 |
+
Cosmos-H-Surgical-Simulator is a kinematic action-conditioned surgical world foundation model, built on the public NVIDIA [Cosmos-Predict2.5-2B](https://huggingface.co/nvidia/Cosmos-Predict2.5-2B) for physical AI and fine-tuned on the Open-H multi-embodiment surgical benchmark. Unlike the text-conditioned base model, it is driven directly by robot kinematics: given a surgical context frame and a sequence of 44-dimensional action vectors encoding end-effector poses and gripper commands (unified across 9 embodiments), it generates future video of the resulting surgical scene.
|
|
|
|
| 37 |
|
| 38 |
+
The model is intended for evaluating surgical robotics policies in simulation and for synthetic data generation prior to deployment on a physical system. It covers CMR Surgical Versius clinical procedures (cholecystectomy, prostatectomy, inguinal hernia, hysterectomy) as well as dVRK, MITIC, and other surgical platforms across tasks such as suturing, tissue manipulation, and peg transfer.
|
| 39 |
|
| 40 |
This model is for commercial/non-commercial use.
|
| 41 |
|
| 42 |
+
## Updates
|
| 43 |
+
|
| 44 |
+
- **April 2026**: Released updated checkpoint after fixing an action-embedder MLP initialization bug. Aggregate quality improves on all three metrics: FDS (L1) 0.223 → **0.184** (−17%), GATC 0.417 → **0.472** (+13%), TCD 83.68 → **67.03** (−20%).
|
| 45 |
+
|
| 46 |
## License/Terms of Use
|
| 47 |
Use of this model is governed by the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).
|
| 48 |
|
|
|
|
| 149 |
Evaluated on 4 CMR Versius clinical surgery procedures (prostatectomy, inguinal hernia, hysterectomy, cholecystectomy) at 360p resolution, 2 episodes per procedure, 2 seeds each, 72-frame autoregressive generation (6 chunks × 12 frames).
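
The evaluation grid described above can be enumerated with a short sketch. It only does the bookkeeping (how many rollouts, which frame indices each chunk covers) and does not call the model. The constant names and the `chunk_ranges` helper are illustrative assumptions, not part of the released evaluation code.

```python
from itertools import product

# Values taken from the evaluation description; names are hypothetical.
PROCEDURES = ["prostatectomy", "inguinal hernia", "hysterectomy", "cholecystectomy"]
EPISODES_PER_PROCEDURE = 2
SEEDS_PER_EPISODE = 2
CHUNKS = 6
FRAMES_PER_CHUNK = 12


def chunk_ranges(chunks, frames_per_chunk):
    """Return half-open (start, end) frame index ranges, one per chunk."""
    return [(i * frames_per_chunk, (i + 1) * frames_per_chunk) for i in range(chunks)]


# One rollout per (procedure, episode, seed) combination.
runs = list(product(PROCEDURES, range(EPISODES_PER_PROCEDURE), range(SEEDS_PER_EPISODE)))
ranges = chunk_ranges(CHUNKS, FRAMES_PER_CHUNK)

print(len(runs), "rollouts,", ranges[-1][1], "frames each")
```

Under these settings the grid contains 4 × 2 × 2 = 16 rollouts of 72 frames each.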
|
| 150 |
|
| 151 |
### Aggregate Metrics
|
| 152 |
+
| Checkpoint | FDS (L1) ↓ | GATC ↑ | TCD (px) ↓ |
|
| 153 |
+
|:---|:---:|:---:|:---:|
|
| 154 |
+
| Previous (16k, pre-fix) | 0.223 | 0.417 | 83.68 |
|
| 155 |
+
| **Current (12k-v2, post-fix)** | **0.184** | **0.472** | **67.03** |
|
| 156 |
+
| Relative change | **−17%** | **+13%** | **−20%** |
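
The relative-change row can be reproduced from the two checkpoint rows. The sketch below assumes the percentages are computed as `(current - previous) / previous` on the rounded table values; the helper name `relative_change` is illustrative.

```python
def relative_change(previous: float, current: float) -> float:
    """Signed percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100.0


# Aggregate metrics copied from the table (16k pre-fix vs 12k-v2 post-fix).
previous = {"FDS": 0.223, "GATC": 0.417, "TCD": 83.68}
current = {"FDS": 0.184, "GATC": 0.472, "TCD": 67.03}

changes = {name: relative_change(previous[name], current[name]) for name in previous}
for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

Rounded to whole percentages, this gives −17%, +13%, and −20%, matching the table. For FDS and TCD a negative change is an improvement, since lower is better.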
|
|
|
|
|
|
|
| 157 |
|
| 158 |
**Metrics:**
|
| 159 |
- **FDS (L1)**: Frame Decay Score - mean L1 distance between generated and ground-truth frames normalized to [-1, 1], averaged across all generated frames (lower is better)
|
| 160 |
- **GATC**: Ground-truth Anchored Tool Consistency - median zero-mean normalized cross-correlation (ZNCC) of grayscale pixels within SAM3-segmented tool regions between generated and ground-truth frames, weighted by a gradient-based tool presence penalty (higher is better)
|
| 161 |
- **TCD**: Tool Centroid Distance - median per-frame average Euclidean distance (in pixels) between Hungarian-matched tool instance centroids in generated vs ground-truth frames, with a half-diagonal penalty for unmatched tools (lower is better)
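
As a rough illustration of the FDS and TCD definitions above, here is a minimal standard-library sketch. It is not the evaluation code behind the reported numbers. It makes several assumptions: frames are flat lists of 0..255 pixel values, tool centroids are already extracted as `(x, y)` tuples, unmatched tools each contribute one half-diagonal to the per-frame average, and frames with no tools on either side score 0. A brute-force search over assignments stands in for Hungarian matching; it finds the same minimum-cost assignment but is only practical for a handful of tools per frame. GATC is omitted because it depends on SAM3 segmentation masks.

```python
import math
from itertools import permutations
from statistics import median


def fds_l1(generated, ground_truth):
    """Mean L1 distance over frames; each frame is a flat list of 0..255 values."""
    per_frame = []
    for gen, gt in zip(generated, ground_truth):
        # Map 0..255 to [-1, 1] before taking absolute differences.
        diffs = [abs((g / 127.5 - 1.0) - (t / 127.5 - 1.0)) for g, t in zip(gen, gt)]
        per_frame.append(sum(diffs) / len(diffs))
    return sum(per_frame) / len(per_frame)


def frame_centroid_distance(gen_pts, gt_pts, half_diagonal):
    """Average matched-centroid distance for one frame, penalising unmatched tools."""
    small, large = sorted((gen_pts, gt_pts), key=len)
    if not large:
        return 0.0  # assumption: no tools on either side means no error
    unmatched = len(large) - len(small)
    # Brute-force optimal assignment (stand-in for Hungarian matching).
    best = min(
        sum(math.dist(p, q) for p, q in zip(small, chosen))
        for chosen in permutations(large, len(small))
    )
    return (best + unmatched * half_diagonal) / len(large)


def tcd(gen_frames, gt_frames, width, height):
    """Median over frames of the per-frame average centroid distance, in pixels."""
    half_diagonal = math.hypot(width, height) / 2.0
    return median(
        frame_centroid_distance(g, t, half_diagonal)
        for g, t in zip(gen_frames, gt_frames)
    )
```

For example, `fds_l1([[0, 0]], [[255, 255]])` gives the maximum possible value of 2.0, and a single tool displaced from `(0, 0)` to `(3, 4)` yields a per-frame distance of 5 pixels.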
|
| 162 |
|
| 163 |
+
### Per-Procedure Metrics (current checkpoint)
|
| 164 |
| Procedure | FDS (L1) ↓ | GATC ↑ | TCD (px) ↓ |
|
| 165 |
|:---|:---:|:---:|:---:|
|
| 166 |
+
| Prostatectomy | 0.220 | 0.451 | 122.0 |
|
| 167 |
+
| Inguinal Hernia | 0.199 | 0.429 | 143.2 |
|
| 168 |
+
| Hysterectomy | 0.121 | 0.737 | 12.7 |
|
| 169 |
+
| Cholecystectomy | 0.198 | 0.344 | 28.8 |
|
| 170 |
|
| 171 |
## Inference
|
| 172 |
**Acceleration Engine:** [PyTorch](https://pytorch.org/), [Transformer Engine](https://github.com/NVIDIA/TransformerEngine)
|