Daankular committed
Commit 8f1bcd9 · 1 Parent(s): 5d73995

Port MeshForge features to ZeroGPU Space: FireRed, PSHuman, Motion Search


New tabs and features from the MeshForge server version, adapted for ZeroGPU:
- Edit tab: FireRed GGUF-quantized image editing (QwenImageEditPlusPipeline)
- Animate tab: HumanML3D motion search + GLB animation via Retarget/
- PSHuman Face tab: HD face transplant via PSHuman service + face_transplant.py
- Settings tab: VRAM management (preload/unload/refresh)
- Generate tab: remove-BG preview controls + FireRed→Generate flow
- Enhancement tab: unload button for VRAM management

New pipeline modules: face_transplant, pshuman_client, render_glb, tpose,
face_inswap_bake, face_project, face_swap_render, head_replace
New Retarget/ directory: motion search, animate, skeleton, SMPL retargeting
New utils/pytorch3d_minimal.py
Updated requirements.txt: bitsandbytes, gradio_client, filterpy, pytorch-lightning,
lightning-utilities, webdataset, hydra-core, matplotlib

Retarget/README.md ADDED
@@ -0,0 +1,127 @@
+ # rig_retarget
+
+ Pure-Python rig retargeting library. No Blender required.
+
+ Based on **[KeeMap Blender Rig Retargeting Addon](https://github.com/nkeeline/Keemap-Blender-Rig-ReTargeting-Addon)** by [Nick Keeline](https://github.com/nkeeline) (GPL v2).
+ All core retargeting math is a direct port of his work. Mapping JSON files are fully compatible with KeeMap.
+
+ ---
+
+ ## File layout
+
+ ```
+ rig_retarget/
+ ├── math3d.py      # Quaternion / matrix math (numpy + scipy), replaces mathutils
+ ├── skeleton.py    # Armature + PoseBone with FK, replaces bpy armature objects
+ ├── retarget.py    # Core retargeting logic — faithful port of KeeMapBoneOperators.py
+ ├── cli.py         # CLI entry point
+ └── io/
+     ├── bvh.py     # BVH mocap reader (source animation)
+     ├── gltf_io.py # glTF/GLB reader + animation writer (UniRig destination)
+     └── mapping.py # JSON bone mapping — same format as KeeMap
+ ```
+
+ ---
+
+ ## Install
+
+ ```bash
+ pip install numpy scipy pygltflib
+ ```
+
+ ---
+
+ ## CLI
+
+ ```bash
+ # Retarget BVH onto UniRig GLB
+ python -m rig_retarget.cli \
+     --source motion.bvh \
+     --dest unirig_character.glb \
+     --mapping radical2unirig.json \
+     --output animated_character.glb \
+     --fps 30 --start 0 --frames 200 --step 1
+
+ # Auto-calculate bone correction factors and save back to the mapping file
+ python -m rig_retarget.cli --calc-corrections \
+     --source motion.bvh \
+     --dest unirig_character.glb \
+     --mapping mymap.json
+ ```
+
+ ---
+
+ ## Python API
+
+ ```python
+ from rig_retarget.io.bvh import load_bvh
+ from rig_retarget.io.gltf_io import load_gltf, write_gltf_animation
+ from rig_retarget.io.mapping import load_mapping
+ from rig_retarget.retarget import transfer_animation, calc_all_corrections
+
+ # Load
+ settings, bone_items = load_mapping("my_map.json")
+ src_anim = load_bvh("motion.bvh")
+ dst_arm = load_gltf("unirig_char.glb")
+
+ # Optional: auto-calc corrections at first frame
+ src_anim.apply_frame(0)
+ calc_all_corrections(bone_items, src_anim.armature, dst_arm, settings)
+
+ # Transfer
+ settings.number_of_frames_to_apply = src_anim.num_frames
+ keyframes = transfer_animation(src_anim, dst_arm, bone_items, settings)
+
+ # Write output GLB
+ write_gltf_animation("unirig_char.glb", dst_arm, keyframes, "output.glb")
+ ```
+
+ ---
+
+ ## Mapping JSON format
+
+ 100% compatible with KeeMap's `.json` files. Use KeeMap in Blender to create
+ and tune mappings, then use this library offline for batch processing.
+
+ Key fields per bone:
+
+ | Field | Description |
+ |---|---|
+ | `SourceBoneName` | Bone name in the source rig (BVH joint name) |
+ | `DestinationBoneName` | Bone name in the UniRig skeleton (glTF node name) |
+ | `set_bone_rotation` | Drive rotation from source |
+ | `set_bone_position` | Drive position from source |
+ | `bone_rotation_application_axis` | Mask axes: `X` `Y` `Z` `XY` `XZ` `YZ` `XYZ` |
+ | `bone_transpose_axis` | Swap axes: `NONE` `ZYX` `ZXY` `XZY` `YZX` `YXZ` |
+ | `CorrectionFactorX/Y/Z` | Euler correction (radians) |
+ | `postion_type` | `SINGLE_BONE_OFFSET` or `POLE` |
+ | `position_pole_distance` | IK pole distance |
+
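For orientation, a single bone entry using the fields above might look roughly like this. This is an illustrative sketch only — the values and the top-level wrapper structure are assumptions; export a mapping from KeeMap for the authoritative shape (note `postion_type` is spelled as in KeeMap):

```json
{
  "bones": [
    {
      "SourceBoneName": "LeftArm",
      "DestinationBoneName": "upperarm_l",
      "set_bone_rotation": true,
      "set_bone_position": false,
      "bone_rotation_application_axis": "XYZ",
      "bone_transpose_axis": "NONE",
      "CorrectionFactorX": 0.0,
      "CorrectionFactorY": 0.0,
      "CorrectionFactorZ": 1.5708,
      "postion_type": "SINGLE_BONE_OFFSET",
      "position_pole_distance": 0.0
    }
  ]
}
```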
+ ---
+
+ ## Blender → pure-Python mapping
+
+ | Blender | rig_retarget |
+ |---|---|
+ | `bpy.data.objects[name]` | `Armature` + `load_gltf()` / `load_bvh()` |
+ | `arm.pose.bones[name]` | `arm.get_bone(name)` → `PoseBone` |
+ | `bone.matrix` (pose space) | `bone.matrix_armature` |
+ | `arm.matrix_world` | `arm.world_matrix` |
+ | `arm.convert_space(...)` | `arm.world_matrix @ bone.matrix_armature` |
+ | `bone.rotation_quaternion` | `bone.pose_rotation_quat` |
+ | `bone.location` | `bone.pose_location` |
+ | `bone.keyframe_insert(...)` | returned in `keyframes` list from `transfer_frame()` |
+ | `bpy.context.scene.frame_set(i)` | `src_anim.apply_frame(i)` |
+ | `mathutils.Quaternion` | `np.ndarray [w,x,y,z]` + `math3d.*` |
+ | `mathutils.Matrix` | `np.ndarray (4,4)` |
+
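The scalar-first `[w, x, y, z]` layout in the table can be exercised with plain numpy. This sketch is not part of the library (`math3d`'s actual helper names may differ); `q_wxyz_mul` is a hypothetical stand-in for `mathutils.Quaternion` multiplication:

```python
import numpy as np

def q_wxyz_mul(a, b):
    # Hamilton product for scalar-first [w, x, y, z] quaternions,
    # the layout rig_retarget uses in place of mathutils.Quaternion.
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw*bw - ax*bx - ay*by - az*bz,
        aw*bx + ax*bw + ay*bz - az*by,
        aw*by - ax*bz + ay*bw + az*bx,
        aw*bz + ax*by - ay*bx + az*bw,
    ])

# Composing two 90° rotations about Z yields a 180° rotation about Z:
qz90 = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
qz180 = q_wxyz_mul(qz90, qz90)   # → [0, 0, 0, 1] up to rounding
```

Beware that SciPy's `Rotation.as_quat()` returns the scalar-last `[x, y, z, w]` order, so conversions between the two layouts are needed at the boundaries (as `animate.py` does in `_mat_to_quat`).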
+ ---
+
+ ## Limitations / TODO
+
+ - **glTF source animation** reading is not yet implemented (BVH only for now).
+   Add an `io/gltf_anim_reader.py` that reads `gltf.animations[0]` sampler data.
+ - FBX source support: use `pyassimp`, or run `bpy` offline with `--background`.
+ - IK solving: pole bone positioning is FK-only; a full IK solver (FABRIK/CCD)
+   would improve accuracy for limb targets.
+ - Quaternion-mode twist bones: parity with Blender is not guaranteed for complex twist rigs.
Retarget/__init__.py ADDED
@@ -0,0 +1,49 @@
+ """
+ rig_retarget
+ ============
+ Pure-Python rig retargeting library.
+ No Blender dependency. Targets TripoSG meshes auto-rigged by UniRig (SIGGRAPH 2025).
+
+ Quick start
+ -----------
+     from rig_retarget.io.bvh import load_bvh
+     from rig_retarget.io.gltf_io import load_gltf, write_gltf_animation
+     from rig_retarget.io.mapping import load_mapping
+     from rig_retarget.retarget import transfer_animation
+
+     settings, bone_items = load_mapping("my_map.json")
+     src_anim = load_bvh("motion.bvh")
+     dst_arm = load_gltf("unirig_char.glb")
+
+     keyframes = transfer_animation(src_anim, dst_arm, bone_items, settings)
+     write_gltf_animation("unirig_char.glb", dst_arm, keyframes, "output.glb")
+
+ CLI
+ ---
+     python -m rig_retarget.cli --source motion.bvh --dest char.glb \\
+         --mapping map.json --output char_animated.glb
+ """
+
+ from .skeleton import Armature, PoseBone
+ from .retarget import (
+     get_bone_position_ws,
+     get_bone_ws_quat,
+     set_bone_position_ws,
+     set_bone_rotation,
+     set_bone_position,
+     set_bone_position_pole,
+     set_bone_scale,
+     calc_rotation_offset,
+     calc_location_offset,
+     calc_all_corrections,
+     transfer_frame,
+     transfer_animation,
+ )
+
+ __all__ = [
+     "Armature", "PoseBone",
+     "get_bone_position_ws", "get_bone_ws_quat", "set_bone_position_ws",
+     "set_bone_rotation", "set_bone_position", "set_bone_position_pole",
+     "set_bone_scale", "calc_rotation_offset", "calc_location_offset",
+     "calc_all_corrections", "transfer_frame", "transfer_animation",
+ ]
Retarget/animate.py ADDED
@@ -0,0 +1,611 @@
+ """
+ animate.py
+ ──────────────────────────────────────────────────────────────────────────────
+ Bake SMPL motion (from HumanML3D [T, 263] features) onto a UniRig-rigged GLB.
+
+ Retargeting method: world-direction matching
+ ────────────────────────────────────────────
+ Commercial retargeters (Mixamo, Rokoko, MotionBuilder) avoid rest-pose
+ convention mismatches by matching WORLD BONE DIRECTIONS, not local rotations.
+
+ Algorithm (per frame, per bone):
+   1. Run t2m FK with HumanML3D 6D rotations → world bone direction d_t2m
+   2. Flip X axis: t2m +X = character's LEFT; SMPL/UniRig +X = character's RIGHT
+      So d_desired = (-d_t2m_x, d_t2m_y, d_t2m_z) in SMPL/UniRig world frame
+   3. d_rest = normalize(ur_pos[bone] - ur_pos[parent]) from GLB inverse bind matrices
+   4. R_world = R_between(d_rest, d_desired) -- minimal rotation in world space
+   5. local_rot = inv(R_world[parent]) @ R_world[bone]
+   6. pose_rot_delta = inv(rest_r) @ local_rot -- composing with glTF rest rotation
+
+ This avoids all rest-pose convention issues:
+   - t2m canonical arms point DOWN: handled automatically
+   - t2m canonical hips/shoulders have inverted X: handled by the X-flip
+   - UniRig non-identity rest rotations: handled by inv(rest_r) composition
+
+ Key bugs fixed vs previous version:
+   - IBM column-major: glTF IBMs are column-major; was using inv(ibm)[:3,3] (zeros).
+     Fixed to inv(ibm.T)[:3,3] which gives correct world-space bone positions.
+   - Normalisation: was mixing ur/smpl Y ranges, causing wrong height alignment.
+     Fixed with independent per-skeleton Y normalisation.
+   - Rotation convention: was applying t2m rotations directly without X-flip.
+     Fixed by world-direction matching with coordinate-frame conversion.
+ """
+ from __future__ import annotations
+ import os
+ import re
+ import numpy as np
+ from typing import Union
+
+ from .smpl import SMPLMotion, hml3d_to_smpl_motion
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # T2M (HumanML3D) skeleton constants
+ # Source: HumanML3D/common/paramUtil.py
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ T2M_RAW_OFFSETS = np.array([
+     [ 0, 0, 0],   # 0  Hips (root)
+     [ 1, 0, 0],   # 1  LeftUpLeg    +X = character LEFT in t2m convention
+     [-1, 0, 0],   # 2  RightUpLeg
+     [ 0, 1, 0],   # 3  Spine
+     [ 0,-1, 0],   # 4  LeftLeg
+     [ 0,-1, 0],   # 5  RightLeg
+     [ 0, 1, 0],   # 6  Spine1
+     [ 0,-1, 0],   # 7  LeftFoot
+     [ 0,-1, 0],   # 8  RightFoot
+     [ 0, 1, 0],   # 9  Spine2
+     [ 0, 0, 1],   # 10 LeftToeBase
+     [ 0, 0, 1],   # 11 RightToeBase
+     [ 0, 1, 0],   # 12 Neck
+     [ 1, 0, 0],   # 13 LeftShoulder  +X = character LEFT
+     [-1, 0, 0],   # 14 RightShoulder
+     [ 0, 0, 1],   # 15 Head
+     [ 0,-1, 0],   # 16 LeftArm       arms hang DOWN in t2m canonical
+     [ 0,-1, 0],   # 17 RightArm
+     [ 0,-1, 0],   # 18 LeftForeArm
+     [ 0,-1, 0],   # 19 RightForeArm
+     [ 0,-1, 0],   # 20 LeftHand
+     [ 0,-1, 0],   # 21 RightHand
+ ], dtype=np.float64)
+
+ T2M_KINEMATIC_CHAIN = [
+     [0, 2, 5, 8, 11],      # Hips -> RightUpLeg -> RightLeg -> RightFoot -> RightToe
+     [0, 1, 4, 7, 10],      # Hips -> LeftUpLeg -> LeftLeg -> LeftFoot -> LeftToe
+     [0, 3, 6, 9, 12, 15],  # Hips -> Spine -> Spine1 -> Spine2 -> Neck -> Head
+     [9, 14, 17, 19, 21],   # Spine2 -> RightShoulder -> RightArm -> RightForeArm -> RightHand
+     [9, 13, 16, 18, 20],   # Spine2 -> LeftShoulder -> LeftArm -> LeftForeArm -> LeftHand
+ ]
+
+ # Parent joint index for each of the 22 t2m joints
+ T2M_PARENTS = [-1] * 22
+ for _chain in T2M_KINEMATIC_CHAIN:
+     for _k in range(1, len(_chain)):
+         T2M_PARENTS[_chain[_k]] = _chain[_k - 1]
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # SMPL joint names / T-pose (for bone mapping reference)
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ SMPL_NAMES = [
+     "Hips", "LeftUpLeg", "RightUpLeg", "Spine",
+     "LeftLeg", "RightLeg", "Spine1", "LeftFoot",
+     "RightFoot", "Spine2", "LeftToeBase", "RightToeBase",
+     "Neck", "LeftShoulder", "RightShoulder", "Head",
+     "LeftArm", "RightArm", "LeftForeArm", "RightForeArm",
+     "LeftHand", "RightHand",
+ ]
+
+ # Approximate T-pose joint world positions in metres (Y-up, facing +Z)
+ # +X = character's RIGHT (standard SMPL/UniRig convention)
+ SMPL_TPOSE = np.array([
+     [ 0.000, 0.920,  0.000],  # 0  Hips
+     [-0.095, 0.920,  0.000],  # 1  LeftUpLeg (character's left = -X)
+     [ 0.095, 0.920,  0.000],  # 2  RightUpLeg
+     [ 0.000, 0.980,  0.000],  # 3  Spine
+     [-0.095, 0.495,  0.000],  # 4  LeftLeg
+     [ 0.095, 0.495,  0.000],  # 5  RightLeg
+     [ 0.000, 1.050,  0.000],  # 6  Spine1
+     [-0.095, 0.075,  0.000],  # 7  LeftFoot
+     [ 0.095, 0.075,  0.000],  # 8  RightFoot
+     [ 0.000, 1.120,  0.000],  # 9  Spine2
+     [-0.095, 0.000, -0.020],  # 10 LeftToeBase
+     [ 0.095, 0.000, -0.020],  # 11 RightToeBase
+     [ 0.000, 1.370,  0.000],  # 12 Neck
+     [-0.130, 1.290,  0.000],  # 13 LeftShoulder
+     [ 0.130, 1.290,  0.000],  # 14 RightShoulder
+     [ 0.000, 1.500,  0.000],  # 15 Head
+     [-0.330, 1.290,  0.000],  # 16 LeftArm
+     [ 0.330, 1.290,  0.000],  # 17 RightArm
+     [-0.630, 1.290,  0.000],  # 18 LeftForeArm
+     [ 0.630, 1.290,  0.000],  # 19 RightForeArm
+     [-0.910, 1.290,  0.000],  # 20 LeftHand
+     [ 0.910, 1.290,  0.000],  # 21 RightHand
+ ], dtype=np.float32)
+
+ # Name hint table: lowercase substrings -> SMPL joint index
+ _NAME_HINTS: list[tuple[list[str], int]] = [
+     (["hips","pelvis","root"], 0),
+     (["leftupleg","l_upleg","leftthigh","lefthip","thigh_l"], 1),
+     (["rightupleg","r_upleg","rightthigh","righthip","thigh_r"], 2),
+     (["spine","spine0","spine_01"], 3),
+     (["leftleg","leftknee","lowerleg_l","knee_l"], 4),
+     (["rightleg","rightknee","lowerleg_r","knee_r"], 5),
+     (["spine1","spine_02"], 6),
+     (["leftfoot","l_foot","foot_l"], 7),
+     (["rightfoot","r_foot","foot_r"], 8),
+     (["spine2","spine_03","chest"], 9),
+     (["lefttoebase","lefttoe","l_toe","toe_l"], 10),
+     (["righttoebase","righttoe","r_toe","toe_r"], 11),
+     (["neck"], 12),
+     (["leftshoulder","leftcollar","clavicle_l"], 13),
+     (["rightshoulder","rightcollar","clavicle_r"], 14),
+     (["head"], 15),
+     (["leftarm","upperarm_l","l_arm"], 16),
+     (["rightarm","upperarm_r","r_arm"], 17),
+     (["leftforearm","lowerarm_l","l_forearm"], 18),
+     (["rightforearm","lowerarm_r","r_forearm"], 19),
+     (["lefthand","hand_l","l_hand"], 20),
+     (["righthand","hand_r","r_hand"], 21),
+ ]
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # Quaternion helpers (scalar-first WXYZ convention throughout)
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ _ID_QUAT = np.array([1., 0., 0., 0.], dtype=np.float32)
+ _ID_MAT3 = np.eye(3, dtype=np.float64)
+
+ def _qmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
+     aw, ax, ay, az = a
+     bw, bx, by, bz = b
+     return np.array([
+         aw*bw - ax*bx - ay*by - az*bz,
+         aw*bx + ax*bw + ay*bz - az*by,
+         aw*by - ax*bz + ay*bw + az*bx,
+         aw*bz + ax*by - ay*bx + az*bw,
+     ], dtype=np.float32)
+
+ def _qnorm(q: np.ndarray) -> np.ndarray:
+     n = np.linalg.norm(q)
+     return (q / n) if n > 1e-12 else _ID_QUAT.copy()
+
+ def _qinv(q: np.ndarray) -> np.ndarray:
+     """Conjugate = inverse for unit quaternion."""
+     return q * np.array([1., -1., -1., -1.], dtype=np.float32)
+
+ def _quat_to_mat(q: np.ndarray) -> np.ndarray:
+     """WXYZ quaternion -> 3x3 rotation matrix (float64)."""
+     w, x, y, z = q.astype(np.float64)
+     return np.array([
+         [1-2*(y*y+z*z),   2*(x*y-w*z),   2*(x*z+w*y)],
+         [  2*(x*y+w*z), 1-2*(x*x+z*z),   2*(y*z-w*x)],
+         [  2*(x*z-w*y),   2*(y*z+w*x), 1-2*(x*x+y*y)],
+     ], dtype=np.float64)
+
+ def _mat_to_quat(m: np.ndarray) -> np.ndarray:
+     """3x3 rotation matrix -> WXYZ quaternion (float32, positive-W)."""
+     from scipy.spatial.transform import Rotation
+     xyzw = Rotation.from_matrix(m.astype(np.float64)).as_quat()
+     wxyz = np.array([xyzw[3], xyzw[0], xyzw[1], xyzw[2]], dtype=np.float32)
+     if wxyz[0] < 0:
+         wxyz = -wxyz
+     return wxyz
+
+ def _r_between(u: np.ndarray, v: np.ndarray) -> np.ndarray:
+     """
+     Minimal rotation matrix (3x3) that maps unit vector u to unit vector v.
+     Uses the Rodrigues formula; handles parallel/antiparallel cases.
+     """
+     u = u / (np.linalg.norm(u) + 1e-12)
+     v = v / (np.linalg.norm(v) + 1e-12)
+     c = float(np.dot(u, v))
+     if c >= 1.0 - 1e-7:
+         return _ID_MAT3.copy()
+     if c <= -1.0 + 1e-7:
+         # 180 degree rotation: pick any perpendicular axis
+         perp = np.array([1., 0., 0.]) if abs(u[0]) < 0.9 else np.array([0., 1., 0.])
+         ax = np.cross(u, perp)
+         ax /= np.linalg.norm(ax)
+         return 2.0 * np.outer(ax, ax) - _ID_MAT3
+     ax = np.cross(u, v)  # sin(theta) * rotation axis
+     s = np.linalg.norm(ax)
+     K = np.array([[     0, -ax[2],  ax[1]],
+                   [ ax[2],      0, -ax[0]],
+                   [-ax[1],  ax[0],      0]], dtype=np.float64)
+     return _ID_MAT3 + K + K @ K * ((1.0 - c) / (s * s + 1e-12))
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # GLB skin reader
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ def _read_glb_skin(rigged_glb: str):
+     """
+     Return (gltf, skin, ibm[n,4,4], node_trs{name->(t,r_wxyz,s)},
+     bone_names[], bone_parent_map{name->parent_name_or_None}).
+
+     ibm is stored as-read from the binary blob (column-major from glTF spec).
+     Callers must use inv(ibm[i].T)[:3,3] to get correct world positions.
+     """
+     import base64
+     import pygltflib
+
+     gltf = pygltflib.GLTF2().load(rigged_glb)
+     if not gltf.skins:
+         raise ValueError(f"No skin found in {rigged_glb}")
+     skin = gltf.skins[0]
+
+     def _raw_bytes(buf):
+         if buf.uri is None:
+             return bytes(gltf.binary_blob())
+         if buf.uri.startswith("data:"):
+             return base64.b64decode(buf.uri.split(",", 1)[1])
+         from pathlib import Path
+         return (Path(rigged_glb).parent / buf.uri).read_bytes()
+
+     acc = gltf.accessors[skin.inverseBindMatrices]
+     bv = gltf.bufferViews[acc.bufferView]
+     raw = _raw_bytes(gltf.buffers[bv.buffer])
+     start = (bv.byteOffset or 0) + (acc.byteOffset or 0)
+     n = acc.count
+     ibm = np.frombuffer(raw[start: start + n * 64], dtype=np.float32).reshape(n, 4, 4)
+
+     # Build node parent map (node_index -> parent_node_index)
+     node_parent: dict[int, int] = {}
+     for ni, node in enumerate(gltf.nodes):
+         for child_idx in (node.children or []):
+             node_parent[child_idx] = ni
+
+     joint_set = set(skin.joints)
+     bone_names = []
+     node_trs: dict[str, tuple] = {}
+     bone_parent_map: dict[str, str | None] = {}
+
+     for i, j_idx in enumerate(skin.joints):
+         node = gltf.nodes[j_idx]
+         name = node.name or f"bone_{i}"
+         bone_names.append(name)
+
+         t = np.array(node.translation or [0., 0., 0.], dtype=np.float32)
+         r_xyzw = np.array(node.rotation or [0., 0., 0., 1.], dtype=np.float32)
+         s = np.array(node.scale or [1., 1., 1.], dtype=np.float32)
+         r_wxyz = np.array([r_xyzw[3], r_xyzw[0], r_xyzw[1], r_xyzw[2]], dtype=np.float32)
+         node_trs[name] = (t, r_wxyz, s)
+
+         # Find parent bone (walk up node hierarchy to nearest joint)
+         parent_node = node_parent.get(j_idx)
+         parent_name: str | None = None
+         while parent_node is not None:
+             if parent_node in joint_set:
+                 pnode = gltf.nodes[parent_node]
+                 parent_name = pnode.name or f"bone_{skin.joints.index(parent_node)}"
+                 break
+             parent_node = node_parent.get(parent_node)
+         bone_parent_map[name] = parent_name
+
+     print(f"[GLB] {len(bone_names)} bones from skin '{skin.name or 'Armature'}'")
+     return gltf, skin, ibm, node_trs, bone_names, bone_parent_map
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # Bone mapping
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ def _strip_name(name: str) -> str:
+     name = re.sub(r'^(mixamorig:|j_bip_[lcr]_|cc_base_|bip01_|rig:|chr:)',
+                   "", name, flags=re.IGNORECASE)
+     return re.sub(r'[_\-\s.]', "", name).lower()
+
+
+ def build_bone_map(
+     rigged_glb: str,
+     verbose: bool = True,
+ ) -> tuple[dict, dict, float, dict, dict]:
+     """
+     Map UniRig bone names -> SMPL joint index by spatial proximity + name hints.
+
+     Returns
+     -------
+     bone_to_smpl    : {bone_name: smpl_joint_index}
+     node_trs        : {bone_name: (t[3], r_wxyz[4], s[3])}
+     height_scale    : float (UniRig height / SMPL reference height)
+     bone_parent_map : {bone_name: parent_bone_name_or_None}
+     ur_pos_by_name  : {bone_name: world_pos[3]}
+     """
+     _gltf, _skin, ibm, node_trs, bone_names, bone_parent_map = _read_glb_skin(rigged_glb)
+
+     # FIX: glTF IBMs are stored column-major.
+     # numpy reads as row-major, so the stored data is the TRANSPOSE of the actual matrix.
+     # Correct world position = inv(actual_IBM)[:3,3] = inv(ibm[i].T)[:3,3]
+     ur_pos = np.array([
+         np.linalg.inv(ibm[i].T)[:3, 3] for i in range(len(bone_names))
+     ], dtype=np.float32)
+
+     ur_pos_by_name = {name: ur_pos[i] for i, name in enumerate(bone_names)}
+
+     # Scale SMPL T-pose to match character height
+     ur_h = ur_pos[:, 1].max() - ur_pos[:, 1].min()
+     sm_h = SMPL_TPOSE[:, 1].max() - SMPL_TPOSE[:, 1].min()
+     h_sc = (ur_h / sm_h) if sm_h > 1e-6 else 1.0
+     sm_pos = SMPL_TPOSE * h_sc
+
+     # FIX: Normalise ur and smpl Y ranges independently (floor=0, top=1 for each).
+     # The old code used a shared reference which caused floor offsets to misalign.
+     def _norm_independent(pos, own_range_min, own_range_max, x_range, z_range):
+         p = pos.copy().astype(np.float64)
+         y_range = (own_range_max - own_range_min) or 1.0
+         p[:, 0] /= (x_range or 1.0)
+         p[:, 1] = (p[:, 1] - own_range_min) / y_range
+         p[:, 2] /= (z_range or 1.0)
+         return p
+
+     # Common X/Z scale (use both skeletons' width for reference)
+     x_range = max(
+         abs(ur_pos[:, 0].max() - ur_pos[:, 0].min()),
+         abs(sm_pos[:, 0].max() - sm_pos[:, 0].min()),
+     ) or 1.0
+     z_range = max(
+         abs(ur_pos[:, 2].max() - ur_pos[:, 2].min()),
+         abs(sm_pos[:, 2].max() - sm_pos[:, 2].min()),
+     ) or 1.0
+
+     ur_n = _norm_independent(ur_pos, ur_pos[:, 1].min(), ur_pos[:, 1].max(), x_range, z_range)
+     sm_n = _norm_independent(sm_pos, sm_pos[:, 1].min(), sm_pos[:, 1].max(), x_range, z_range)
+
+     dist = np.linalg.norm(ur_n[:, None] - sm_n[None], axis=-1)  # [M, 22]
+     d_sc = 1.0 - np.clip(dist / (dist.max() + 1e-9), 0, 1)
+
+     # Name hint score
+     n_sc = np.zeros((len(bone_names), 22), dtype=np.float32)
+     for mi, bname in enumerate(bone_names):
+         stripped = _strip_name(bname)
+         for kws, ji in _NAME_HINTS:
+             if any(kw in stripped for kw in kws):
+                 n_sc[mi, ji] = 1.0
+
+     combined = 0.6 * d_sc + 0.4 * n_sc  # [M, 22]
+
+     # Greedy assignment
+     THRESHOLD = 0.35
+     pairs = sorted(
+         ((mi, ji, combined[mi, ji])
+          for mi in range(len(bone_names))
+          for ji in range(22)),
+         key=lambda x: -x[2],
+     )
+     bone_to_smpl: dict[str, int] = {}
+     taken: set[int] = set()
+     for mi, ji, score in pairs:
+         if score < THRESHOLD:
+             break
+         bname = bone_names[mi]
+         if bname in bone_to_smpl or ji in taken:
+             continue
+         bone_to_smpl[bname] = ji
+         taken.add(ji)
+
+     if verbose:
+         n_mapped = len(bone_to_smpl)
+         print(f"\n[MAP] {n_mapped}/{len(bone_names)} bones mapped to SMPL joints:")
+         for bname, ji in sorted(bone_to_smpl.items(), key=lambda x: x[1]):
+             print(f"  {bname:<40} -> {SMPL_NAMES[ji]}")
+         unmapped = [n for n in bone_names if n not in bone_to_smpl]
+         if unmapped:
+             preview = ", ".join(unmapped[:8])
+             print(f"[MAP] {len(unmapped)} unmapped (identity): {preview}"
+                   + (" ..." if len(unmapped) > 8 else ""))
+         print()
+
+     return bone_to_smpl, node_trs, h_sc, bone_parent_map, ur_pos_by_name
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # T2M forward kinematics (world rotation matrices)
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ def _compute_t2m_world_rots(
+     root_rot_wxyz: np.ndarray,    # [4] WXYZ
+     local_rots_wxyz: np.ndarray,  # [21, 4] WXYZ (joints 1-21)
+ ) -> np.ndarray:
+     """
+     Compute accumulated world rotation matrices for all 22 t2m joints at one frame.
+     Matches skeleton.py's forward_kinematics_cont6d_np: each chain RESETS to R_root.
+
+     Returns [22, 3, 3] world rotation matrices.
+     """
+     R_root = _quat_to_mat(root_rot_wxyz)
+     world_rots = np.zeros((22, 3, 3), dtype=np.float64)
+     world_rots[0] = R_root
+
+     for chain in T2M_KINEMATIC_CHAIN:
+         R = R_root.copy()  # always start from R_root (matches skeleton.py)
+         for i in range(1, len(chain)):
+             j = chain[i]
+             R_local = _quat_to_mat(local_rots_wxyz[j - 1])  # j-1: joints 1-21
+             R = R @ R_local
+             world_rots[j] = R
+
+     return world_rots
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # Keyframe builder — world-direction matching
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ def build_keyframes(
+     motion: SMPLMotion,
+     bone_to_smpl: dict[str, int],
+     node_trs: dict[str, tuple],
+     height_scale: float,
+     bone_parent_map: dict[str, str | None],
+     ur_pos_by_name: dict[str, np.ndarray],
+ ) -> list[dict]:
+     """
+     Convert SMPLMotion -> List[Dict[bone_name -> (loc, rot_delta, scale)]]
+     using world-direction matching retargeting.
+     """
+     T = motion.num_frames
+     zeros3 = np.zeros(3, dtype=np.float32)
+     ones3 = np.ones(3, dtype=np.float32)
+
+     # Topological order: root joints (si==0) first, then by SMPL joint index
+     # (parents always have lower SMPL indices in the kinematic chain)
+     sorted_bones = sorted(bone_to_smpl.keys(), key=lambda b: bone_to_smpl[b])
+
+     keyframes: list[dict] = []
+
+     for ti in range(T):
+         frame: dict = {}
+
+         # T2M world rotation matrices for this frame
+         world_rots_t2m = _compute_t2m_world_rots(
+             motion.root_rot[ti].astype(np.float64),
+             motion.local_rot[ti].astype(np.float64),
+         )
+
+         # Track UniRig world rotations per bone (needed for child local rotations)
+         world_rot_ur: dict[str, np.ndarray] = {}
+
+         for bname in sorted_bones:
+             si = bone_to_smpl[bname]
+             rest_t, rest_r, _rest_s = node_trs[bname]
+             rest_t = rest_t.astype(np.float32)
+             rest_r_mat = _quat_to_mat(rest_r)
+
+             # ── Root bone (si == 0): drive world translation + facing rotation ──
+             if si == 0:
+                 world_pos = motion.root_pos[ti].astype(np.float64) * height_scale
+                 pose_loc = (world_pos - rest_t.astype(np.float64)).astype(np.float32)
+
+                 # Root world rotation = t2m root rotation (Y-axis only)
+                 R_world_root = _quat_to_mat(motion.root_rot[ti])
+                 world_rot_ur[bname] = R_world_root
+
+                 # pose_rot_delta = inv(rest_r) @ target_world_rot
+                 pose_rot_mat = rest_r_mat.T @ R_world_root
+                 pose_rot = _mat_to_quat(pose_rot_mat)
+                 frame[bname] = (pose_loc, pose_rot, ones3)
+                 continue
+
+             # ── Non-root bone: world-direction matching ──────────────────────
+
+             # T2M world bone direction (in t2m coordinate frame)
+             raw_dir_t2m = world_rots_t2m[si] @ T2M_RAW_OFFSETS[si]  # [3]
+
+             # COORDINATE FRAME CONVERSION: t2m +X = character LEFT; SMPL +X = character RIGHT
+             # Flip X to convert t2m world directions -> SMPL/UniRig world directions
+             d_desired = np.array([-raw_dir_t2m[0], raw_dir_t2m[1], raw_dir_t2m[2]])
+             d_desired_norm = d_desired / (np.linalg.norm(d_desired) + 1e-12)
+
+             # UniRig rest bone direction (from inverse bind matrices, world space)
+             parent_b = bone_parent_map.get(bname)
+             if parent_b and parent_b in ur_pos_by_name:
+                 d_rest = (ur_pos_by_name[bname] - ur_pos_by_name[parent_b]).astype(np.float64)
+             else:
+                 d_rest = ur_pos_by_name[bname].astype(np.float64)
+             d_rest_norm = d_rest / (np.linalg.norm(d_rest) + 1e-12)
+
+             # Minimal world-space rotation: rest direction -> desired direction
+             R_world_desired = _r_between(d_rest_norm, d_desired_norm)  # [3, 3]
+             world_rot_ur[bname] = R_world_desired
+
+             # Local rotation = inv(parent_world) @ R_world_desired
+             if parent_b and parent_b in world_rot_ur:
+                 R_parent = world_rot_ur[parent_b]
+             else:
+                 R_parent = _ID_MAT3
+
+             local_rot_mat = R_parent.T @ R_world_desired  # R_parent^-1 @ R_world
+
+             # pose_rot_delta = inv(rest_r) @ local_rot
+             # (glTF applies: final = rest_r @ pose_rot_delta = local_rot)
+             pose_rot_mat = rest_r_mat.T @ local_rot_mat
+             pose_rot = _mat_to_quat(pose_rot_mat)
+
+             frame[bname] = (zeros3, pose_rot, ones3)
+
+         keyframes.append(frame)
+
+     return keyframes
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
535
+ # Public API
536
+ # ��─────────────────────────────────────────────────────────────────────────────
537
+
538
+ def animate_glb(
539
+ motion: Union[np.ndarray, list, SMPLMotion],
540
+ rigged_glb: str,
541
+ output_glb: str,
542
+ fps: float = 20.0,
543
+ start_frame: int = 0,
544
+ num_frames: int = -1,
545
+ ) -> str:
546
+ """
547
+ Bake a HumanML3D motion clip onto a UniRig-rigged GLB.
548
+
549
+ Parameters
550
+ ----------
551
+ motion : [T, 263] ndarray, list, or pre-parsed SMPLMotion
552
+ rigged_glb : path to UniRig merge output (.glb with a skin)
553
+ output_glb : destination path for animated GLB
554
+ fps : frame rate embedded in the animation track
555
+ start_frame / num_frames : optional clip range (-1 = all frames)
556
+
557
+ Returns str absolute path to output_glb.
558
+ """
559
+ from .io.gltf_io import write_gltf_animation
560
+
561
+ # 1. Parse motion
562
+ if isinstance(motion, SMPLMotion):
563
+ smpl = motion
564
+ else:
565
+ data = np.asarray(motion, dtype=np.float32)
566
+ if data.ndim != 2 or data.shape[1] < 193:
567
+ raise ValueError(f"Expected [T, 263] HumanML3D features, got {data.shape}")
568
+ smpl = hml3d_to_smpl_motion(data, fps=fps)
569
+
570
+ # 2. Slice
571
+ end = (start_frame + num_frames) if num_frames > 0 else smpl.num_frames
572
+ smpl = smpl.slice(start_frame, end)
573
+ print(f"[animate] {smpl.num_frames} frames @ {fps:.0f} fps -> {output_glb}")
574
+
575
+ # 3. Build bone map (now returns parent map and world positions too)
576
+ bone_to_smpl, node_trs, h_sc, bone_parent_map, ur_pos_by_name = \
577
+ build_bone_map(rigged_glb, verbose=True)
578
+ if not bone_to_smpl:
579
+ raise RuntimeError(
580
+ "build_bone_map returned 0 matches. "
581
+ "Ensure the GLB has a valid skin with readable inverse bind matrices."
582
+ )
583
+
584
+ # 4. Build keyframes using world-direction matching
585
+ keyframes = build_keyframes(smpl, bone_to_smpl, node_trs, h_sc,
586
+ bone_parent_map, ur_pos_by_name)
587
+
588
+ # 5. Write GLB
589
+ out_dir = os.path.dirname(os.path.abspath(output_glb))
590
+ if out_dir:
591
+ os.makedirs(out_dir, exist_ok=True)
592
+
593
+ write_gltf_animation(
594
+ source_filepath=rigged_glb,
595
+ dest_armature=None,
596
+ keyframes=keyframes,
597
+ output_filepath=output_glb,
598
+ fps=float(fps),
599
+ )
600
+
601
+ return output_glb
602
+
603
+
604
+ # Backwards-compatibility alias
605
+ def animate_glb_from_hml3d(
606
+ motion, rigged_glb, output_glb, fps=20, start_frame=0, num_frames=-1
607
+ ):
608
+ return animate_glb(
609
+ motion, rigged_glb, output_glb,
610
+ fps=fps, start_frame=start_frame, num_frames=num_frames,
611
+ )
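The rotation chain used in `build_keyframes` above (world-space swing from rest direction to desired direction, converted to a local rotation, then to a pose delta relative to the rest rotation) can be exercised in isolation. A minimal numpy sketch, where `r_between` is a stand-in for the module's `_r_between` helper (Rodrigues' formula), not the actual implementation:

```python
import numpy as np

def r_between(a, b):
    """Minimal rotation matrix taking unit vector a onto unit vector b (Rodrigues)."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    v = np.cross(a, b)
    c = float(np.dot(a, b))
    if c < -0.999999:  # antiparallel: rotate 180 degrees about any perpendicular axis
        axis = np.cross(a, [1.0, 0.0, 0.0])
        if np.linalg.norm(axis) < 1e-6:
            axis = np.cross(a, [0.0, 1.0, 0.0])
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    K = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
    return np.eye(3) + K + K @ K / (1.0 + c)

# World-space swing: rest bone direction -> desired bone direction
d_rest = np.array([0.0, 1.0, 0.0])
d_desired = np.array([1.0, 0.0, 0.0])
R_world = r_between(d_rest, d_desired)

# Local rotation = inv(parent_world) @ R_world; pose delta = inv(rest_r) @ local
R_parent = np.eye(3)  # toy case: root bone, identity parent
rest_r = np.eye(3)    # toy case: identity rest rotation
local_rot = R_parent.T @ R_world
pose_rot = rest_r.T @ local_rot

# glTF applies final = rest_r @ pose_rot, which must reproduce local_rot
assert np.allclose(rest_r @ pose_rot, local_rot)
assert np.allclose(R_world @ d_rest, d_desired, atol=1e-6)
```

With identity parent and rest rotations the delta collapses to `R_world` itself; in the real code both matrices come from the GLB node hierarchy, so the two transposed multiplications do the actual work.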
Retarget/cli.py ADDED
@@ -0,0 +1,129 @@
1
+ """
2
+ cli.py
3
+ Command-line interface for rig_retarget.
4
+
5
+ Usage:
6
+ python -m rig_retarget.cli \\
7
+ --source walk.bvh \\
8
+ --dest unirig_character.glb \\
9
+ --mapping radical2unirig.json \\
10
+ --output animated_character.glb \\
11
+ [--fps 30] [--start 0] [--frames 100] [--step 1]
12
+
13
+ # Calculate corrections only (no transfer):
14
+ python -m rig_retarget.cli --calc-corrections \\
15
+ --source walk.bvh --dest unirig_character.glb \\
16
+ --mapping mymap.json
17
+ """
18
+ from __future__ import annotations
19
+ import argparse
20
+ import sys
21
+ from pathlib import Path
22
+
23
+
24
+ def _parse_args(argv=None):
25
+ p = argparse.ArgumentParser(
26
+ prog="rig_retarget",
27
+ description="Retarget animation from BVH/glTF source onto UniRig/glTF destination.",
28
+ )
29
+ p.add_argument("--source", required=True, help="Source animation file (.bvh or .glb/.gltf)")
30
+ p.add_argument("--dest", required=True, help="Destination skeleton file (.glb/.gltf, UniRig output)")
31
+ p.add_argument("--mapping", required=True, help="KeeMap-compatible JSON bone mapping file")
32
+ p.add_argument("--output", default=None, help="Output animated .glb (default: dest_retargeted.glb)")
33
+ p.add_argument("--fps", type=float, default=30.0)
34
+ p.add_argument("--start", type=int, default=0, help="Start frame index (0-based)")
35
+ p.add_argument("--frames", type=int, default=None, help="Number of frames to transfer (default: all)")
36
+ p.add_argument("--step", type=int, default=1, help="Keyframe every N source frames")
37
+ p.add_argument("--skin", type=int, default=0, help="Skin index in destination glTF")
38
+ p.add_argument("--calc-corrections", action="store_true",
39
+ help="Auto-calculate bone corrections and update the mapping JSON, then exit.")
40
+ p.add_argument("--verbose", action="store_true")
41
+ return p.parse_args(argv)
42
+
43
+
44
+ def main(argv=None) -> None:
45
+ args = _parse_args(argv)
46
+
47
+ from .io.mapping import load_mapping, save_mapping, KeeMapSettings
48
+ from .io.gltf_io import load_gltf, write_gltf_animation
49
+ from .retarget import (
50
+ calc_all_corrections, transfer_animation,
51
+ )
52
+
53
+ # -----------------------------------------------------------------------
54
+ # Load mapping
55
+ # -----------------------------------------------------------------------
56
+ print(f"[*] Loading mapping : {args.mapping}")
57
+ settings, bone_items = load_mapping(args.mapping)
58
+
59
+ # Override settings from CLI args
60
+ settings.start_frame_to_apply = args.start
61
+ settings.keyframe_every_n_frames = args.step
62
+
63
+ # -----------------------------------------------------------------------
64
+ # Load source animation
65
+ # -----------------------------------------------------------------------
66
+ src_path = Path(args.source)
67
+ print(f"[*] Loading source : {src_path}")
68
+
69
+ if src_path.suffix.lower() == ".bvh":
70
+ from .io.bvh import load_bvh
71
+ src_anim = load_bvh(str(src_path))
72
+ if args.verbose:
73
+ print(f" BVH: {src_anim.num_frames} frames, "
74
+ f"{src_anim.frame_time*1000:.1f} ms/frame, "
75
+ f"{len(src_anim.armature.pose_bones)} joints")
76
+ elif src_path.suffix.lower() in (".glb", ".gltf"):
77
+ # glTF source — load skeleton only; animation reading is TODO
78
+ raise NotImplementedError(
79
+ "glTF source animation reading is not yet implemented. "
80
+ "Use a BVH file for the source animation."
81
+ )
82
+ else:
83
+ print(f"[!] Unsupported source format: {src_path.suffix}", file=sys.stderr)
84
+ sys.exit(1)
85
+
86
+ if args.frames is not None:
87
+ settings.number_of_frames_to_apply = args.frames
88
+ else:
89
+ settings.number_of_frames_to_apply = src_anim.num_frames - args.start
90
+
91
+ # -----------------------------------------------------------------------
92
+ # Load destination skeleton
93
+ # -----------------------------------------------------------------------
94
+ dst_path = Path(args.dest)
95
+ print(f"[*] Loading dest : {dst_path}")
96
+ dst_arm = load_gltf(str(dst_path), skin_index=args.skin)
97
+ if args.verbose:
98
+ print(f" Skeleton: {len(dst_arm.pose_bones)} bones")
99
+
100
+ # -----------------------------------------------------------------------
101
+ # Auto-correct pass (optional)
102
+ # -----------------------------------------------------------------------
103
+ if args.calc_corrections:
104
+ print("[*] Calculating bone corrections ...")
105
+ src_anim.apply_frame(args.start)
106
+ calc_all_corrections(bone_items, src_anim.armature, dst_arm, settings)
107
+ save_mapping(args.mapping, settings, bone_items)
108
+ print(f"[*] Updated mapping saved → {args.mapping}")
109
+ return
110
+
111
+ # -----------------------------------------------------------------------
112
+ # Transfer
113
+ # -----------------------------------------------------------------------
114
+ print(f"[*] Transferring {settings.number_of_frames_to_apply} frames "
115
+ f"(start={settings.start_frame_to_apply}, step={settings.keyframe_every_n_frames}) ...")
116
+ keyframes = transfer_animation(src_anim, dst_arm, bone_items, settings)
117
+ print(f"[*] Generated {len(keyframes)} keyframes")
118
+
119
+ # -----------------------------------------------------------------------
120
+ # Write output
121
+ # -----------------------------------------------------------------------
122
+ out_path = args.output or str(dst_path.with_name(dst_path.stem + "_retargeted.glb"))
123
+ print(f"[*] Writing output : {out_path}")
124
+ write_gltf_animation(str(dst_path), dst_arm, keyframes, out_path, fps=args.fps, skin_index=args.skin)
125
+ print("[✓] Done")
126
+
127
+
128
+ if __name__ == "__main__":
129
+ main()
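When `--output` is omitted, the CLI derives the destination from `--dest` (`dst_path.with_name(dst_path.stem + "_retargeted.glb")`). A small self-contained sketch of that fallback, with `default_output` as a hypothetical helper name:

```python
from pathlib import Path

def default_output(dest: str) -> str:
    """Mirror the CLI fallback: <dest stem>_retargeted.glb next to the dest file."""
    dst_path = Path(dest)
    return str(dst_path.with_name(dst_path.stem + "_retargeted.glb"))

print(default_output("models/unirig_character.glb"))
```

Note this keeps the output in the same directory as the destination skeleton, so a `.gltf` destination still yields a `.glb` output name.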
Retarget/generate.py ADDED
@@ -0,0 +1,131 @@
1
+ """
2
+ generate.py
3
+ ───────────────────────────────────────────────────────────────────────────────
4
+ Text-to-motion generation.
5
+
6
+ Primary backend: MoMask inference server running on the Vast.ai instance.
7
+ Returns [T, 263] HumanML3D features directly — no SMPL
8
+ body mesh required.
9
+
10
+ Fallback backend: HumanML3D dataset keyword search (offline / no GPU needed).
11
+
12
+ Usage
13
+ ─────
14
+ from Retarget.generate import generate_motion
15
+
16
+ # Use MoMask on instance
17
+ motion = generate_motion("a person walks forward",
18
+ backend_url="http://ssh4.vast.ai:8765")
19
+
20
+ # Local fallback (streams HuggingFace dataset)
21
+ motion = generate_motion("a person walks forward")
22
+
23
+ # Returned motion: np.ndarray [T, 263]
24
+ # Feed directly to animate_glb()
25
+ """
26
+ from __future__ import annotations
27
+ import json
28
+ import numpy as np
29
+
30
+
31
+ # ──────────────────────────────────────────────────────────────────────────────
32
+ # Public API
33
+ # ──────────────────────────────────────────────────────────────────────────────
34
+
35
+ def generate_motion(
36
+ prompt: str,
37
+ backend_url: str | None = None,
38
+ num_frames: int = 196,
39
+ fps: float = 20.0,
40
+ seed: int = -1,
41
+ ) -> np.ndarray:
42
+ """
43
+ Generate a HumanML3D [T, 263] motion array from a text prompt.
44
+
45
+ Parameters
46
+ ----------
47
+ prompt
48
+ Natural language description of the desired motion.
49
+ Examples: "a person walks forward", "someone does a jumping jack",
50
+ "a man waves hello with his right hand"
51
+ backend_url
52
+ URL of the MoMask inference server. E.g. "http://ssh4.vast.ai:8765".
53
+ If None or if the server is unreachable, falls back to dataset search.
54
+ num_frames
55
+ Desired clip length in frames (at 20 fps; max ~196 ≈ 9.8 s).
56
+ fps
57
+ Target fps (MoMask natively produces 20 fps).
58
+ seed
59
+ Random seed for reproducibility (-1 = random).
60
+
61
+ Returns
62
+ -------
63
+ np.ndarray shape [T, 263] HumanML3D feature vector.
64
+ """
65
+ if backend_url:
66
+ try:
67
+ return _call_momask(prompt, backend_url, num_frames, seed)
68
+ except Exception as exc:
69
+ print(f"[generate] MoMask unreachable ({exc}) — falling back to dataset search")
70
+
71
+ return _dataset_search_fallback(prompt)
72
+
73
+
74
+ # ──────────────────────────────────────────────────────────────────────────────
75
+ # MoMask backend
76
+ # ──────────────────────────────────────────────────────────────────────────────
77
+
78
+ def _call_momask(
79
+ prompt: str,
80
+ url: str,
81
+ num_frames: int,
82
+ seed: int,
83
+ ) -> np.ndarray:
84
+ """POST to the MoMask inference server; return [T, 263] array."""
85
+ import urllib.request
86
+
87
+ payload = json.dumps({
88
+ "prompt": prompt,
89
+ "num_frames": num_frames,
90
+ "seed": seed,
91
+ }).encode("utf-8")
92
+
93
+ req = urllib.request.Request(
94
+ f"{url.rstrip('/')}/generate",
95
+ data=payload,
96
+ headers={"Content-Type": "application/json"},
97
+ method="POST",
98
+ )
99
+ with urllib.request.urlopen(req, timeout=180) as resp:
100
+ result = json.loads(resp.read())
101
+
102
+ motion = np.array(result["motion"], dtype=np.float32)
103
+ if motion.ndim != 2 or motion.shape[1] < 193:
104
+ raise ValueError(f"Server returned unexpected shape {motion.shape}")
105
+
106
+ print(f"[generate] MoMask: {motion.shape[0]} frames for '{prompt}'")
107
+ return motion
108
+
109
+
110
+ # ──────────────────────────────────────────────────────────────────────────────
111
+ # Dataset search fallback
112
+ # ──────────────────────────────────────────────────────────────────────────────
113
+
114
+ def _dataset_search_fallback(prompt: str) -> np.ndarray:
115
+ """
116
+ Keyword search in TeoGchx/HumanML3D dataset (streaming, HuggingFace).
117
+ Used when no MoMask server is available.
118
+ """
119
+ from .search import search_motions, format_choice_label
120
+
121
+ print(f"[generate] Searching HumanML3D dataset for: '{prompt}'")
122
+ results = search_motions(prompt, top_k=5, split="test", max_scan=500)
123
+ if not results:
124
+ raise RuntimeError(
125
+ f"No motion found in dataset for prompt: {prompt!r}\n"
126
+ "Check your internet connection or deploy MoMask on the instance."
127
+ )
128
+
129
+ best = results[0]
130
+ print(f"[generate] Best match: {format_choice_label(best)}")
131
+ return np.array(best["motion"], dtype=np.float32)
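The request body `_call_momask` POSTs and the `[T, 263]` sanity check it applies to the response can both be exercised offline. A hedged sketch (`build_payload` and `validate_motion` are illustrative names, not part of the module):

```python
import json
import numpy as np

def build_payload(prompt: str, num_frames: int, seed: int) -> bytes:
    # Same JSON body _call_momask POSTs to <backend_url>/generate
    return json.dumps(
        {"prompt": prompt, "num_frames": num_frames, "seed": seed}
    ).encode("utf-8")

def validate_motion(result: dict) -> np.ndarray:
    # Same shape check applied to the server response: 2-D, feature dim >= 193
    motion = np.array(result["motion"], dtype=np.float32)
    if motion.ndim != 2 or motion.shape[1] < 193:
        raise ValueError(f"Server returned unexpected shape {motion.shape}")
    return motion

payload = build_payload("a person walks forward", 196, -1)
fake_response = {"motion": np.zeros((196, 263)).tolist()}
motion = validate_motion(fake_response)
print(motion.shape)  # (196, 263)
```

The `>= 193` threshold (rather than exactly 263) matches the check in `animate_glb`: only the first 193 feature columns are consumed downstream, so truncated feature sets still pass.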
Retarget/humanml3d_to_bvh.py ADDED
@@ -0,0 +1,813 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ humanml3d_to_bvh.py
4
+ Convert HumanML3D .npy motion files → BVH animation.
5
+ When a UniRig-rigged GLB (or ASCII FBX) is supplied via --rig, the BVH is
6
+ built using the UniRig skeleton's own bone names and hierarchy, with
7
+ automatic bone-to-SMPL-joint mapping — no Blender required.
8
+
9
+ Dependencies
10
+ numpy (always required)
11
+ pygltflib pip install pygltflib (required for --rig GLB files)
12
+
13
+ Usage
14
+ # SMPL-named BVH (no rig needed)
15
+ python humanml3d_to_bvh.py 000001.npy
16
+
17
+ # Retargeted to UniRig skeleton
18
+ python humanml3d_to_bvh.py 000001.npy --rig rigged_mesh.glb
19
+
20
+ # Explicit output + fps
21
+ python humanml3d_to_bvh.py 000001.npy --rig rigged_mesh.glb -o anim.bvh --fps 20
22
+ """
23
+
24
+ from __future__ import annotations
25
+ import argparse, re, sys
26
+ from dataclasses import dataclass, field
27
+ from pathlib import Path
28
+ from typing import Optional
29
+
30
+ import numpy as np
31
+
32
+ # ══════════════════════════════════════════════════════════════════════════════
33
+ # SMPL 22-joint skeleton definition
34
+ # ══════════════════════════════════════════════════════════════════════════════
35
+
36
+ SMPL_NAMES = [
37
+ "Hips", # 0 pelvis / root
38
+ "LeftUpLeg", # 1 left_hip
39
+ "RightUpLeg", # 2 right_hip
40
+ "Spine", # 3 spine1
41
+ "LeftLeg", # 4 left_knee
42
+ "RightLeg", # 5 right_knee
43
+ "Spine1", # 6 spine2
44
+ "LeftFoot", # 7 left_ankle
45
+ "RightFoot", # 8 right_ankle
46
+ "Spine2", # 9 spine3
47
+ "LeftToeBase", # 10 left_foot
48
+ "RightToeBase", # 11 right_foot
49
+ "Neck", # 12 neck
50
+ "LeftShoulder", # 13 left_collar
51
+ "RightShoulder", # 14 right_collar
52
+ "Head", # 15 head
53
+ "LeftArm", # 16 left_shoulder
54
+ "RightArm", # 17 right_shoulder
55
+ "LeftForeArm", # 18 left_elbow
56
+ "RightForeArm", # 19 right_elbow
57
+ "LeftHand", # 20 left_wrist
58
+ "RightHand", # 21 right_wrist
59
+ ]
60
+
61
+ SMPL_PARENT = [-1, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 9, 9, 12, 13, 14, 16, 17, 18, 19]
62
+ NUM_SMPL = 22
63
+
64
+ SMPL_TPOSE = np.array([
65
+ [ 0.000, 0.920, 0.000], # 0 Hips
66
+ [-0.095, 0.920, 0.000], # 1 LeftUpLeg
67
+ [ 0.095, 0.920, 0.000], # 2 RightUpLeg
68
+ [ 0.000, 0.980, 0.000], # 3 Spine
69
+ [-0.095, 0.495, 0.000], # 4 LeftLeg
70
+ [ 0.095, 0.495, 0.000], # 5 RightLeg
71
+ [ 0.000, 1.050, 0.000], # 6 Spine1
72
+ [-0.095, 0.075, 0.000], # 7 LeftFoot
73
+ [ 0.095, 0.075, 0.000], # 8 RightFoot
74
+ [ 0.000, 1.120, 0.000], # 9 Spine2
75
+ [-0.095, 0.000, -0.020], # 10 LeftToeBase
76
+ [ 0.095, 0.000, -0.020], # 11 RightToeBase
77
+ [ 0.000, 1.370, 0.000], # 12 Neck
78
+ [-0.130, 1.290, 0.000], # 13 LeftShoulder
79
+ [ 0.130, 1.290, 0.000], # 14 RightShoulder
80
+ [ 0.000, 1.500, 0.000], # 15 Head
81
+ [-0.330, 1.290, 0.000], # 16 LeftArm
82
+ [ 0.330, 1.290, 0.000], # 17 RightArm
83
+ [-0.630, 1.290, 0.000], # 18 LeftForeArm
84
+ [ 0.630, 1.290, 0.000], # 19 RightForeArm
85
+ [-0.910, 1.290, 0.000], # 20 LeftHand
86
+ [ 0.910, 1.290, 0.000], # 21 RightHand
87
+ ], dtype=np.float32)
88
+
89
+ _SMPL_CHILDREN: list[list[int]] = [[] for _ in range(NUM_SMPL)]
90
+ for _j, _p in enumerate(SMPL_PARENT):
91
+ if _p >= 0:
92
+ _SMPL_CHILDREN[_p].append(_j)
93
+
94
+
95
+ def _smpl_dfs() -> list[int]:
96
+ order, stack = [], [0]
97
+ while stack:
98
+ j = stack.pop()
99
+ order.append(j)
100
+ for c in reversed(_SMPL_CHILDREN[j]):
101
+ stack.append(c)
102
+ return order
103
+
104
+
105
+ SMPL_DFS = _smpl_dfs()
106
+
107
+ # ══════════════════════════════════════════════════════════════════════════════
108
+ # Quaternion helpers (numpy, WXYZ)
109
+ # ══════════════════════════════════════════════════════════════════════════════
110
+
111
+ def qnorm(q: np.ndarray) -> np.ndarray:
112
+ return q / (np.linalg.norm(q, axis=-1, keepdims=True) + 1e-9)
113
+
114
+
115
+ def qmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
116
+ aw, ax, ay, az = a[..., 0], a[..., 1], a[..., 2], a[..., 3]
117
+ bw, bx, by, bz = b[..., 0], b[..., 1], b[..., 2], b[..., 3]
118
+ return np.stack([
119
+ aw*bw - ax*bx - ay*by - az*bz,
120
+ aw*bx + ax*bw + ay*bz - az*by,
121
+ aw*by - ax*bz + ay*bw + az*bx,
122
+ aw*bz + ax*by - ay*bx + az*bw,
123
+ ], axis=-1)
124
+
125
+
126
+ def qinv(q: np.ndarray) -> np.ndarray:
127
+ return q * np.array([1, -1, -1, -1], dtype=np.float32)
128
+
129
+
130
+ def qrot(q: np.ndarray, v: np.ndarray) -> np.ndarray:
131
+ vq = np.concatenate([np.zeros((*v.shape[:-1], 1), dtype=v.dtype), v], axis=-1)
132
+ return qmul(qmul(q, vq), qinv(q))[..., 1:]
133
+
134
+
135
+ def qbetween(a: np.ndarray, b: np.ndarray) -> np.ndarray:
136
+ """Swing quaternion rotating unit-vectors a to b. [..., 3] to [..., 4]."""
137
+ a = a / (np.linalg.norm(a, axis=-1, keepdims=True) + 1e-9)
138
+ b = b / (np.linalg.norm(b, axis=-1, keepdims=True) + 1e-9)
139
+ dot = np.clip((a * b).sum(axis=-1, keepdims=True), -1.0, 1.0)
140
+ cross = np.cross(a, b)
141
+ w = np.sqrt(np.maximum((1.0 + dot) * 0.5, 0.0))
142
+ xyz = cross / (2.0 * w + 1e-9)
143
+ anti = (dot[..., 0] < -0.9999)
144
+ if anti.any():
145
+ perp = np.where(
146
+ np.abs(a[anti, 0:1]) < 0.9,
147
+ np.tile([1, 0, 0], (anti.sum(), 1)),
148
+ np.tile([0, 1, 0], (anti.sum(), 1)),
149
+ ).astype(np.float32)
150
+ ax_f = np.cross(a[anti], perp)
151
+ ax_f = ax_f / (np.linalg.norm(ax_f, axis=-1, keepdims=True) + 1e-9)
152
+ w[anti] = 0.0
153
+ xyz[anti] = ax_f
154
+ return qnorm(np.concatenate([w, xyz], axis=-1))
155
+
156
+
157
+ def quat_to_euler_ZXY(q: np.ndarray) -> np.ndarray:
158
+ """WXYZ quaternions to ZXY Euler degrees (rz, rx, ry) for BVH."""
159
+ w, x, y, z = q[..., 0], q[..., 1], q[..., 2], q[..., 3]
160
+ sin_x = np.clip(2.0*(w*x - y*z), -1.0, 1.0)
161
+ return np.stack([
162
+ np.degrees(np.arctan2(2.0*(w*z + x*y), 1.0 - 2.0*(x*x + z*z))),
163
+ np.degrees(np.arcsin(sin_x)),
164
+ np.degrees(np.arctan2(2.0*(w*y + x*z), 1.0 - 2.0*(x*x + y*y))),
165
+ ], axis=-1)
166
+
167
+ # ══════════════════════════════════════════════════════════════════════════════
168
+ # HumanML3D 263-dim recovery
169
+ #
170
+ # Layout per frame:
171
+ # [0] root Y-axis angular velocity (rad/frame)
172
+ # [1] root height Y (m)
173
+ # [2:4] root XZ velocity in local frame
174
+ # [4:67] local positions of joints 1-21 (21 x 3 = 63)
175
+ # [67:193] 6-D rotations for joints 1-21 (21 x 6 = 126, unused here)
176
+ # [193:259] joint velocities (22 x 3 = 66, unused here)
177
+ # [259:263] foot contact (4, unused here)
178
+ # ══════════════════════════════════════════════════════════════════════════════
179
+
180
+ def _recover_root(data: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
181
+ T = data.shape[0]
182
+ theta = np.cumsum(data[:, 0])
183
+ half = theta * 0.5
184
+ r_rot = np.zeros((T, 4), dtype=np.float32)
185
+ r_rot[:, 0] = np.cos(half) # W
186
+ r_rot[:, 2] = np.sin(half) # Y
187
+ vel_local = np.stack([data[:, 2], np.zeros(T, dtype=np.float32), data[:, 3]], -1)
188
+ vel_world = qrot(r_rot, vel_local)
189
+ r_pos = np.zeros((T, 3), dtype=np.float32)
190
+ r_pos[:, 0] = np.cumsum(vel_world[:, 0])
191
+ r_pos[:, 1] = data[:, 1]
192
+ r_pos[:, 2] = np.cumsum(vel_world[:, 2])
193
+ return r_rot, r_pos
194
+
195
+
196
+ def recover_from_ric(data: np.ndarray, joints_num: int = 22) -> np.ndarray:
197
+ """263-dim features to world-space positions [T, joints_num, 3]."""
198
+ data = data.astype(np.float32)
199
+ r_rot, r_pos = _recover_root(data)
200
+ loc = data[:, 4:4 + (joints_num-1)*3].reshape(-1, joints_num-1, 3)
201
+ rinv = np.broadcast_to(qinv(r_rot)[:, None], (*loc.shape[:2], 4)).copy()
202
+ wloc = qrot(rinv, loc) + r_pos[:, None]
203
+ return np.concatenate([r_pos[:, None], wloc], axis=1)
204
+
205
+ # ══════════════════════════════════════════════════════════════════════════════
206
+ # SMPL geometry helpers
207
+ # ══════════════════════════════════════════════════════════════════════════════
208
+
209
+ def _scale_smpl_tpose(positions: np.ndarray) -> np.ndarray:
210
+ data_h = positions[:, :, 1].max() - positions[:, :, 1].min()
211
+ ref_h = SMPL_TPOSE[:, 1].max() - SMPL_TPOSE[:, 1].min()
212
+ scale = (data_h / ref_h) if (ref_h > 1e-6 and data_h > 1e-6) else 1.0
213
+ return SMPL_TPOSE * scale
214
+
215
+
216
+ def _rest_dirs(tpose: np.ndarray, children: list[list[int]],
217
+ parent: list[int]) -> np.ndarray:
218
+ N = tpose.shape[0]
219
+ dirs = np.zeros((N, 3), dtype=np.float32)
220
+ for j in range(N):
221
+ ch = children[j]
222
+ if ch:
223
+ avg = np.stack([tpose[c] - tpose[j] for c in ch]).mean(0)
224
+ dirs[j] = avg / (np.linalg.norm(avg) + 1e-9)
225
+ else:
226
+ v = tpose[j] - tpose[parent[j]]
227
+ dirs[j] = v / (np.linalg.norm(v) + 1e-9)
228
+ return dirs
229
+
230
+
231
+ def positions_to_local_quats(positions: np.ndarray,
232
+ tpose: np.ndarray) -> np.ndarray:
233
+ """World-space joint positions [T, 22, 3] to local quaternions [T, 22, 4]."""
234
+ T = positions.shape[0]
235
+ rd = _rest_dirs(tpose, _SMPL_CHILDREN, SMPL_PARENT)
236
+
237
+ world_q = np.zeros((T, NUM_SMPL, 4), dtype=np.float32)
238
+ world_q[:, :, 0] = 1.0
239
+
240
+ for j in range(NUM_SMPL):
241
+ ch = _SMPL_CHILDREN[j]
242
+ if ch:
243
+ vecs = np.stack([positions[:, c] - positions[:, j] for c in ch], 1).mean(1)
244
+ else:
245
+ vecs = positions[:, j] - positions[:, SMPL_PARENT[j]]
246
+ cur = vecs / (np.linalg.norm(vecs, axis=-1, keepdims=True) + 1e-9)
247
+ rd_b = np.broadcast_to(rd[j], cur.shape).copy()
248
+ world_q[:, j] = qbetween(rd_b, cur)
249
+
250
+ local_q = np.zeros_like(world_q)
251
+ local_q[:, :, 0] = 1.0
252
+ for j in SMPL_DFS:
253
+ p = SMPL_PARENT[j]
254
+ if p < 0:
255
+ local_q[:, j] = world_q[:, j]
256
+ else:
257
+ local_q[:, j] = qmul(qinv(world_q[:, p]), world_q[:, j])
258
+
259
+ return qnorm(local_q)
260
+
261
+ # ══════════════════════════════════════════════════════════════════════════════
262
+ # UniRig skeleton data structure
263
+ # ══════════════════════════════════════════════════════════════════════════════
264
+
265
+ @dataclass
266
+ class Bone:
267
+ name: str
268
+ parent: Optional[str]
269
+ world_rest_pos: np.ndarray
270
+ children: list[str] = field(default_factory=list)
271
+ smpl_idx: Optional[int] = None
272
+
273
+
274
+ class UnirigSkeleton:
275
+ def __init__(self, bones: dict[str, Bone]):
276
+ self.bones = bones
277
+ self.root = next(b for b in bones.values() if b.parent is None)
278
+
279
+ def dfs_order(self) -> list[str]:
280
+ order, stack = [], [self.root.name]
281
+ while stack:
282
+ n = stack.pop()
283
+ order.append(n)
284
+ for c in reversed(self.bones[n].children):
285
+ stack.append(c)
286
+ return order
287
+
288
+ def local_offsets(self) -> dict[str, np.ndarray]:
289
+ offsets = {}
290
+ for name, bone in self.bones.items():
291
+ if bone.parent is None:
292
+ offsets[name] = bone.world_rest_pos.copy()
293
+ else:
294
+ offsets[name] = bone.world_rest_pos - self.bones[bone.parent].world_rest_pos
295
+ return offsets
296
+
297
+ def rest_direction(self, name: str) -> np.ndarray:
298
+ bone = self.bones[name]
299
+ if bone.children:
300
+ vecs = np.stack([self.bones[c].world_rest_pos - bone.world_rest_pos
301
+ for c in bone.children])
302
+ avg = vecs.mean(0)
303
+ return avg / (np.linalg.norm(avg) + 1e-9)
304
+ if bone.parent is None:
305
+ return np.array([0, 1, 0], dtype=np.float32)
306
+ v = bone.world_rest_pos - self.bones[bone.parent].world_rest_pos
307
+ return v / (np.linalg.norm(v) + 1e-9)
308
+
309
+ # ══════════════════════════════════════════════════════════════════════════════
310
+ # GLB skeleton parser
311
+ # ══════════════════════════════════════════════════════════════════════════════
312
+
313
+ def parse_glb_skeleton(path: str) -> UnirigSkeleton:
314
+ """Extract skeleton from a UniRig-rigged GLB (uses pygltflib)."""
315
+ try:
316
+ import pygltflib
317
+ except ImportError:
318
+ sys.exit("[ERROR] pygltflib not installed. pip install pygltflib")
319
+
320
+ import base64
321
+
322
+ gltf = pygltflib.GLTF2().load(path)
323
+ if not gltf.skins:
324
+ sys.exit(f"[ERROR] No skin found in {path}")
325
+
326
+ skin = gltf.skins[0]
327
+ joint_indices = skin.joints
328
+
329
+ def get_buffer_bytes(buf_idx: int) -> bytes:
330
+ buf = gltf.buffers[buf_idx]
331
+ if buf.uri is None:
332
+ return bytes(gltf.binary_blob())
333
+ if buf.uri.startswith("data:"):
334
+ return base64.b64decode(buf.uri.split(",", 1)[1])
335
+ return (Path(path).parent / buf.uri).read_bytes()
336
+
337
+ def read_accessor(acc_idx: int) -> np.ndarray:
338
+ acc = gltf.accessors[acc_idx]
339
+ bv = gltf.bufferViews[acc.bufferView]
340
+ raw = get_buffer_bytes(bv.buffer)
341
+ COMP = {5120: ('b',1), 5121: ('B',1), 5122: ('h',2),
342
+ 5123: ('H',2), 5125: ('I',4), 5126: ('f',4)}
343
+ DIMS = {"SCALAR":1,"VEC2":2,"VEC3":3,"VEC4":4,"MAT2":4,"MAT3":9,"MAT4":16}
344
+ fmt, sz = COMP[acc.componentType]
345
+ dim = DIMS[acc.type]
346
+ start = (bv.byteOffset or 0) + (acc.byteOffset or 0)
347
+ stride = bv.byteStride
348
+ if stride is None or stride == 0 or stride == sz * dim:
349
+ chunk = raw[start: start + acc.count * sz * dim]
350
+ return np.frombuffer(chunk, dtype=fmt).reshape(acc.count, dim).astype(np.float32)
351
+ rows = []
352
+ for i in range(acc.count):
353
+ off = start + i * stride
354
+ rows.append(np.frombuffer(raw[off: off + sz * dim], dtype=fmt))
355
+ return np.stack(rows).astype(np.float32)
356
+
357
+ ibm = read_accessor(skin.inverseBindMatrices).reshape(-1, 4, 4)
358
+ joint_set = set(joint_indices)
359
+ ni_name = {ni: (gltf.nodes[ni].name or f"bone_{ni}") for ni in joint_indices}
360
+
361
+ bones: dict[str, Bone] = {}
362
+ for i, ni in enumerate(joint_indices):
363
+ name = ni_name[ni]
364
+ world_mat = np.linalg.inv(ibm[i])
365
+ bones[name] = Bone(name=name, parent=None,
366
+ world_rest_pos=world_mat[:3, 3].astype(np.float32))
367
+
368
+ for ni in joint_indices:
369
+ for ci in (gltf.nodes[ni].children or []):
370
+ if ci in joint_set:
371
+ p, c = ni_name[ni], ni_name[ci]
372
+ bones[c].parent = p
373
+ bones[p].children.append(c)
374
+
375
+ print(f"[GLB] {len(bones)} bones from skin '{gltf.skins[0].name or 'Armature'}'")
376
+ return UnirigSkeleton(bones)
377
+
378
+ # ══════════════════════════════════════════════════════════════════════════════
379
+ # ASCII FBX skeleton parser
380
+ # ══════════════════════════════════════════════════════════════════════════════
381
+
382
+ def parse_fbx_ascii_skeleton(path: str) -> UnirigSkeleton:
383
+ """Parse ASCII-format FBX for LimbNode / Root bones."""
384
+ raw = Path(path).read_bytes()
385
+ if raw[:4] == b"Kayd":
386
+ sys.exit(
387
+ f"[ERROR] {path} is binary FBX.\n"
388
+ "Convert to GLB first, e.g.:\n"
389
+ " gltf-pipeline -i rigged.fbx -o rigged.glb"
390
+ )
391
+ text = raw.decode("utf-8", errors="replace")
392
+
393
+ model_pat = re.compile(
394
+ r'Model:\s*(\d+),\s*"Model::([^"]+)",\s*"(LimbNode|Root|Null)"'
395
+ r'.*?Properties70:\s*\{(.*?)\}',
396
+ re.DOTALL
397
+ )
398
+ trans_pat = re.compile(
399
+ r'P:\s*"Lcl Translation".*?(-?[\d.e+\-]+),\s*(-?[\d.e+\-]+),\s*(-?[\d.e+\-]+)'
400
+ )
401
+
402
+ uid_name: dict[str, str] = {}
403
+ uid_local: dict[str, np.ndarray] = {}
404
+
405
+ for m in model_pat.finditer(text):
406
+ uid, name = m.group(1), m.group(2)
407
+ uid_name[uid] = name
408
+ tm = trans_pat.search(m.group(4))
409
+ uid_local[uid] = (np.array([float(tm.group(i)) for i in (1,2,3)], dtype=np.float32)
410
+ if tm else np.zeros(3, dtype=np.float32))
411
+
412
+ if not uid_name:
413
+ sys.exit("[ERROR] No LimbNode/Root bones found in FBX")
414
+
415
+ conn_pat = re.compile(r'C:\s*"OO",\s*(\d+),\s*(\d+)')
416
+ uid_parent: dict[str, str] = {}
417
+ for m in conn_pat.finditer(text):
418
+ child, par = m.group(1), m.group(2)
419
+ if child in uid_name and par in uid_name:
420
+ uid_parent[child] = par
421
+
422
+ # Detect cm vs m
423
+ all_y = np.array([t[1] for t in uid_local.values()])
424
+ scale = 0.01 if all_y.max() > 10.0 else 1.0
425
+ if scale != 1.0:
426
+ print(f"[FBX] Centimetre units detected — scaling by {scale}")
427
+ for uid in uid_local:
428
+ uid_local[uid] *= scale
429
+
430
+ # Accumulate world translations (topological order)
431
+ def topo(uid_to_par):
432
+ visited, order = set(), []
433
+ def visit(u):
434
+ if u in visited: return
435
+ visited.add(u)
436
+ if u in uid_to_par: visit(uid_to_par[u])
437
+ order.append(u)
438
+ for u in uid_to_par: visit(u)
439
+ for u in uid_name:
440
+ if u not in visited: order.append(u)
441
+ return order
442
+
443
+ world: dict[str, np.ndarray] = {}
444
+ for uid in topo(uid_parent):
445
+ loc = uid_local.get(uid, np.zeros(3, dtype=np.float32))
446
+ world[uid] = (world.get(uid_parent[uid], np.zeros(3, dtype=np.float32)) + loc
447
+ if uid in uid_parent else loc.copy())
448
+
449
+ bones: dict[str, Bone] = {}
450
+ for uid, name in uid_name.items():
451
+ bones[name] = Bone(name=name, parent=None, world_rest_pos=world[uid])
452
+
453
+ for uid, p_uid in uid_parent.items():
454
+ c, p = uid_name[uid], uid_name[p_uid]
455
+ bones[c].parent = p
456
+ if c not in bones[p].children:
457
+ bones[p].children.append(c)
458
+
459
+ print(f"[FBX] {len(bones)} bones parsed from ASCII FBX")
460
+ return UnirigSkeleton(bones)
461
+
462
+ # ══════════════════════════════════════════════════════════════════════════════
463
+ # Auto bone mapping: UniRig bones to SMPL joints
464
+ # ══════════════════════════════════════════════════════════════════════════════
465
+
466
+ # Keyword table: normalised name fragments -> SMPL joint index
467
+ _NAME_HINTS: list[tuple[list[str], int]] = [
468
+ (["hips","pelvis","root","hip"], 0),
469
+ (["leftupleg","l_upleg","lupleg","leftthigh","lefthip",
470
+ "left_upper_leg","l_thigh","thigh_l","upperleg_l","j_bip_l_upperleg"], 1),
471
+ (["rightupleg","r_upleg","rupleg","rightthigh","righthip",
472
+ "right_upper_leg","r_thigh","thigh_r","upperleg_r","j_bip_r_upperleg"], 2),
473
+ (["spine","spine0","spine_01","j_bip_c_spine"], 3),
474
+ (["leftleg","leftknee","l_leg","lleg","leftlowerleg",
475
+ "left_lower_leg","lowerleg_l","knee_l","j_bip_l_lowerleg"], 4),
476
+ (["rightleg","rightknee","r_leg","rleg","rightlowerleg",
477
+ "right_lower_leg","lowerleg_r","knee_r","j_bip_r_lowerleg"], 5),
478
+ (["spine1","spine_02","j_bip_c_spine1"], 6),
479
+ (["leftfoot","left_foot","l_foot","lfoot","foot_l","j_bip_l_foot"], 7),
480
+ (["rightfoot","right_foot","r_foot","rfoot","foot_r","j_bip_r_foot"], 8),
481
+ (["spine2","spine_03","j_bip_c_spine2","chest"], 9),
482
+ (["lefttoebase","lefttoe","l_toe","ltoe","toe_l"], 10),
483
+ (["righttoebase","righttoe","r_toe","rtoe","toe_r"], 11),
484
+ (["neck","j_bip_c_neck"], 12),
485
+ (["leftshoulder","leftcollar","l_shoulder","leftclavicle",
486
+ "clavicle_l","j_bip_l_shoulder"], 13),
487
+ (["rightshoulder","rightcollar","r_shoulder","rightclavicle",
488
+ "clavicle_r","j_bip_r_shoulder"], 14),
489
+ (["head","j_bip_c_head"], 15),
490
+ (["leftarm","leftupper","l_arm","larm","leftupperarm",
491
+ "upperarm_l","j_bip_l_upperarm"], 16),
492
+ (["rightarm","rightupper","r_arm","rarm","rightupperarm",
493
+ "upperarm_r","j_bip_r_upperarm"], 17),
494
+ (["leftforearm","leftlower","l_forearm","lforearm",
495
+ "lowerarm_l","j_bip_l_lowerarm"], 18),
496
+ (["rightforearm","rightlower","r_forearm","rforearm",
497
+ "lowerarm_r","j_bip_r_lowerarm"], 19),
498
+ (["lefthand","l_hand","lhand","hand_l","j_bip_l_hand"], 20),
499
+ (["righthand","r_hand","rhand","hand_r","j_bip_r_hand"], 21),
500
+ ]
501
+
502
+
503
+ def _strip_name(name: str) -> str:
504
+ """Remove common rig namespace prefixes, then lower-case, remove separators."""
505
+ name = re.sub(r'^(mixamorig:|j_bip_[lcr]_|cc_base_|bip01_|rig:|chr:)',
506
+ "", name, flags=re.IGNORECASE)
507
+ return re.sub(r'[_\-\s.]', "", name).lower()
508
+
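To illustrate the normalisation: the two regex passes collapse rig-specific prefixes and separators onto a single lookup key. A standalone copy of the same logic:

```python
import re

def strip_name(name: str) -> str:
    # Same two-step normalisation as _strip_name above:
    # 1) drop known rig namespace prefixes, 2) drop separators and lower-case.
    name = re.sub(r'^(mixamorig:|j_bip_[lcr]_|cc_base_|bip01_|rig:|chr:)',
                  "", name, flags=re.IGNORECASE)
    return re.sub(r'[_\-\s.]', "", name).lower()

print(strip_name("mixamorig:LeftUpLeg"))  # leftupleg
print(strip_name("Upper Leg.L"))          # upperlegl
```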
509
+
510
+ def _normalise_positions(pos: np.ndarray) -> np.ndarray:
511
+ """Normalise [N, 3] to [0,1] in Y, [-1,1] in X and Z."""
512
+ y_min, y_max = pos[:, 1].min(), pos[:, 1].max()
513
+ h = (y_max - y_min) or 1.0
514
+ xr = (pos[:, 0].max() - pos[:, 0].min()) or 1.0
515
+ zr = (pos[:, 2].max() - pos[:, 2].min()) or 1.0
516
+ out = pos.copy()
517
+ out[:, 0] /= xr
518
+ out[:, 1] = (out[:, 1] - y_min) / h
519
+ out[:, 2] /= zr
520
+ return out
521
+
522
+
523
+ def auto_map(skel: UnirigSkeleton, verbose: bool = True) -> None:
524
+ """
525
+ Assign skel.bones[name].smpl_idx for each UniRig bone that best matches
526
+ an SMPL joint. Score = 0.6 * position_proximity + 0.4 * name_hint.
527
+ Greedy: each SMPL joint taken by at most one UniRig bone.
528
+ Bones with combined score < 0.35 are left unmapped (identity in BVH).
529
+ """
530
+ names = list(skel.bones.keys())
531
+ ur_pos = np.stack([skel.bones[n].world_rest_pos for n in names]) # [M, 3]
532
+
533
+ # Scale SMPL T-pose to match UniRig height
534
+ ur_h = ur_pos[:, 1].max() - ur_pos[:, 1].min()
535
+ sm_h = SMPL_TPOSE[:, 1].max() - SMPL_TPOSE[:, 1].min()
536
+ sm_pos = SMPL_TPOSE * ((ur_h / sm_h) if sm_h > 1e-6 else 1.0)
537
+
538
+ all_norm = _normalise_positions(np.concatenate([ur_pos, sm_pos]))
539
+ ur_norm = all_norm[:len(names)]
540
+ sm_norm = all_norm[len(names):]
541
+
542
+ # Distance score [M, 22]
543
+ dist = np.linalg.norm(ur_norm[:, None] - sm_norm[None], axis=-1)
544
+ dist_sc = 1.0 - np.clip(dist / (dist.max() + 1e-9), 0, 1)
545
+
546
+ # Name score [M, 22]
547
+ norm_names = [_strip_name(n) for n in names]
548
+ name_sc = np.array(
549
+ [[1.0 if norm in kws else 0.0
550
+ for kws, _ in _NAME_HINTS]
551
+ for norm in norm_names],
552
+ dtype=np.float32,
553
+ ) # [M, 22]
554
+
555
+ combined = 0.6 * dist_sc + 0.4 * name_sc # [M, 22]
556
+
557
+ # Greedy assignment
558
+ THRESHOLD = 0.35
559
+ taken_smpl: set[int] = set()
560
+ pairs = sorted(
561
+ ((i, j, combined[i, j])
562
+ for i in range(len(names)) for j in range(NUM_SMPL)),
563
+ key=lambda x: -x[2],
564
+ )
565
+ for bi, si, score in pairs:
566
+ if score < THRESHOLD:
567
+ break
568
+ name = names[bi]
569
+ if skel.bones[name].smpl_idx is not None or si in taken_smpl:
570
+ continue
571
+ skel.bones[name].smpl_idx = si
572
+ taken_smpl.add(si)
573
+
574
+ if verbose:
575
+ mapped = [(n, b.smpl_idx) for n, b in skel.bones.items() if b.smpl_idx is not None]
576
+ unmapped = [n for n, b in skel.bones.items() if b.smpl_idx is None]
577
+ print(f"\n[MAP] {len(mapped)}/{len(skel.bones)} bones mapped to SMPL joints:")
578
+ for ur_name, si in sorted(mapped, key=lambda x: x[1]):
579
+ sc = combined[names.index(ur_name), si]
580
+ print(f" {ur_name:40s} -> {SMPL_NAMES[si]:16s} score={sc:.2f}")
581
+ if unmapped:
582
+ print(f"[MAP] {len(unmapped)} unmapped (identity rotation): "
583
+ + ", ".join(unmapped[:8])
584
+ + (" ..." if len(unmapped) > 8 else ""))
585
+ print()
586
+
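The greedy pass in `auto_map` can be demonstrated on a toy score matrix (values invented; in the real function the matrix is `0.6 * dist_sc + 0.4 * name_sc`):

```python
import numpy as np

# Toy 3x3 score matrix (bones x joints). Greedy: sort all (bone, joint)
# pairs by score descending, take each bone and joint at most once,
# stop once scores fall below the threshold.
scores = np.array([
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.3],
    [0.1, 0.6, 0.2],
])
THRESHOLD = 0.35
assigned = {}   # bone index -> joint index
taken = set()
pairs = sorted(((i, j, scores[i, j]) for i in range(3) for j in range(3)),
               key=lambda x: -x[2])
for bi, si, sc in pairs:
    if sc < THRESHOLD:
        break
    if bi in assigned or si in taken:
        continue
    assigned[bi] = si
    taken.add(si)

print(assigned)  # {0: 0, 1: 1}
# Bone 2 stays unmapped: its best joint (1) was already taken and its
# remaining scores are below the threshold, mirroring the "identity
# rotation" fallback in the BVH writer.
```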
587
+ # ══════════════════════════════════════════════════════════════════════════════
588
+ # BVH writers
589
+ # ══════════════════════════════════════════════════════════════════════════════
590
+
591
+ def _smpl_offsets(tpose: np.ndarray) -> np.ndarray:
592
+ offsets = np.zeros_like(tpose)
593
+ for j, p in enumerate(SMPL_PARENT):
594
+ offsets[j] = tpose[j] if p < 0 else tpose[j] - tpose[p]
595
+ return offsets
596
+
597
+
598
+ def write_bvh_smpl(output_path: str, positions: np.ndarray, fps: int = 20) -> None:
599
+ """BVH with standard SMPL bone names (no rig file needed)."""
600
+ T = positions.shape[0]
601
+ tpose = _scale_smpl_tpose(positions)
602
+ offsets = _smpl_offsets(tpose)
603
+ tp_w = tpose + (positions[0, 0] - tpose[0])
604
+ local_q = positions_to_local_quats(positions, tp_w)
605
+ euler = quat_to_euler_ZXY(local_q)
606
+
607
+ with open(output_path, "w") as f:
608
+ f.write("HIERARCHY\n")
609
+
610
+ def wj(j, ind):
611
+ off = offsets[j]
612
+ f.write(f"{'ROOT' if SMPL_PARENT[j]<0 else ind+'JOINT'} {SMPL_NAMES[j]}\n")
613
+ f.write(f"{ind}{{\n")
614
+ f.write(f"{ind}\tOFFSET {off[0]:.6f} {off[1]:.6f} {off[2]:.6f}\n")
615
+ if SMPL_PARENT[j] < 0:
616
+ f.write(f"{ind}\tCHANNELS 6 Xposition Yposition Zposition "
617
+ "Zrotation Xrotation Yrotation\n")
618
+ else:
619
+ f.write(f"{ind}\tCHANNELS 3 Zrotation Xrotation Yrotation\n")
620
+ for c in _SMPL_CHILDREN[j]:
621
+ wj(c, ind + "\t")
622
+ if not _SMPL_CHILDREN[j]:
623
+ f.write(f"{ind}\tEnd Site\n{ind}\t{{\n"
624
+ f"{ind}\t\tOFFSET 0.000000 0.050000 0.000000\n{ind}\t}}\n")
625
+ f.write(f"{ind}}}\n")
626
+
627
+ wj(0, "")
628
+ f.write(f"MOTION\nFrames: {T}\nFrame Time: {1.0/fps:.8f}\n")
629
+ for t in range(T):
630
+ rp = positions[t, 0]
631
+ row = [f"{rp[0]:.6f}", f"{rp[1]:.6f}", f"{rp[2]:.6f}"]
632
+ for j in SMPL_DFS:
633
+ rz, rx, ry = euler[t, j]
634
+ row += [f"{rz:.6f}", f"{rx:.6f}", f"{ry:.6f}"]
635
+ f.write(" ".join(row) + "\n")
636
+
637
+ print(f"[OK] {T} frames @ {fps} fps -> {output_path} (SMPL skeleton)")
638
+
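For reference, the BVH layout `write_bvh_smpl` emits can be sketched with a single-root skeleton. This is a minimal hand-rolled example of the format (not the function's actual output): a HIERARCHY block with one 6-channel root, then one MOTION row per frame.

```python
import io

frames = [(0.0, 1.0, 0.0), (0.0, 1.1, 0.0)]   # root translations per frame
buf = io.StringIO()
buf.write("HIERARCHY\n")
buf.write("ROOT Pelvis\n{\n")
buf.write("\tOFFSET 0.000000 0.000000 0.000000\n")
buf.write("\tCHANNELS 6 Xposition Yposition Zposition "
          "Zrotation Xrotation Yrotation\n")
buf.write("\tEnd Site\n\t{\n\t\tOFFSET 0.000000 0.050000 0.000000\n\t}\n")
buf.write("}\n")
buf.write(f"MOTION\nFrames: {len(frames)}\nFrame Time: {1.0/20:.8f}\n")
for x, y, z in frames:
    # 3 translation values + 3 rotation values, matching CHANNELS 6
    buf.write(f"{x:.6f} {y:.6f} {z:.6f} 0.000000 0.000000 0.000000\n")

text = buf.getvalue()
```

Every joint contributes exactly as many values per MOTION row as its CHANNELS declaration, which is why the writer and the reader below must agree on channel counts.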
639
+
640
+ def write_bvh_unirig(output_path: str,
641
+ positions: np.ndarray,
642
+ skel: UnirigSkeleton,
643
+ fps: int = 20) -> None:
644
+ """
645
+ BVH using UniRig bone names and hierarchy.
646
+ Mapped bones receive SMPL-derived local rotations with rest-pose correction.
647
+ Unmapped bones (fingers, face bones, etc.) are set to identity.
648
+ """
649
+ T = positions.shape[0]
650
+
651
+ # Compute SMPL local quaternions
652
+ tpose = _scale_smpl_tpose(positions)
653
+ tp_w = tpose + (positions[0, 0] - tpose[0])
654
+ smpl_q = positions_to_local_quats(positions, tp_w) # [T, 22, 4]
655
+ smpl_rd = _rest_dirs(tp_w, _SMPL_CHILDREN, SMPL_PARENT) # [22, 3]
656
+
657
+ # Rest-pose correction quaternions per bone:
658
+ # q_corr = qbetween(unirig_rest_dir, smpl_rest_dir)
659
+ # unirig_local_q = smpl_local_q @ q_corr
660
+ # This ensures: when applied to unirig_rest_dir, the result matches
661
+ # the SMPL animated direction — accounting for any difference in
662
+ # rest-pose bone orientations between the two skeletons.
663
+ corrections: dict[str, np.ndarray] = {}
664
+ for name, bone in skel.bones.items():
665
+ if bone.smpl_idx is None:
666
+ continue
667
+ ur_rd = skel.rest_direction(name).astype(np.float32)
668
+ sm_rd = smpl_rd[bone.smpl_idx].astype(np.float32)
669
+ corrections[name] = qbetween(ur_rd[None], sm_rd[None])[0] # [4]
670
+
671
+ # Scale root translation from SMPL proportions to UniRig proportions
672
+ ur_h = (max(b.world_rest_pos[1] for b in skel.bones.values())
673
+ - min(b.world_rest_pos[1] for b in skel.bones.values()))
674
+ sm_h = tp_w[:, 1].max() - tp_w[:, 1].min()
675
+ pos_sc = (ur_h / sm_h) if sm_h > 1e-6 else 1.0
676
+
677
+ dfs = skel.dfs_order()
678
+ offsets = skel.local_offsets()
679
+
680
+ # Pre-compute euler per bone [T, 3]
681
+ ID_EUL = np.zeros((T, 3), dtype=np.float32)
682
+ bone_euler: dict[str, np.ndarray] = {}
683
+ for name, bone in skel.bones.items():
684
+ if bone.smpl_idx is not None:
685
+ q = smpl_q[:, bone.smpl_idx].copy() # [T, 4]
686
+ c = corrections.get(name)
687
+ if c is not None:
688
+ q = qnorm(qmul(q, np.broadcast_to(c[None], q.shape).copy()))
689
+ bone_euler[name] = quat_to_euler_ZXY(q) # [T, 3]
690
+ else:
691
+ bone_euler[name] = ID_EUL
692
+
693
+ with open(output_path, "w") as f:
694
+ f.write("HIERARCHY\n")
695
+
696
+ def wj(name, ind):
697
+ off = offsets[name]
698
+ bone = skel.bones[name]
699
+ f.write(f"{'ROOT' if bone.parent is None else ind+'JOINT'} {name}\n")
700
+ f.write(f"{ind}{{\n")
701
+ f.write(f"{ind}\tOFFSET {off[0]:.6f} {off[1]:.6f} {off[2]:.6f}\n")
702
+ if bone.parent is None:
703
+ f.write(f"{ind}\tCHANNELS 6 Xposition Yposition Zposition "
704
+ "Zrotation Xrotation Yrotation\n")
705
+ else:
706
+ f.write(f"{ind}\tCHANNELS 3 Zrotation Xrotation Yrotation\n")
707
+ for c in bone.children:
708
+ wj(c, ind + "\t")
709
+ if not bone.children:
710
+ f.write(f"{ind}\tEnd Site\n{ind}\t{{\n"
711
+ f"{ind}\t\tOFFSET 0.000000 0.050000 0.000000\n{ind}\t}}\n")
712
+ f.write(f"{ind}}}\n")
713
+
714
+ wj(skel.root.name, "")
715
+ f.write(f"MOTION\nFrames: {T}\nFrame Time: {1.0/fps:.8f}\n")
716
+
717
+ for t in range(T):
718
+ rp = positions[t, 0] * pos_sc
719
+ row = [f"{rp[0]:.6f}", f"{rp[1]:.6f}", f"{rp[2]:.6f}"]
720
+ for name in dfs:
721
+ rz, rx, ry = bone_euler[name][t]
722
+ row += [f"{rz:.6f}", f"{rx:.6f}", f"{ry:.6f}"]
723
+ f.write(" ".join(row) + "\n")
724
+
725
+ n_mapped = sum(1 for b in skel.bones.values() if b.smpl_idx is not None)
726
+ print(f"[OK] {T} frames @ {fps} fps -> {output_path} "
727
+ f"(UniRig: {n_mapped} driven, {len(skel.bones)-n_mapped} identity)")
728
+
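The rest-pose correction relies on `qbetween` producing the quaternion that rotates one unit vector onto another. A minimal stand-in in (w, x, y, z) convention (the real `qbetween` lives in the motion-math module and may differ in edge-case handling; this sketch is undefined for exactly antiparallel inputs):

```python
import numpy as np

def qbetween(a, b):
    """Quaternion (w, x, y, z) rotating unit vector a onto unit vector b."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    w = 1.0 + float(np.dot(a, b))       # breaks down when a == -b
    xyz = np.cross(a, b)
    q = np.array([w, *xyz])
    return q / np.linalg.norm(q)

ur_rest = np.array([1.0, 0.0, 0.0])   # hypothetical UniRig bone rest direction
sm_rest = np.array([0.0, 1.0, 0.0])   # SMPL rest direction for the same joint
q = qbetween(ur_rest, sm_rest)        # 90 degrees about +Z
```

Composing this correction onto the SMPL local quaternion is what lets the same animated direction be reproduced on a skeleton whose rest bones point elsewhere.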
729
+ # ══════════════════════════════════════════════════════════════════════════════
730
+ # Motion loader
731
+ # ══════════════════════════════════════════════════════════════════════════════
732
+
733
+ def load_motion(npy_path: str) -> tuple[np.ndarray, int]:
734
+ """Return (positions [T, 22, 3], fps). Auto-detects HumanML3D format."""
735
+ data = np.load(npy_path).astype(np.float32)
736
+ print(f"[INFO] {npy_path} shape={data.shape}")
737
+
738
+ if data.ndim == 3 and data.shape[1] == 22 and data.shape[2] == 3:
739
+ print("[INFO] Format: new_joints [T, 22, 3]")
740
+ return data, 20
741
+
742
+ if data.ndim == 2 and data.shape[1] == 263:
743
+ print("[INFO] Format: new_joint_vecs [T, 263]")
744
+ pos = recover_from_ric(data, 22)
745
+ print(f"[INFO] Recovered positions {pos.shape}")
746
+ return pos, 20
747
+
748
+ if data.ndim == 2 and data.shape[1] == 272:
749
+ print("[INFO] Format: 272-dim (30 fps)")
750
+ return recover_from_ric(data[:, :263], 22), 30
751
+
752
+ if (data.ndim == 2 and data.shape[1] == 251) or \
753
+ (data.ndim == 3 and data.shape[1] == 21):
754
+ sys.exit("[ERROR] KIT-ML (21-joint) format not yet supported.")
755
+
756
+ sys.exit(f"[ERROR] Unrecognised shape {data.shape}. "
757
+ "Expected [T,22,3] or [T,263].")
758
+
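The shape dispatch in `load_motion` reduces to a small pure function, sketched here for clarity (format names and default fps values taken from the branches above):

```python
def detect_format(shape: tuple) -> tuple:
    """Mirror of load_motion's shape dispatch: (format name, default fps)."""
    if len(shape) == 3 and shape[1:] == (22, 3):
        return "new_joints", 20
    if len(shape) == 2 and shape[1] == 263:
        return "new_joint_vecs", 20
    if len(shape) == 2 and shape[1] == 272:
        return "272-dim", 30
    raise ValueError(f"unrecognised shape {shape}")

print(detect_format((40, 22, 3)))  # ('new_joints', 20)
print(detect_format((40, 263)))    # ('new_joint_vecs', 20)
print(detect_format((40, 272)))    # ('272-dim', 30)
```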
759
+ # ══════════════════════════════════════════════════════════════════════════════
760
+ # CLI
761
+ # ══════════════════════════════════════════════════════════════════════════════
762
+
763
+ def main() -> None:
764
+ ap = argparse.ArgumentParser(
765
+ description="HumanML3D .npy -> BVH, optionally retargeted to UniRig skeleton",
766
+ formatter_class=argparse.RawDescriptionHelpFormatter,
767
+ epilog="""
768
+ Examples
769
+ python humanml3d_to_bvh.py 000001.npy
770
+ Standard SMPL-named BVH (no rig file needed)
771
+
772
+ python humanml3d_to_bvh.py 000001.npy --rig rigged_mesh.glb
773
+ BVH retargeted to UniRig bone names, auto-mapped by position + name
774
+
775
+ python humanml3d_to_bvh.py 000001.npy --rig rigged_mesh.glb -o anim.bvh --fps 20
776
+
777
+ Supported --rig formats
778
+ .glb / .gltf UniRig merge.sh output (requires: pip install pygltflib)
779
+ .fbx ASCII FBX only (binary FBX: convert to GLB first)
780
+ """)
781
+ ap.add_argument("input", help="HumanML3D .npy motion file")
782
+ ap.add_argument("--rig", default=None,
783
+ help="UniRig-rigged mesh .glb or ASCII .fbx for auto-mapping")
784
+ ap.add_argument("-o", "--output", default=None, help="Output .bvh path")
785
+ ap.add_argument("--fps", type=int, default=0,
786
+ help="Override FPS (default: auto from format)")
787
+ ap.add_argument("--quiet", action="store_true",
788
+ help="Suppress mapping table")
789
+ args = ap.parse_args()
790
+
791
+ inp = Path(args.input)
792
+ out = Path(args.output) if args.output else inp.with_suffix(".bvh")
793
+
794
+ positions, auto_fps = load_motion(str(inp))
795
+ fps = args.fps if args.fps > 0 else auto_fps
796
+
797
+ if args.rig:
798
+ ext = Path(args.rig).suffix.lower()
799
+ if ext in (".glb", ".gltf"):
800
+ skel = parse_glb_skeleton(args.rig)
801
+ elif ext == ".fbx":
802
+ skel = parse_fbx_ascii_skeleton(args.rig)
803
+ else:
804
+ sys.exit(f"[ERROR] Unsupported rig format: {ext} (use .glb or .fbx)")
805
+
806
+ auto_map(skel, verbose=not args.quiet)
807
+ write_bvh_unirig(str(out), positions, skel, fps=fps)
808
+ else:
809
+ write_bvh_smpl(str(out), positions, fps=fps)
810
+
811
+
812
+ if __name__ == "__main__":
813
+ main()
Retarget/io/__init__.py ADDED
@@ -0,0 +1 @@
1
+ """rig_retarget.io — file format readers / writers."""
Retarget/io/bvh.py ADDED
@@ -0,0 +1,216 @@
1
+ """
2
+ io/bvh.py
3
+ BVH (Biovision Hierarchy) reader.
4
+
5
+ Returns an Armature in rest pose plus an iterator / list of frame states.
6
+ Each frame state sets the bone pose_rotation_quat / pose_location on the
7
+ source armature so that retarget.get_bone_ws_quat / get_bone_position_ws
8
+ return the correct world-space values.
9
+ """
10
+ from __future__ import annotations
11
+ import math
12
+ import re
13
+ from typing import Dict, List, Optional, Tuple
14
+ import numpy as np
15
+
16
+ from ..skeleton import Armature, PoseBone
17
+ from ..math3d import translation_matrix, euler_to_quat, quat_identity, vec3
18
+
19
+
20
+ # ---------------------------------------------------------------------------
21
+ # Internal BVH data structures
22
+ # ---------------------------------------------------------------------------
23
+
24
+ class _BVHJoint:
25
+ def __init__(self, name: str):
26
+ self.name = name
27
+ self.offset: np.ndarray = vec3()
28
+ self.channels: List[str] = []
29
+ self.children: List["_BVHJoint"] = []
30
+ self.is_end_site: bool = False
31
+
32
+
33
+ # ---------------------------------------------------------------------------
34
+ # Parser
35
+ # ---------------------------------------------------------------------------
36
+
37
+ def _tokenize(text: str) -> List[str]:
38
+ return re.split(r"[\s]+", text.strip())
39
+
40
+
41
+ def _parse_hierarchy(tokens: List[str], idx: int) -> Tuple[_BVHJoint, int]:
42
+ """Parse one joint block. idx should point to joint name."""
43
+ name = tokens[idx]; idx += 1
44
+ joint = _BVHJoint(name)
45
+ assert tokens[idx] == "{", f"Expected '{{' got '{tokens[idx]}'"
46
+ idx += 1
47
+ while tokens[idx] != "}":
48
+ kw = tokens[idx].upper()
49
+ if kw == "OFFSET":
50
+ joint.offset = np.array([float(tokens[idx+1]), float(tokens[idx+2]), float(tokens[idx+3])])
51
+ idx += 4
52
+ elif kw == "CHANNELS":
53
+ n = int(tokens[idx+1]); idx += 2
54
+ joint.channels = [tokens[idx+i].upper() for i in range(n)]
55
+ idx += n
56
+ elif kw == "JOINT":
57
+ idx += 1
58
+ child, idx = _parse_hierarchy(tokens, idx)
59
+ joint.children.append(child)
60
+ elif kw == "END" and tokens[idx+1].upper() == "SITE":
61
+ # End Site block — just parse and discard
62
+ idx += 2
63
+ assert tokens[idx] == "{"; idx += 1
64
+ while tokens[idx] != "}":
65
+ idx += 1
66
+ idx += 1 # skip '}'
67
+ else:
68
+ idx += 1 # unknown token, skip
69
+ idx += 1 # skip '}'
70
+ return joint, idx
71
+
72
+
73
+ def _collect_joints(joint: _BVHJoint) -> List[_BVHJoint]:
74
+ result = [joint]
75
+ for c in joint.children:
76
+ result.extend(_collect_joints(c))
77
+ return result
78
+
79
+
80
+ # ---------------------------------------------------------------------------
81
+ # Build Armature from BVH hierarchy
82
+ # ---------------------------------------------------------------------------
83
+
84
+ def _build_armature(root_joint: _BVHJoint) -> Armature:
85
+ arm = Armature("BVH_Source")
86
+
87
+ def add_recursive(j: _BVHJoint, parent_name: Optional[str], parent_world: np.ndarray):
88
+ # rest_matrix_local = T(offset) relative to parent
89
+ rest_local = translation_matrix(j.offset)
90
+ bone = PoseBone(j.name, rest_local)
91
+ arm.add_bone(bone, parent_name)
92
+ world = parent_world @ rest_local
93
+ for child in j.children:
94
+ add_recursive(child, j.name, world)
95
+
96
+ add_recursive(root_joint, None, np.eye(4))
97
+ arm.update_fk()
98
+ return arm
99
+
100
+
101
+ # ---------------------------------------------------------------------------
102
+ # Frame application
103
+ # ---------------------------------------------------------------------------
104
+
105
+ _CHANNEL_MAP = {
106
+ "XROTATION": ("rx",), "YROTATION": ("ry",), "ZROTATION": ("rz",),
107
+ "XPOSITION": ("tx",), "YPOSITION": ("ty",), "ZPOSITION": ("tz",),
108
+ }
109
+
110
+
111
+ def _apply_frame(arm: Armature, all_joints: List[_BVHJoint], values: List[float]) -> None:
112
+ """Set bone poses for one BVH frame."""
113
+ vi = 0
114
+ for j in all_joints:
115
+ tx = ty = tz = 0.0
116
+ rx = ry = rz = 0.0
117
+ for ch in j.channels:
118
+ key = _CHANNEL_MAP.get(ch, None)
119
+ if key:
120
+ val = values[vi]
121
+ k = key[0]
122
+ if k == "tx": tx = val
123
+ elif k == "ty": ty = val
124
+ elif k == "tz": tz = val
125
+ elif k == "rx": rx = math.radians(val)
126
+ elif k == "ry": ry = math.radians(val)
127
+ elif k == "rz": rz = math.radians(val)
128
+ vi += 1
129
+
130
+ if j.name not in arm.pose_bones:
131
+ continue
132
+ bone = arm.pose_bones[j.name]
133
+
134
+ # BVH rotation order is specified per channel list; rebuild from order
135
+ rot_channels = [c for c in j.channels if "ROTATION" in c]
136
+ order = "".join(c[0] for c in rot_channels) # e.g. "ZXY"
137
+ angles = {"X": rx, "Y": ry, "Z": rz}
138
+ angle_seq = [angles[a] for a in order]
139
+ bone.pose_rotation_quat = euler_to_quat(*angle_seq, order=order)
140
+
141
+ # Translation — only root joints typically have it
142
+ if tx or ty or tz:
143
+ bone.pose_location = np.array([tx, ty, tz])
144
+
145
+ arm.update_fk()
146
+
147
+
148
+ # ---------------------------------------------------------------------------
149
+ # Public API
150
+ # ---------------------------------------------------------------------------
151
+
152
+ class BVHAnimation:
153
+ """Loaded BVH file. Iterate frames by calling advance(frame_index)."""
154
+
155
+ def __init__(
156
+ self,
157
+ armature: Armature,
158
+ all_joints: List[_BVHJoint],
159
+ frame_data: List[List[float]],
160
+ frame_time: float,
161
+ ):
162
+ self.armature = armature
163
+ self._all_joints = all_joints
164
+ self._frame_data = frame_data
165
+ self.frame_time = frame_time
166
+ self.num_frames = len(frame_data)
167
+
168
+ def apply_frame(self, frame_index: int) -> None:
169
+ """Advance armature to frame_index and update FK."""
170
+ if frame_index < 0 or frame_index >= self.num_frames:
171
+ raise IndexError(f"Frame {frame_index} out of range [0, {self.num_frames})")
172
+ _apply_frame(self.armature, self._all_joints, self._frame_data[frame_index])
173
+
174
+
175
+ def load_bvh(filepath: str) -> BVHAnimation:
176
+ """
177
+ Parse a BVH file.
178
+ Returns BVHAnimation with an Armature ready for retargeting.
179
+ """
180
+ with open(filepath, "r") as f:
181
+ text = f.read()
182
+
183
+ tokens = _tokenize(text)
184
+ idx = 0
185
+
186
+ # Expect HIERARCHY keyword
187
+ while tokens[idx].upper() != "HIERARCHY":
188
+ idx += 1
189
+ idx += 1
190
+
191
+ root_kw = tokens[idx].upper()
192
+ assert root_kw in ("ROOT", "JOINT"), f"Expected ROOT/JOINT, got '{tokens[idx]}'"
193
+ idx += 1
194
+ root_joint, idx = _parse_hierarchy(tokens, idx)
195
+
196
+ # MOTION section
197
+ while tokens[idx].upper() != "MOTION":
198
+ idx += 1
199
+ idx += 1
200
+
201
+ assert tokens[idx].upper() == "FRAMES:"; idx += 1
202
+ num_frames = int(tokens[idx]); idx += 1
203
+ assert tokens[idx].upper() == "FRAME"; assert tokens[idx+1].upper() == "TIME:"; idx += 2
204
+ frame_time = float(tokens[idx]); idx += 1
205
+
206
+ all_joints = _collect_joints(root_joint)
207
+ total_channels = sum(len(j.channels) for j in all_joints)
208
+
209
+ frame_data: List[List[float]] = []
210
+ for _ in range(num_frames):
211
+ row = [float(tokens[idx + k]) for k in range(total_channels)]
212
+ idx += total_channels
213
+ frame_data.append(row)
214
+
215
+ arm = _build_armature(root_joint)
216
+ return BVHAnimation(arm, all_joints, frame_data, frame_time)
Retarget/io/gltf_io.py ADDED
@@ -0,0 +1,316 @@
1
+ """
2
+ io/gltf_io.py
3
+ Load a glTF/GLB skeleton (e.g. UniRig output) into an Armature.
4
+ Write retargeted animation back into a glTF/GLB file.
5
+
6
+ Requires: pip install pygltflib
7
+ """
8
+ from __future__ import annotations
9
+ import base64
10
+ import json
11
+ import struct
12
+ from pathlib import Path
13
+ from typing import Dict, List, Optional, Tuple
14
+ import numpy as np
15
+
16
+ try:
17
+ import pygltflib
18
+ except ImportError:
19
+ raise ImportError("pip install pygltflib")
20
+
21
+ from ..skeleton import Armature, PoseBone
22
+ from ..math3d import (
23
+ quat_identity, quat_normalize, matrix4_to_quat, matrix4_to_trs,
24
+ trs_to_matrix4, vec3,
25
+ )
26
+
27
+
28
+ # ---------------------------------------------------------------------------
29
+ # Helpers
30
+ # ---------------------------------------------------------------------------
31
+
32
+ def _node_local_trs(node: "pygltflib.Node"):
33
+ """Extract TRS from a glTF node. Returns (t[3], r_wxyz[4], s[3])."""
34
+ t = np.array(node.translation or [0.0, 0.0, 0.0])
35
+ r_xyzw = np.array(node.rotation or [0.0, 0.0, 0.0, 1.0])
36
+ s = np.array(node.scale or [1.0, 1.0, 1.0])
37
+ # Convert glTF (x,y,z,w) → our (w,x,y,z)
38
+ r_wxyz = np.array([r_xyzw[3], r_xyzw[0], r_xyzw[1], r_xyzw[2]])
39
+ return t, r_wxyz, s
40
+
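The quaternion reindexing is easy to get backwards, so here it is round-tripped in isolation (glTF stores (x, y, z, w); the math3d helpers use (w, x, y, z)):

```python
import numpy as np

r_xyzw = np.array([0.0, 0.7071, 0.0, 0.7071])   # ~90 deg about Y, glTF order
# glTF (x, y, z, w) -> internal (w, x, y, z)
r_wxyz = np.array([r_xyzw[3], r_xyzw[0], r_xyzw[1], r_xyzw[2]])
# and back again for writing animation channels
back = np.array([r_wxyz[1], r_wxyz[2], r_wxyz[3], r_wxyz[0]])
```

The swap is a pure reindex in both directions; no sign flips or renormalisation are involved.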
41
+
42
+ def _node_local_matrix(node: "pygltflib.Node") -> np.ndarray:
43
+ if node.matrix:
44
+ # glTF stores column-major; convert to row-major
45
+ m = np.array(node.matrix, dtype=float).reshape(4, 4).T
46
+ return m
47
+ t, r, s = _node_local_trs(node)
48
+ return trs_to_matrix4(t, r, s)
49
+
50
+
51
+ def _read_accessor(gltf: "pygltflib.GLTF2", accessor_idx: int) -> np.ndarray:
52
+ """Read a glTF accessor into a numpy array."""
53
+ acc = gltf.accessors[accessor_idx]
54
+ bv = gltf.bufferViews[acc.bufferView]
55
+ buf = gltf.buffers[bv.buffer]
56
+
57
+ # Inline base64 data URI
58
+ if buf.uri and buf.uri.startswith("data:"):
59
+ _, b64 = buf.uri.split(",", 1)
60
+ raw = base64.b64decode(b64)
61
+ elif buf.uri:
62
+ base_dir = Path(gltf._path).parent if hasattr(gltf, "_path") and gltf._path else Path(".")
63
+ raw = (base_dir / buf.uri).read_bytes()
64
+ else:
65
+ # Binary GLB — data stored in gltf.binary_blob
66
+ raw = bytes(gltf.binary_blob())
67
+
68
+ start = bv.byteOffset + (acc.byteOffset or 0)
69
+ count = acc.count
70
+
71
+ type_to_components = {
72
+ "SCALAR": 1, "VEC2": 2, "VEC3": 3, "VEC4": 4,
73
+ "MAT2": 4, "MAT3": 9, "MAT4": 16,
74
+ }
75
+ component_type_to_fmt = {
76
+ 5120: "b", 5121: "B", 5122: "h", 5123: "H",
77
+ 5125: "I", 5126: "f",
78
+ }
79
+ n_comp = type_to_components[acc.type]
80
+ fmt = component_type_to_fmt[acc.componentType]
81
+ item_size = struct.calcsize(fmt) * n_comp
82
+ stride = bv.byteStride or item_size
83
+
84
+ items = []
85
+ for i in range(count):
86
+ offset = start + i * stride
87
+ vals = struct.unpack_from(f"{n_comp}{fmt}", raw, offset)
88
+ items.append(vals)
89
+
90
+ return np.array(items, dtype=float).squeeze()
91
+
92
+
93
+ # ---------------------------------------------------------------------------
94
+ # Load skeleton from glTF
95
+ # ---------------------------------------------------------------------------
96
+
97
+ def load_gltf(filepath: str, skin_index: int = 0) -> Armature:
98
+ """
99
+ Load the first (or specified) skin from a glTF/GLB file into an Armature.
100
+
101
+ The armature world_matrix is set to identity (typical for UniRig output).
102
+ """
103
+ gltf = pygltflib.GLTF2().load(filepath)
104
+ gltf._path = filepath
105
+
106
+ if not gltf.skins:
107
+ raise ValueError(f"No skins found in '{filepath}'")
108
+ skin = gltf.skins[skin_index]
109
+
110
+ # Read inverse bind matrices
111
+ n_joints = len(skin.joints)
112
+ ibm_array: Optional[np.ndarray] = None
113
+ if skin.inverseBindMatrices is not None:
114
+ raw = _read_accessor(gltf, skin.inverseBindMatrices)
115
+ ibm_array = raw.reshape(n_joints, 4, 4)
116
+
117
+ # Compute bind-pose world matrices: world_bind = inv(ibm)
118
+ joint_world_bind: Dict[int, np.ndarray] = {}
119
+ for i, j_idx in enumerate(skin.joints):
120
+ if ibm_array is not None:
121
+ ibm = ibm_array[i].T # glTF column-major → numpy row-major
122
+ joint_world_bind[j_idx] = np.linalg.inv(ibm)
123
+ else:
124
+ # Fallback: compute from FK over node local matrices
125
+ joint_world_bind[j_idx] = np.eye(4)
126
+
127
+ # Build parent map for nodes
128
+ parent_of: Dict[int, Optional[int]] = {}
129
+ for ni, node in enumerate(gltf.nodes):
130
+ for child_idx in (node.children or []):
131
+ parent_of[child_idx] = ni
132
+
133
+ arm = Armature(skin.name or f"Skin_{skin_index}")
134
+
135
+ # Process joints in order (parent always before child in glTF spec)
136
+ joint_set = set(skin.joints)
137
+ processed: Dict[int, str] = {}
138
+
139
+ for i, j_idx in enumerate(skin.joints):
140
+ node = gltf.nodes[j_idx]
141
+ bone_name = node.name or f"joint_{i}"
142
+
143
+ # Find parent joint node
144
+ parent_node_idx = parent_of.get(j_idx)
145
+ parent_bone_name: Optional[str] = None
146
+ while parent_node_idx is not None:
147
+ if parent_node_idx in joint_set:
148
+ parent_bone_name = processed.get(parent_node_idx)
149
+ break
150
+ parent_node_idx = parent_of.get(parent_node_idx)
151
+
152
+ # rest_matrix_local in parent space
153
+ if parent_bone_name and parent_bone_name in processed.values():
154
+ parent_world = joint_world_bind.get(
155
+ next(k for k, v in processed.items() if v == parent_bone_name),
156
+ np.eye(4)
157
+ )
158
+ rest_local = np.linalg.inv(parent_world) @ joint_world_bind[j_idx]
159
+ else:
160
+ rest_local = joint_world_bind[j_idx]
161
+
162
+ bone = PoseBone(bone_name, rest_local)
163
+ arm.add_bone(bone, parent_bone_name)
164
+ processed[j_idx] = bone_name
165
+
166
+ arm.update_fk()
167
+ return arm
168
+
169
+
170
+ # ---------------------------------------------------------------------------
171
+ # Write animation to glTF
172
+ # ---------------------------------------------------------------------------
173
+
174
+ def write_gltf_animation(
175
+ source_filepath: str,
176
+ dest_armature: Armature,
177
+ keyframes: List[Dict[str, Tuple[np.ndarray, np.ndarray, np.ndarray]]],
178
+ output_filepath: str,
179
+ fps: float = 30.0,
180
+ skin_index: int = 0,
181
+ ) -> None:
182
+ """
183
+ Embed animation keyframes into a copy of source_filepath (the UniRig GLB).
184
+
185
+ keyframes: list of dicts, one per frame.
186
+ Each dict maps bone_name → (pose_location, pose_rotation_quat, pose_scale)
187
+ These are LOCAL values (relative to rest pose local matrix).
188
+
189
+ The function adds one glTF Animation with channels for each bone that has data.
190
+ """
191
+ gltf = pygltflib.GLTF2().load(source_filepath)
192
+ gltf._path = source_filepath
193
+
194
+ if not gltf.skins:
195
+ raise ValueError("No skins in source file")
196
+ skin = gltf.skins[skin_index]
197
+
198
+ # Build node_name → node_index map for skin joints
199
+ joint_name_to_node: Dict[str, int] = {}
200
+ for j_idx in skin.joints:
201
+ node = gltf.nodes[j_idx]
202
+ name = node.name or f"joint_{j_idx}"
203
+ joint_name_to_node[name] = j_idx
204
+
205
+ n_frames = len(keyframes)
206
+ times = np.array([i / fps for i in range(n_frames)], dtype=np.float32)
207
+
208
+ # Gather binary data
209
+ binary_chunks: List[bytes] = []
210
212
+
213
+ def _add_data(data: np.ndarray, acc_type: str) -> int:
214
+ """Append numpy array to binary, return accessor index."""
215
+ raw = data.astype(np.float32).tobytes()
216
+ bv_offset = sum(len(c) for c in binary_chunks)
217
+ binary_chunks.append(raw)
218
+ bv_idx = len(gltf.bufferViews)
219
+ gltf.bufferViews.append(pygltflib.BufferView(
220
+ buffer=0,
221
+ byteOffset=bv_offset,
222
+ byteLength=len(raw),
223
+ ))
224
+ acc_idx = len(gltf.accessors)
225
+ gltf.accessors.append(pygltflib.Accessor(
226
+ bufferView=bv_idx,
227
+ componentType=pygltflib.FLOAT,
228
+ count=len(data),
229
+ type=acc_type,
230
+ max=data.max(axis=0).tolist() if data.ndim > 1 else [float(data.max())],
231
+ min=data.min(axis=0).tolist() if data.ndim > 1 else [float(data.min())],
232
+ ))
233
+ return acc_idx
234
+
235
+ time_acc_idx = _add_data(times, "SCALAR")
236
+
237
+ channels: List[pygltflib.AnimationChannel] = []
238
+ samplers: List[pygltflib.AnimationSampler] = []
239
+
240
+ bone_names = set()
241
+ for frame in keyframes:
242
+ bone_names |= frame.keys()
243
+
244
+ for bone_name in sorted(bone_names):
245
+ if bone_name not in joint_name_to_node:
246
+ continue
247
+ node_idx = joint_name_to_node[bone_name]
248
+ node = gltf.nodes[node_idx]
249
+
250
+ # Collect TRS arrays across frames
251
+ rot_data = np.zeros((n_frames, 4), dtype=np.float32) # (x,y,z,w)
252
+ trans_data = np.zeros((n_frames, 3), dtype=np.float32)
253
+ scale_data = np.ones((n_frames, 3), dtype=np.float32)
254
+
255
+ rest_t, rest_r, rest_s = _node_local_trs(node)
256
+
257
+ for fi, frame in enumerate(keyframes):
258
+ if bone_name in frame:
259
+ pose_loc, pose_rot, pose_scale = frame[bone_name]
260
+ else:
261
+ pose_loc = vec3()
262
+ pose_rot = quat_identity()
263
+ pose_scale = np.ones(3)
264
+
265
+ # Final local = rest + delta (simple addition for translation, multiply for rotation)
266
+ from ..math3d import quat_mul
267
+ final_t = rest_t + pose_loc
268
+ final_r = quat_mul(rest_r, pose_rot) # (w,x,y,z)
269
+ final_s = rest_s * pose_scale
270
+
271
+ # Convert rotation to glTF (x,y,z,w)
272
+ w, x, y, z = final_r
273
+ rot_data[fi] = [x, y, z, w]
274
+ trans_data[fi] = final_t
275
+ scale_data[fi] = final_s
276
+
277
+ s_idx = len(samplers)
278
+ rot_acc = _add_data(rot_data, "VEC4")
279
+ samplers.append(pygltflib.AnimationSampler(input=time_acc_idx, output=rot_acc, interpolation="LINEAR"))
280
+ channels.append(pygltflib.AnimationChannel(
281
+ sampler=s_idx,
282
+ target=pygltflib.AnimationChannelTarget(node=node_idx, path="rotation"),
283
+ ))
284
+
285
+ s_idx = len(samplers)
286
+ trans_acc = _add_data(trans_data, "VEC3")
287
+ samplers.append(pygltflib.AnimationSampler(input=time_acc_idx, output=trans_acc, interpolation="LINEAR"))
288
+ channels.append(pygltflib.AnimationChannel(
289
+ sampler=s_idx,
290
+ target=pygltflib.AnimationChannelTarget(node=node_idx, path="translation"),
291
+ ))
292
+
293
+ if not channels:
294
+ print("[gltf_io] Warning: no channels written — check bone name mapping.")
295
+ return
296
+
297
+ gltf.animations.append(pygltflib.Animation(
298
+ name="RetargetedAnimation",
299
+ samplers=samplers,
300
+ channels=channels,
301
+ ))
302
+
303
+ # Patch buffer 0 size with our new data
304
+ new_blob = b"".join(binary_chunks)
305
+ existing_blob = bytes(gltf.binary_blob()) if gltf.binary_blob() else b""
306
+ full_blob = existing_blob + new_blob
307
+
308
+ # Update buffer 0 byteOffset of new views
309
+ for bv in gltf.bufferViews[-len(binary_chunks):]:
310
+ bv.byteOffset += len(existing_blob)
311
+
312
+ gltf.set_binary_blob(full_blob)
313
+ gltf.buffers[0].byteLength = len(full_blob)
314
+
315
+ gltf.save(output_filepath)
316
+ print(f"[gltf_io] Saved animated GLB -> {output_filepath}")
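The buffer-patching step above follows a two-phase bookkeeping scheme: new bufferView offsets are first computed relative to the start of the new data, then shifted by the length of the pre-existing blob once everything is spliced together. A minimal pure-Python sketch of that scheme (no pygltflib; the dicts stand in for BufferView objects):

```python
existing_blob = b"\x00" * 10                 # pretend: mesh data already in buffer 0
binary_chunks = []                           # new animation data chunks
buffer_views = []                            # each: {"byteOffset": ..., "byteLength": ...}

def add_chunk(raw: bytes) -> int:
    # Offset is relative to the start of the *new* data for now.
    buffer_views.append({"byteOffset": sum(len(c) for c in binary_chunks),
                         "byteLength": len(raw)})
    binary_chunks.append(raw)
    return len(buffer_views) - 1

add_chunk(b"\x01" * 8)   # e.g. keyframe times
add_chunk(b"\x02" * 4)   # e.g. rotations

# Splice after the existing blob, then shift the new views accordingly.
full_blob = existing_blob + b"".join(binary_chunks)
for bv in buffer_views[-len(binary_chunks):]:
    bv["byteOffset"] += len(existing_blob)
```

After the shift, the first chunk lands at offset 10 and the second at offset 18, so every view still addresses its own bytes inside the combined blob.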
Retarget/io/mapping.py ADDED
@@ -0,0 +1,189 @@
+ """
+ io/mapping.py
+ Load / save bone mapping JSON in the exact same format as KeeMap.
+ """
+ from __future__ import annotations
+ import json
+ from dataclasses import dataclass, field
+ from typing import List
+ import numpy as np
+ from ..math3d import quat_identity, vec3
+
+
+ @dataclass
+ class BoneMappingItem:
+     name: str = ""
+     label: str = ""
+     description: str = ""
+
+     source_bone_name: str = ""
+     destination_bone_name: str = ""
+
+     keyframe_this_bone: bool = True
+
+     # Rotation correction (Euler, radians)
+     correction_factor: np.ndarray = field(default_factory=lambda: vec3())
+
+     # Quaternion correction
+     quat_correction_factor: np.ndarray = field(default_factory=quat_identity)
+
+     has_twist_bone: bool = False
+     twist_bone_name: str = ""
+
+     set_bone_position: bool = False
+     set_bone_rotation: bool = True
+     set_bone_scale: bool = False
+
+     # Rotation options
+     bone_rotation_application_axis: str = "XYZ"  # X Y Z XY XZ YZ XYZ
+     bone_transpose_axis: str = "NONE"  # NONE ZXY ZYX XZY YZX YXZ
+
+     # Position options
+     postion_type: str = "SINGLE_BONE_OFFSET"  # (sic; matches KeeMap's JSON key) SINGLE_BONE_OFFSET | POLE
+     position_correction_factor: np.ndarray = field(default_factory=lambda: vec3())
+     position_gain: float = 1.0
+     position_pole_distance: float = 0.3
+
+     # Scale options
+     scale_secondary_bone_name: str = ""
+     bone_scale_application_axis: str = "Y"
+     scale_gain: float = 1.0
+     scale_max: float = 1.0
+     scale_min: float = 0.5
+
+
+ @dataclass
+ class KeeMapSettings:
+     source_rig_name: str = ""
+     destination_rig_name: str = ""
+     bone_mapping_file: str = ""
+     bone_rotation_mode: str = "EULER"  # EULER | QUATERNION
+     start_frame_to_apply: int = 0
+     number_of_frames_to_apply: int = 100
+     keyframe_every_n_frames: int = 1
+     keyframe_test: bool = False
+
+
+ # ---------------------------------------------------------------------------
+ # Load
+ # ---------------------------------------------------------------------------
+
+ def load_mapping(filepath: str):
+     """
+     Returns (KeeMapSettings, List[BoneMappingItem]).
+     Reads the exact same JSON that KeeMap writes.
+     """
+     with open(filepath, "r") as f:
+         data = json.load(f)
+
+     settings = KeeMapSettings(
+         source_rig_name=data.get("source_rig_name", ""),
+         destination_rig_name=data.get("destination_rig_name", ""),
+         bone_mapping_file=data.get("bone_mapping_file", ""),
+         bone_rotation_mode=data.get("bone_rotation_mode", "EULER"),
+         start_frame_to_apply=data.get("start_frame_to_apply", 0),
+         number_of_frames_to_apply=data.get("number_of_frames_to_apply", 100),
+         keyframe_every_n_frames=data.get("keyframe_every_n_frames", 1),
+     )
+
+     bones: List[BoneMappingItem] = []
+     for p in data.get("bones", []):
+         item = BoneMappingItem()
+         item.name = p.get("name", "")
+         item.label = p.get("label", "")
+         item.description = p.get("description", "")
+         item.source_bone_name = p.get("SourceBoneName", "")
+         item.destination_bone_name = p.get("DestinationBoneName", "")
+         item.keyframe_this_bone = p.get("keyframe_this_bone", True)
+
+         item.correction_factor = np.array([
+             p.get("CorrectionFactorX", 0.0),
+             p.get("CorrectionFactorY", 0.0),
+             p.get("CorrectionFactorZ", 0.0),
+         ])
+
+         item.quat_correction_factor = np.array([
+             p.get("QuatCorrectionFactorw", 1.0),
+             p.get("QuatCorrectionFactorx", 0.0),
+             p.get("QuatCorrectionFactory", 0.0),
+             p.get("QuatCorrectionFactorz", 0.0),
+         ])
+
+         item.has_twist_bone = p.get("has_twist_bone", False)
+         item.twist_bone_name = p.get("TwistBoneName", "")
+         item.set_bone_position = p.get("set_bone_position", False)
+         item.set_bone_rotation = p.get("set_bone_rotation", True)
+         item.set_bone_scale = p.get("set_bone_scale", False)
+         item.bone_rotation_application_axis = p.get("bone_rotation_application_axis", "XYZ")
+         item.bone_transpose_axis = p.get("bone_transpose_axis", "NONE")
+         item.postion_type = p.get("postion_type", "SINGLE_BONE_OFFSET")
+
+         item.position_correction_factor = np.array([
+             p.get("position_correction_factorX", 0.0),
+             p.get("position_correction_factorY", 0.0),
+             p.get("position_correction_factorZ", 0.0),
+         ])
+         item.position_gain = p.get("position_gain", 1.0)
+         item.position_pole_distance = p.get("position_pole_distance", 0.3)
+         item.scale_secondary_bone_name = p.get("scale_secondary_bone_name", "")
+         item.bone_scale_application_axis = p.get("bone_scale_application_axis", "Y")
+         item.scale_gain = p.get("scale_gain", 1.0)
+         item.scale_max = p.get("scale_max", 1.0)
+         item.scale_min = p.get("scale_min", 0.5)
+         bones.append(item)
+
+     return settings, bones
+
+
+ # ---------------------------------------------------------------------------
+ # Save
+ # ---------------------------------------------------------------------------
+
+ def save_mapping(filepath: str, settings: KeeMapSettings, bones: List[BoneMappingItem]) -> None:
+     """Write mapping JSON readable by KeeMap."""
+     root = {
+         "source_rig_name": settings.source_rig_name,
+         "destination_rig_name": settings.destination_rig_name,
+         "bone_mapping_file": settings.bone_mapping_file,
+         "bone_rotation_mode": settings.bone_rotation_mode,
+         "start_frame_to_apply": settings.start_frame_to_apply,
+         "number_of_frames_to_apply": settings.number_of_frames_to_apply,
+         "keyframe_every_n_frames": settings.keyframe_every_n_frames,
+         "bones": [],
+     }
+     for b in bones:
+         root["bones"].append({
+             "name": b.name,
+             "label": b.label,
+             "description": b.description,
+             "SourceBoneName": b.source_bone_name,
+             "DestinationBoneName": b.destination_bone_name,
+             "keyframe_this_bone": b.keyframe_this_bone,
+             "CorrectionFactorX": float(b.correction_factor[0]),
+             "CorrectionFactorY": float(b.correction_factor[1]),
+             "CorrectionFactorZ": float(b.correction_factor[2]),
+             "QuatCorrectionFactorw": float(b.quat_correction_factor[0]),
+             "QuatCorrectionFactorx": float(b.quat_correction_factor[1]),
+             "QuatCorrectionFactory": float(b.quat_correction_factor[2]),
+             "QuatCorrectionFactorz": float(b.quat_correction_factor[3]),
+             "has_twist_bone": b.has_twist_bone,
+             "TwistBoneName": b.twist_bone_name,
+             "set_bone_position": b.set_bone_position,
+             "set_bone_rotation": b.set_bone_rotation,
+             "set_bone_scale": b.set_bone_scale,
+             "bone_rotation_application_axis": b.bone_rotation_application_axis,
+             "bone_transpose_axis": b.bone_transpose_axis,
+             "postion_type": b.postion_type,
+             "position_correction_factorX": float(b.position_correction_factor[0]),
+             "position_correction_factorY": float(b.position_correction_factor[1]),
+             "position_correction_factorZ": float(b.position_correction_factor[2]),
+             "position_gain": b.position_gain,
+             "position_pole_distance": b.position_pole_distance,
+             "scale_secondary_bone_name": b.scale_secondary_bone_name,
+             "bone_scale_application_axis": b.bone_scale_application_axis,
+             "scale_gain": b.scale_gain,
+             "scale_max": b.scale_max,
+             "scale_min": b.scale_min,
+         })
+     with open(filepath, "w") as f:
+         json.dump(root, f, indent=2)
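The mixed CamelCase/snake_case key names are KeeMap's own, and `load_mapping` tolerates missing keys via `.get(...)` defaults. A self-contained round-trip check of that pattern (the bone names here are illustrative, not from the repo):

```python
import json
import os
import tempfile

bone = {
    "SourceBoneName": "mixamorig:Hips",   # illustrative source bone
    "DestinationBoneName": "Hips",
    "CorrectionFactorX": 0.0,
    # "keyframe_this_bone" deliberately omitted: falls back to its default
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"bone_rotation_mode": "EULER", "bones": [bone]}, f)
    path = f.name

with open(path) as f:
    data = json.load(f)
os.unlink(path)

p = data["bones"][0]
source = p.get("SourceBoneName", "")
keyframe = p.get("keyframe_this_bone", True)   # missing key -> default True
mode = data.get("bone_rotation_mode", "EULER")
```

This mirrors why a partially hand-written mapping file still loads: any field KeeMap did not write simply takes the dataclass default.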
Retarget/math3d.py ADDED
@@ -0,0 +1,167 @@
+ """
+ math3d.py
+ Pure numpy / scipy replacement for Blender's mathutils.
+ Quaternion convention throughout: (w, x, y, z)
+ Matrix convention: 4×4, row-major storage, column-vector (points transform as M @ v)
+ """
+ from __future__ import annotations
+ import numpy as np
+ from scipy.spatial.transform import Rotation
+
+ # ---------------------------------------------------------------------------
+ # Quaternion helpers (w, x, y, z)
+ # ---------------------------------------------------------------------------
+
+ def quat_identity() -> np.ndarray:
+     return np.array([1.0, 0.0, 0.0, 0.0])
+
+
+ def quat_normalize(q: np.ndarray) -> np.ndarray:
+     n = np.linalg.norm(q)
+     return q / n if n > 1e-12 else quat_identity()
+
+
+ def quat_conjugate(q: np.ndarray) -> np.ndarray:
+     """Conjugate == inverse for unit quaternion."""
+     return np.array([q[0], -q[1], -q[2], -q[3]])
+
+
+ def quat_mul(q1: np.ndarray, q2: np.ndarray) -> np.ndarray:
+     """Quaternion multiplication (Blender @ operator)."""
+     w1, x1, y1, z1 = q1
+     w2, x2, y2, z2 = q2
+     return np.array([
+         w1*w2 - x1*x2 - y1*y2 - z1*z2,
+         w1*x2 + x1*w2 + y1*z2 - z1*y2,
+         w1*y2 - x1*z2 + y1*w2 + z1*x2,
+         w1*z2 + x1*y2 - y1*x2 + z1*w2,
+     ])
+
+
+ def quat_rotation_difference(q1: np.ndarray, q2: np.ndarray) -> np.ndarray:
+     """
+     Rotation that takes q1 to q2.
+     r such that q1 @ r == q2
+     r = conj(q1) @ q2
+     Matches Blender's Quaternion.rotation_difference()
+     """
+     return quat_normalize(quat_mul(quat_conjugate(q1), q2))
+
+
+ def quat_dot(q1: np.ndarray, q2: np.ndarray) -> float:
+     """Dot product of two quaternions (used for scale retargeting)."""
+     return float(np.dot(q1, q2))
+
+
+ def quat_to_matrix4(q: np.ndarray) -> np.ndarray:
+     """Unit quaternion (w,x,y,z) → 4×4 rotation matrix."""
+     w, x, y, z = q
+     m = np.array([
+         [1 - 2*(y*y + z*z), 2*(x*y - z*w), 2*(x*z + y*w), 0],
+         [ 2*(x*y + z*w), 1 - 2*(x*x + z*z), 2*(y*z - x*w), 0],
+         [ 2*(x*z - y*w), 2*(y*z + x*w), 1 - 2*(x*x + y*y), 0],
+         [ 0, 0, 0, 1],
+     ], dtype=float)
+     return m
+
+
+ def matrix4_to_quat(m: np.ndarray) -> np.ndarray:
+     """4×4 matrix → unit quaternion (w,x,y,z)."""
+     r = Rotation.from_matrix(m[:3, :3])
+     x, y, z, w = r.as_quat()  # scipy uses (x,y,z,w)
+     q = np.array([w, x, y, z])
+     # Ensure positive w to match Blender convention
+     if q[0] < 0:
+         q = -q
+     return quat_normalize(q)
+
+
+ # ---------------------------------------------------------------------------
+ # Euler ↔ Quaternion
+ # ---------------------------------------------------------------------------
+
+ def euler_to_quat(rx: float, ry: float, rz: float, order: str = "XYZ") -> np.ndarray:
+     """Euler angles (radians) to quaternion (w,x,y,z)."""
+     r = Rotation.from_euler(order, [rx, ry, rz])
+     x, y, z, w = r.as_quat()
+     return quat_normalize(np.array([w, x, y, z]))
+
+
+ def quat_to_euler(q: np.ndarray, order: str = "XYZ") -> np.ndarray:
+     """Quaternion (w,x,y,z) to Euler angles (radians)."""
+     w, x, y, z = q
+     r = Rotation.from_quat([x, y, z, w])
+     return r.as_euler(order)
+
+
+ # ---------------------------------------------------------------------------
+ # Matrix constructors
+ # ---------------------------------------------------------------------------
+
+ def translation_matrix(v) -> np.ndarray:
+     m = np.eye(4)
+     m[0, 3] = v[0]
+     m[1, 3] = v[1]
+     m[2, 3] = v[2]
+     return m
+
+
+ def scale_matrix(s) -> np.ndarray:
+     m = np.eye(4)
+     m[0, 0] = s[0]
+     m[1, 1] = s[1]
+     m[2, 2] = s[2]
+     return m
+
+
+ def trs_to_matrix4(t, r_quat, s) -> np.ndarray:
+     """Combine translation, rotation (w,x,y,z quat), scale into 4×4."""
+     T = translation_matrix(t)
+     R = quat_to_matrix4(r_quat)
+     S = scale_matrix(s)
+     return T @ R @ S
+
+
+ def matrix4_to_trs(m: np.ndarray):
+     """Decompose 4×4 into (translation[3], rotation_quat[4], scale[3])."""
+     t = m[:3, 3].copy()
+     sx = np.linalg.norm(m[:3, 0])
+     sy = np.linalg.norm(m[:3, 1])
+     sz = np.linalg.norm(m[:3, 2])
+     s = np.array([sx, sy, sz])
+     rot_m = m[:3, :3].copy()
+     if sx > 1e-12: rot_m[:, 0] /= sx
+     if sy > 1e-12: rot_m[:, 1] /= sy
+     if sz > 1e-12: rot_m[:, 2] /= sz
+     r = Rotation.from_matrix(rot_m)
+     x, y, z, w = r.as_quat()
+     q = np.array([w, x, y, z])
+     if q[0] < 0:
+         q = -q
+     return t, quat_normalize(q), s
+
+
+ # ---------------------------------------------------------------------------
+ # Vector helpers
+ # ---------------------------------------------------------------------------
+
+ def vec3(x=0.0, y=0.0, z=0.0) -> np.ndarray:
+     return np.array([x, y, z], dtype=float)
+
+
+ def get_point_on_vector(initial_pt: np.ndarray, terminal_pt: np.ndarray, distance: float) -> np.ndarray:
+     """
+     Point at 'distance' from initial_pt along (initial_pt → terminal_pt).
+     Matches Blender's get_point_on_vector helper in KeeMapBoneOperators.
+     """
+     n = initial_pt - terminal_pt
+     norm = np.linalg.norm(n)
+     if norm < 1e-12:
+         return initial_pt.copy()
+     n = n / norm
+     return initial_pt - distance * n
+
+
+ def apply_rotation_matrix4(m: np.ndarray, v: np.ndarray) -> np.ndarray:
+     """Apply only the rotation part of a 4×4 matrix to a 3-vector."""
+     return m[:3, :3] @ v
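The Hamilton product in `quat_mul` composes the same way rotation matrices do, which is what the retargeting math relies on. A small numpy-only sanity check (the helpers are re-declared inline so the snippet stands alone; they mirror `quat_mul` and the rotation block of `quat_to_matrix4`):

```python
import numpy as np

def quat_mul(q1, q2):
    # Hamilton product, (w, x, y, z) convention.
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def quat_to_matrix3(q):
    # Rotation part of quat_to_matrix4 for a unit (w, x, y, z) quaternion.
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])

def axis_angle(axis, angle):
    # Unit quaternion for a rotation of 'angle' radians about 'axis'.
    axis = np.asarray(axis, float) / np.linalg.norm(axis)
    return np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])

qa = axis_angle([0.0, 0.0, 1.0], 0.7)
qb = axis_angle([1.0, 0.0, 0.0], -0.4)

# Composition identity: R(qa * qb) == R(qa) @ R(qb)
lhs = quat_to_matrix3(quat_mul(qa, qb))
rhs = quat_to_matrix3(qa) @ quat_to_matrix3(qb)
ok = np.allclose(lhs, rhs)
```

Note the (w, x, y, z) ordering everywhere: SciPy's `Rotation.as_quat()` returns (x, y, z, w), which is why the module swizzles components at every scipy boundary.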
Retarget/retarget.py ADDED
@@ -0,0 +1,586 @@
+ """
+ retarget.py
+ Pure-Python port of KeeMapBoneOperators.py core math.
+
+ Replaces bpy / mathutils with numpy. No Blender dependency.
+ Public API mirrors the Blender operator flow:
+
+     get_bone_position_ws(bone, arm) → np.ndarray(3)
+     get_bone_ws_quat(bone, arm) → np.ndarray(4) w,x,y,z
+     set_bone_position_ws(bone, arm, pos)
+     set_bone_rotation(...)
+     set_bone_position(...)
+     set_bone_position_pole(...)
+     set_bone_scale(...)
+     calc_rotation_offset(bone_item, src_arm, dst_arm, settings)
+     calc_location_offset(bone_item, src_arm, dst_arm)
+     transfer_frame(src_arm, dst_arm, bone_items, settings)
+         → Dict[bone_name → (pose_loc, pose_rot, pose_scale)]
+     transfer_animation(src_anim, dst_arm, bone_items, settings)
+         → List[Dict[bone_name → (pose_loc, pose_rot, pose_scale)]]
+ """
+ from __future__ import annotations
+ import math
+ import sys
+ from typing import Dict, List, Optional, Tuple
+ import numpy as np
+
+ from .skeleton import Armature, PoseBone
+ from .math3d import (
+     quat_identity, quat_normalize, quat_mul, quat_conjugate,
+     quat_rotation_difference, quat_dot,
+     quat_to_matrix4, matrix4_to_quat,
+     euler_to_quat, quat_to_euler,
+     translation_matrix, vec3, get_point_on_vector,
+ )
+ from .io.mapping import BoneMappingItem, KeeMapSettings
+
+
+ # ---------------------------------------------------------------------------
+ # Progress bar (console)
+ # ---------------------------------------------------------------------------
+
+ def _update_progress(job: str, progress: float) -> None:
+     length = 40
+     block = int(round(length * progress))
+     msg = f"\r{job}: [{'#'*block}{'-'*(length-block)}] {round(progress*100, 1)}%"
+     if progress >= 1:
+         msg += " DONE\r\n"
+     sys.stdout.write(msg)
+     sys.stdout.flush()
+
+
+ # ---------------------------------------------------------------------------
+ # World-space position / quaternion getters
+ # ---------------------------------------------------------------------------
+
+ def get_bone_position_ws(bone: PoseBone, arm: Armature) -> np.ndarray:
+     """
+     Return world-space position of bone head.
+     Equivalent to Blender's GetBonePositionWS().
+     """
+     ws_matrix = arm.world_matrix @ bone.matrix_armature
+     return ws_matrix[:3, 3].copy()
+
+
+ def get_bone_ws_quat(bone: PoseBone, arm: Armature) -> np.ndarray:
+     """
+     Return world-space rotation as quaternion (w,x,y,z).
+     Equivalent to Blender's GetBoneWSQuat().
+     """
+     ws_matrix = arm.world_matrix @ bone.matrix_armature
+     return matrix4_to_quat(ws_matrix)
+
+
+ # ---------------------------------------------------------------------------
+ # World-space position setter
+ # ---------------------------------------------------------------------------
+
+ def set_bone_position_ws(bone: PoseBone, arm: Armature, position: np.ndarray) -> None:
+     """
+     Move bone so its world-space head = position.
+     Equivalent to Blender's SetBonePositionWS().
+
+     Strategy:
+       1. Build new armature-space matrix = old rotation + new translation
+       2. Strip parent transform to get new local translation
+       3. Update pose_location so FK matches
+     """
+     # Current armature-space matrix (rotation/scale part preserved)
+     arm_mat = bone.matrix_armature.copy()
+
+     # Target armature-space position
+     arm_world_inv = np.linalg.inv(arm.world_matrix)
+     target_arm_pos = (arm_world_inv @ np.append(position, 1.0))[:3]
+
+     # New armature-space matrix with replaced translation
+     new_arm_mat = arm_mat.copy()
+     new_arm_mat[:3, 3] = target_arm_pos
+
+     # Convert to local (parent-relative) space
+     if bone.parent is not None:
+         parent_arm_mat = bone.parent.matrix_armature
+         new_local = np.linalg.inv(parent_arm_mat) @ new_arm_mat
+     else:
+         new_local = new_arm_mat
+
+     # Extract translation from new_local = rest_local @ T(pose_loc) @ ...
+     # Approximate: strip rest_local rotation contribution to isolate pose_location
+     rest_inv = np.linalg.inv(bone.rest_matrix_local)
+     pose_delta = rest_inv @ new_local
+     bone.pose_location = pose_delta[:3, 3].copy()
+
+     # Recompute FK for this bone and its subtree
+     if bone.parent is not None:
+         bone._fk(bone.parent.matrix_armature)
+     else:
+         bone._fk(np.eye(4))
+
+
+ # ---------------------------------------------------------------------------
+ # Rotation setter (core retargeting math)
+ # ---------------------------------------------------------------------------
+
+ def set_bone_rotation(
+     src_arm: Armature, src_name: str,
+     dst_arm: Armature, dst_name: str,
+     dst_twist_name: str,
+     correction_quat: np.ndarray,
+     has_twist: bool,
+     xfer_axis: str,
+     transpose: str,
+     mode: str,
+ ) -> None:
+     """
+     Port of Blender's SetBoneRotation().
+     Drives dst bone rotation to match src bone world-space rotation.
+
+     mode: "EULER" | "QUATERNION"
+     xfer_axis: "X" "Y" "Z" "XY" "XZ" "YZ" "XYZ"
+     transpose: "NONE" "ZYX" "ZXY" "XZY" "YZX" "YXZ"
+     """
+     src_bone = src_arm.get_bone(src_name)
+     dst_bone = dst_arm.get_bone(dst_name)
+
+     # ------------------------------------------------------------------
+     # Get source and destination world-space quaternions (current pose)
+     # ------------------------------------------------------------------
+     src_ws_quat = get_bone_ws_quat(src_bone, src_arm)
+     dst_ws_quat = get_bone_ws_quat(dst_bone, dst_arm)
+
+     # Rotation difference: r such that dst_ws @ r ≈ src_ws
+     diff = quat_rotation_difference(dst_ws_quat, src_ws_quat)
+
+     # FinalQuat = dst_local_pose_delta @ diff @ correction
+     final_quat = quat_normalize(
+         quat_mul(quat_mul(dst_bone.pose_rotation_quat, diff), correction_quat)
+     )
+
+     # ------------------------------------------------------------------
+     # Apply axis masking / transpose (EULER mode)
+     # ------------------------------------------------------------------
+     if mode == "EULER":
+         euler = quat_to_euler(final_quat, order="XYZ")
+
+         # Transpose axes
+         if transpose == "ZYX":
+             euler = np.array([euler[2], euler[1], euler[0]])
+         elif transpose == "ZXY":
+             euler = np.array([euler[2], euler[0], euler[1]])
+         elif transpose == "XZY":
+             euler = np.array([euler[0], euler[2], euler[1]])
+         elif transpose == "YZX":
+             euler = np.array([euler[1], euler[2], euler[0]])
+         elif transpose == "YXZ":
+             euler = np.array([euler[1], euler[0], euler[2]])
+         # else NONE — no change
+
+         # Mask axes
+         if xfer_axis == "X":
+             euler[1] = 0.0; euler[2] = 0.0
+         elif xfer_axis == "Y":
+             euler[0] = 0.0; euler[2] = 0.0
+         elif xfer_axis == "Z":
+             euler[0] = 0.0; euler[1] = 0.0
+         elif xfer_axis == "XY":
+             euler[2] = 0.0
+         elif xfer_axis == "XZ":
+             euler[1] = 0.0
+         elif xfer_axis == "YZ":
+             euler[0] = 0.0
+         # XYZ → no masking
+
+         final_quat = euler_to_quat(euler[0], euler[1], euler[2], order="XYZ")
+
+         # Twist bone: peel the Y rotation off onto the twist bone
+         if has_twist and dst_twist_name:
+             twist_bone = dst_arm.get_bone(dst_twist_name)
+             euler_no_y = quat_to_euler(final_quat, order="XYZ")
+             y_euler = euler_no_y[1]
+             # Remove Y from the main bone
+             euler_no_y[1] = 0.0
+             final_quat = euler_to_quat(*euler_no_y, order="XYZ")
+             # Apply Y to the twist bone (all Euler values stay in radians)
+             twist_euler = quat_to_euler(twist_bone.pose_rotation_quat, order="XYZ")
+             twist_euler[1] = y_euler
+             twist_bone.pose_rotation_quat = euler_to_quat(*twist_euler, order="XYZ")
+
+     else:  # QUATERNION
+         if final_quat[0] < 0:
+             final_quat = -final_quat
+         final_quat = quat_normalize(final_quat)
+
+     dst_bone.pose_rotation_quat = final_quat
+
+     # Recompute FK
+     parent = dst_bone.parent
+     dst_bone._fk(parent.matrix_armature if parent else np.eye(4))
+
+
+ # ---------------------------------------------------------------------------
+ # Position setter
+ # ---------------------------------------------------------------------------
+
+ def set_bone_position(
+     src_arm: Armature, src_name: str,
+     dst_arm: Armature, dst_name: str,
+     dst_twist_name: str,
+     correction: np.ndarray,
+     gain: float,
+ ) -> None:
+     """
+     Port of Blender's SetBonePosition().
+     Moves dst bone to match src bone world-space position, with offset/gain.
+     """
+     src_bone = src_arm.get_bone(src_name)
+     dst_bone = dst_arm.get_bone(dst_name)
+
+     target_ws = get_bone_position_ws(src_bone, src_arm)
+     set_bone_position_ws(dst_bone, dst_arm, target_ws)
+
+     # Apply correction and gain to pose_location
+     dst_bone.pose_location[0] = (dst_bone.pose_location[0] + correction[0]) * gain
+     dst_bone.pose_location[1] = (dst_bone.pose_location[1] + correction[1]) * gain
+     dst_bone.pose_location[2] = (dst_bone.pose_location[2] + correction[2]) * gain
+
+     parent = dst_bone.parent
+     dst_bone._fk(parent.matrix_armature if parent else np.eye(4))
+
+
+ # ---------------------------------------------------------------------------
+ # Pole bone position setter
+ # ---------------------------------------------------------------------------
+
+ def set_bone_position_pole(
+     src_arm: Armature, src_name: str,
+     dst_arm: Armature, dst_name: str,
+     dst_twist_name: str,
+     pole_distance: float,
+ ) -> None:
+     """
+     Port of Blender's SetBonePositionPole().
+     Positions an IK pole target relative to source limb geometry.
+     """
+     src_bone = src_arm.get_bone(src_name)
+     dst_bone = dst_arm.get_bone(dst_name)
+
+     parent_src = src_bone.parent_recursive[0] if src_bone.parent_recursive else src_bone
+
+     base_parent_ws = get_bone_position_ws(parent_src, src_arm)
+     base_child_ws = get_bone_position_ws(src_bone, src_arm)
+
+     # Tail = head + Y-axis direction of bone in world space
+     src_ws_mat = src_arm.world_matrix @ src_bone.matrix_armature
+     tail_ws = src_ws_mat[:3, 3] + src_ws_mat[:3, :3] @ np.array([0.0, 1.0, 0.0])
+
+     length_parent = np.linalg.norm(base_child_ws - base_parent_ws)
+     length_child = np.linalg.norm(tail_ws - base_child_ws)
+     total = length_parent + length_child
+
+     c_p_ratio = length_parent / total if total > 1e-12 else 0.5
+
+     length_pp_to_tail = np.linalg.norm(base_parent_ws - tail_ws)
+     average_location = get_point_on_vector(base_parent_ws, tail_ws, length_pp_to_tail * c_p_ratio)
+
+     distance = np.linalg.norm(base_child_ws - average_location)
+
+     if distance > 0.001:
+         pole_pos = get_point_on_vector(base_child_ws, average_location, pole_distance)
+         set_bone_position_ws(dst_bone, dst_arm, pole_pos)
+         parent = dst_bone.parent
+         dst_bone._fk(parent.matrix_armature if parent else np.eye(4))
+
293
+
294
+ # ---------------------------------------------------------------------------
295
+ # Scale setter
296
+ # ---------------------------------------------------------------------------
297
+
298
+ def set_bone_scale(
299
+ src_arm: Armature, src_name: str,
300
+ dst_arm: Armature, dst_name: str,
301
+ src_scale_bone_name: str,
302
+ gain: float,
303
+ axis: str,
304
+ max_scale: float,
305
+ min_scale: float,
306
+ ) -> None:
307
+ """
308
+ Port of Blender's SetBoneScale().
309
+ Scales dst bone based on dot product between two source bone quaternions.
310
+ """
311
+ src_bone = src_arm.get_bone(src_name)
312
+ dst_bone = dst_arm.get_bone(dst_name)
313
+ secondary = src_arm.get_bone(src_scale_bone_name)
314
+
315
+ q1 = get_bone_ws_quat(src_bone, src_arm)
316
+ q2 = get_bone_ws_quat(secondary, src_arm)
317
+ amount = quat_dot(q1, q2) * gain
318
+
319
+ if amount < 0:
320
+ amount = -amount
321
+ amount = max(min_scale, min(max_scale, amount))
322
+
323
+ s = dst_bone.pose_scale
324
+ if axis == "X":
325
+ s[0] = amount
326
+ elif axis == "Y":
327
+ s[1] = amount
328
+ elif axis == "Z":
329
+ s[2] = amount
330
+ elif axis == "XY":
331
+ s[0] = s[1] = amount
332
+ elif axis == "XZ":
333
+ s[0] = s[2] = amount
334
+ elif axis == "YZ":
335
+ s[1] = s[2] = amount
336
+ else: # XYZ
337
+ s[:] = amount
338
+
339
+ parent = dst_bone.parent
340
+ dst_bone._fk(parent.matrix_armature if parent else np.eye(4))
341
+
342
+
343
+ # ---------------------------------------------------------------------------
344
+ # Correction calculators
345
+ # ---------------------------------------------------------------------------
346
+
347
+ def calc_rotation_offset(
348
+ item: BoneMappingItem,
349
+ src_arm: Armature,
350
+ dst_arm: Armature,
351
+ settings: KeeMapSettings,
352
+ ) -> None:
353
+ """
354
+ Auto-compute the rotation correction factor for one bone mapping.
355
+ Port of Blender's CalcRotationOffset().
356
+ Modifies item.correction_factor and item.quat_correction_factor in-place.
357
+ """
358
+ if not item.source_bone_name or not item.destination_bone_name:
359
+ return
360
+ if not src_arm.has_bone(item.source_bone_name):
361
+ return
362
+ if not dst_arm.has_bone(item.destination_bone_name):
363
+ return
364
+
365
+ dst_bone = dst_arm.get_bone(item.destination_bone_name)
366
+
367
+ # Snapshot destination bone state
368
+ snap_r = dst_bone.pose_rotation_quat.copy()
369
+ snap_t = dst_bone.pose_location.copy()
370
+
371
+ starting_ws_quat = get_bone_ws_quat(dst_bone, dst_arm)
372
+
373
+ # Apply with identity correction
374
+ set_bone_rotation(
375
+ src_arm, item.source_bone_name,
376
+ dst_arm, item.destination_bone_name,
377
+ item.twist_bone_name,
378
+ quat_identity(),
379
+ False,
380
+ item.bone_rotation_application_axis,
381
+ item.bone_transpose_axis,
382
+ settings.bone_rotation_mode,
383
+ )
384
+ dst_arm.update_fk()
385
+
386
+ modified_ws_quat = get_bone_ws_quat(dst_bone, dst_arm)
387
+
388
+ # Correction = rotation that takes modified_ws back to starting_ws
389
+ q_diff = quat_rotation_difference(modified_ws_quat, starting_ws_quat)
390
+ euler = quat_to_euler(q_diff, order="XYZ")
391
+ item.correction_factor = euler.copy()
392
+ item.quat_correction_factor = q_diff.copy()
393
+
394
+ # Restore
395
+ dst_bone.pose_rotation_quat = snap_r
396
+ dst_bone.pose_location = snap_t
397
+ parent = dst_bone.parent
398
+ dst_bone._fk(parent.matrix_armature if parent else np.eye(4))
399
+
400
+
401
+ def calc_location_offset(
402
+ item: BoneMappingItem,
403
+ src_arm: Armature,
404
+ dst_arm: Armature,
405
+ ) -> None:
406
+ """
407
+ Auto-compute position correction for one bone mapping.
408
+ Port of Blender's CalcLocationOffset().
409
+ """
410
+ if not item.source_bone_name or not item.destination_bone_name:
411
+ return
412
+ if not src_arm.has_bone(item.source_bone_name):
413
+ return
414
+ if not dst_arm.has_bone(item.destination_bone_name):
415
+ return
416
+
417
+ src_bone = src_arm.get_bone(item.source_bone_name)
418
+ dst_bone = dst_arm.get_bone(item.destination_bone_name)
419
+
420
+ source_ws_pos = get_bone_position_ws(src_bone, src_arm)
421
+ dest_ws_pos = get_bone_position_ws(dst_bone, dst_arm)
422
+
423
+ # Snapshot
424
+ snap_loc = dst_bone.pose_location.copy()
425
+
426
+ # Move dest to source position
427
+ set_bone_position_ws(dst_bone, dst_arm, source_ws_pos)
428
+ dst_arm.update_fk()
429
+ moved_pose_loc = dst_bone.pose_location.copy()
430
+
431
+ # Restore
432
+ set_bone_position_ws(dst_bone, dst_arm, dest_ws_pos)
433
+ dst_arm.update_fk()
434
+
435
+ delta = snap_loc - moved_pose_loc
436
+ item.position_correction_factor = delta.copy()
437
+
438
+
439
+ def calc_all_corrections(
440
+ bone_items: List[BoneMappingItem],
441
+ src_arm: Armature,
442
+ dst_arm: Armature,
443
+ settings: KeeMapSettings,
444
+ ) -> None:
445
+ """Auto-calculate rotation and position corrections for all mapped bones."""
446
+ for item in bone_items:
447
+ calc_rotation_offset(item, src_arm, dst_arm, settings)
448
+ if "pole" not in item.name.lower():
449
+ calc_location_offset(item, src_arm, dst_arm)
450
+
451
+
452
+ # ---------------------------------------------------------------------------
453
+ # Single-frame transfer
454
+ # ---------------------------------------------------------------------------
455
+
456
+ def transfer_frame(
457
+ src_arm: Armature,
458
+ dst_arm: Armature,
459
+ bone_items: List[BoneMappingItem],
460
+ settings: KeeMapSettings,
461
+ ) -> Dict[str, Tuple[np.ndarray, np.ndarray, np.ndarray]]:
462
+ """
463
+ Apply retargeting for all bone mappings at the current source frame.
464
+ src_arm must already have FK updated for the current frame.
465
+
466
+ Returns a dict of bone_name → (pose_location, pose_rotation_quat, pose_scale)
467
+ suitable for writing into a keyframe list.
468
+ """
469
+ for item in bone_items:
470
+ if not item.source_bone_name or not item.destination_bone_name:
471
+ continue
472
+ if not src_arm.has_bone(item.source_bone_name):
473
+ continue
474
+ if not dst_arm.has_bone(item.destination_bone_name):
475
+ continue
476
+
477
+ # Build correction quaternion
478
+ if settings.bone_rotation_mode == "EULER":
479
+ cf = item.correction_factor
480
+ correction_quat = euler_to_quat(cf[0], cf[1], cf[2], order="XYZ")
481
+ else:
482
+ correction_quat = quat_normalize(item.quat_correction_factor)
483
+
484
+ # Rotation
485
+ if item.set_bone_rotation:
486
+ set_bone_rotation(
487
+ src_arm, item.source_bone_name,
488
+ dst_arm, item.destination_bone_name,
489
+ item.twist_bone_name,
490
+ correction_quat,
491
+ item.has_twist_bone,
492
+ item.bone_rotation_application_axis,
493
+ item.bone_transpose_axis,
494
+ settings.bone_rotation_mode,
495
+ )
496
+ dst_arm.update_fk()
497
+
498
+ # Position
499
+ if item.set_bone_position:
500
+ if item.postion_type == "SINGLE_BONE_OFFSET":
501
+ set_bone_position(
502
+ src_arm, item.source_bone_name,
503
+ dst_arm, item.destination_bone_name,
504
+ item.twist_bone_name,
505
+ item.position_correction_factor,
506
+ item.position_gain,
507
+ )
508
+ else:
509
+ set_bone_position_pole(
510
+ src_arm, item.source_bone_name,
511
+ dst_arm, item.destination_bone_name,
512
+ item.twist_bone_name,
513
+ -item.position_pole_distance,
514
+ )
515
+ dst_arm.update_fk()
516
+
517
+ # Scale
518
+ if item.set_bone_scale and item.scale_secondary_bone_name:
519
+ if src_arm.has_bone(item.scale_secondary_bone_name):
520
+ set_bone_scale(
521
+ src_arm, item.source_bone_name,
522
+ dst_arm, item.destination_bone_name,
523
+ item.scale_secondary_bone_name,
524
+ item.scale_gain,
525
+ item.bone_scale_application_axis,
526
+ item.scale_max,
527
+ item.scale_min,
528
+ )
529
+ dst_arm.update_fk()
530
+
531
+ # Snapshot destination bone state for this frame
532
+ result: Dict[str, Tuple[np.ndarray, np.ndarray, np.ndarray]] = {}
533
+ for item in bone_items:
534
+ if not item.destination_bone_name:
535
+ continue
536
+ if not dst_arm.has_bone(item.destination_bone_name):
537
+ continue
538
+ dst_bone = dst_arm.get_bone(item.destination_bone_name)
539
+ result[item.destination_bone_name] = (
540
+ dst_bone.pose_location.copy(),
541
+ dst_bone.pose_rotation_quat.copy(),
542
+ dst_bone.pose_scale.copy(),
543
+ )
544
+ return result
545
+
546
+
547
+ # ---------------------------------------------------------------------------
548
+ # Full animation transfer
549
+ # ---------------------------------------------------------------------------
550
+
551
+ def transfer_animation(
552
+ src_anim, # BVHAnimation or any object with .armature + .apply_frame(i) + .num_frames
553
+ dst_arm: Armature,
554
+ bone_items: List[BoneMappingItem],
555
+ settings: KeeMapSettings,
556
+ ) -> List[Dict[str, Tuple[np.ndarray, np.ndarray, np.ndarray]]]:
557
+ """
558
+ Transfer all frames from src_anim to dst_arm.
559
+ Returns list of keyframe dicts (one per frame sampled).
560
+
561
+ Equivalent to Blender's PerformAnimationTransfer operator.
562
+ """
563
+ keyframes: List[Dict] = []
564
+ step = max(1, settings.keyframe_every_n_frames)
565
+ start = settings.start_frame_to_apply
566
+ total = settings.number_of_frames_to_apply
567
+ end = start + total
568
+
569
+ src_arm = src_anim.armature
570
+
571
+ i = start
572
+ n_steps = len(range(start, end, step))
573
+ step_i = 0
574
+ while i < end and i < src_anim.num_frames:
575
+ src_anim.apply_frame(i) # updates src_arm FK
576
+ dst_arm.update_fk()
577
+
578
+ frame_data = transfer_frame(src_arm, dst_arm, bone_items, settings)
579
+ keyframes.append(frame_data)
580
+
581
+ step_i += 1
582
+ _update_progress("Retargeting", step_i / n_steps)
583
+ i += step
584
+
585
+ _update_progress("Retargeting", 1.0)
586
+ return keyframes
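The auto-correction pass above hinges on `quat_rotation_difference`, which lives in the library's `math3d` module (not shown in this diff). A standalone sketch of the same operation using scipy — the function name, argument order, and WXYZ layout here are illustrative assumptions, not the module's actual API:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def rotation_difference(q_from_wxyz, q_to_wxyz):
    # q_diff such that q_diff * q_from == q_to, mirroring mathutils'
    # Quaternion.rotation_difference() used by the original KeeMap code.
    r_from = Rotation.from_quat(np.roll(q_from_wxyz, -1))  # WXYZ -> XYZW
    r_to = Rotation.from_quat(np.roll(q_to_wxyz, -1))
    diff = r_to * r_from.inv()
    x, y, z, w = diff.as_quat()
    return np.array([w, x, y, z])

q_a = np.array([1.0, 0.0, 0.0, 0.0])                              # identity
q_b = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])  # 90° about Z
d = rotation_difference(q_a, q_b)
# Composing d on top of q_a reproduces q_b's rotation.
```

In `calc_rotation_offset` the same difference is taken between the destination bone's world-space orientation before and after applying the source rotation with an identity correction.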
Retarget/search.py ADDED
@@ -0,0 +1,159 @@
+ """
+ search.py
+ Stream TeoGchx/HumanML3D from HuggingFace and match motions by keyword.
+
+ Dataset: https://huggingface.co/datasets/TeoGchx/HumanML3D
+ Format:  motion column is [T, 263] inline in parquet (standard HumanML3D)
+ Splits:  train (23 384), val (1 460), test (4 384)
+
+ Usage
+ -----
+     from Retarget.search import search_motions
+
+     results = search_motions("a person walks forward", top_k=5)
+     for r in results:
+         print(r["caption"], r["frames"], "frames")
+         # r["motion"] → np.ndarray [T, 263]
+ """
+ from __future__ import annotations
+ import re
+ from typing import List, Optional
+ import numpy as np
+
+
+ # ─────────────────────────────────────────────────────────────────────────────
+ # Caption cleaning
+ # ─────────────────────────────────────────────────────────────────────────────
+
+ _SEP = re.compile(r'#|\|')
+ _POS_TAG = re.compile(r'^(?:[A-Z]{1,4}\s*)+$')  # lines that look like POS tags
+
+
+ def _clean_caption(raw: str) -> str:
+    """
+    HumanML3D captions are stored as multiple sentences joined by '#',
+    sometimes followed by POS tag strings. Return the first human-readable
+    sentence.
+    """
+    parts = _SEP.split(raw)
+    for part in parts:
+        part = part.strip()
+        if not part:
+            continue
+        words = part.split()
+        # Skip if >50 % of tokens look like POS tags (all-caps, ≤4 chars)
+        pos_count = sum(1 for w in words if w.isupper() and len(w) <= 4)
+        if len(words) > 0 and pos_count / len(words) < 0.5:
+            return part
+    return parts[0].strip() if parts else raw.strip()
+
+
+ # ─────────────────────────────────────────────────────────────────────────────
+ # Search
+ # ─────────────────────────────────────────────────────────────────────────────
+
+ def search_motions(
+    query: str,
+    top_k: int = 8,
+    split: str = "test",
+    max_scan: int = 4384,
+    cached: bool = False,
+ ) -> List[dict]:
+    """
+    Stream TeoGchx/HumanML3D and return up to top_k motions matching query.
+
+    Parameters
+    ----------
+    query     Natural-language description, e.g. "a person walks forward"
+    top_k     Maximum number of results to return
+    split     Dataset split — "test" (4 384 rows) is fastest to stream
+    max_scan  Hard cap on rows examined before returning
+
+    Returns
+    -------
+    List of dicts, sorted by relevance score (descending):
+        caption   str         clean human-readable description
+        motion    np.ndarray  shape [T, 263], standard HumanML3D features
+        frames    int         number of frames (T)
+        duration  float       duration in seconds (at 20 fps)
+        name      str         original clip ID from dataset
+        score     int         keyword match score
+    """
+    try:
+        from datasets import load_dataset
+    except ImportError:
+        raise ImportError(
+            "pip install datasets (HuggingFace datasets library required)"
+        )
+
+    if cached:
+        # Downloads the split once (~400MB) and caches to ~/.cache/huggingface.
+        # Subsequent calls are instant. Use for local dev / testing.
+        ds = load_dataset("TeoGchx/HumanML3D", split=split)
+    else:
+        # Streaming: no disk cache, re-downloads each run. Good for server use.
+        ds = load_dataset("TeoGchx/HumanML3D", split=split, streaming=True)
+
+    # Tokenise query; remove punctuation
+    query_words = re.sub(r"[^\w\s]", "", query.lower()).split()
+    if not query_words:
+        return []
+
+    results: List[dict] = []
+    scanned = 0
+
+    for row in ds:
+        if scanned >= max_scan:
+            break
+        scanned += 1
+
+        caption_raw = row.get("caption", "") or ""
+        caption_clean = _clean_caption(caption_raw)
+        caption_lower = caption_clean.lower()
+
+        # Score: word-boundary matches count 2, substring matches count 1
+        score = 0
+        for kw in query_words:
+            if kw in caption_lower:
+                if re.search(r"\b" + re.escape(kw) + r"\b", caption_lower):
+                    score += 2
+                else:
+                    score += 1
+
+        if score == 0:
+            continue
+
+        motion_raw = row.get("motion")
+        if motion_raw is None:
+            continue
+
+        motion = np.array(motion_raw, dtype=np.float32)  # [T, 263]
+        meta = row.get("meta_data") or {}
+
+        T = motion.shape[0]
+        frames = int(meta.get("num_frames", T))
+        duration = float(meta.get("duration", T / 20.0))
+
+        results.append({
+            "caption": caption_clean,
+            "motion": motion,
+            "frames": frames,
+            "duration": duration,
+            "name": str(meta.get("name", "")),
+            "score": score,
+        })
+
+        # Stop as soon as we have top_k results
+        if len(results) >= top_k:
+            break
+
+    results.sort(key=lambda x: -x["score"])
+    return results[:top_k]
+
+
+ def format_choice_label(result: dict) -> str:
+    """Short label for Gradio Radio component."""
+    caption = result["caption"]
+    if len(caption) > 72:
+        caption = caption[:72] + "…"
+    return f"{caption} ({result['frames']} frames, {result['duration']:.1f}s)"
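The scoring rule inside `search_motions` (word-boundary hit counts 2, bare substring hit counts 1) can be isolated into a self-contained sketch for quick offline testing — `keyword_score` is a hypothetical helper, not part of the module:

```python
import re

def keyword_score(query: str, caption: str) -> int:
    # Same rule as search_motions: a keyword that matches on a word
    # boundary scores 2; a keyword found only as a substring scores 1.
    caption = caption.lower()
    score = 0
    for kw in re.sub(r"[^\w\s]", "", query.lower()).split():
        if kw in caption:
            score += 2 if re.search(r"\b" + re.escape(kw) + r"\b", caption) else 1
    return score

# "walk" is only a substring of "walks" (1); "forward" matches whole (2).
print(keyword_score("walk forward", "a person walks forward slowly"))  # → 3
```

Because streaming stops as soon as `top_k` rows have a non-zero score, early rows with weak substring matches can crowd out later, stronger matches; raising `max_scan` trades latency for recall.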
Retarget/skeleton.py ADDED
@@ -0,0 +1,165 @@
+ """
+ skeleton.py
+ Pure-Python armature / pose-bone system.
+
+ Design matches Blender's pose-mode semantics:
+   - bone.rest_matrix_local  = 4×4 rest pose in parent space (edit-mode)
+   - bone.pose_rotation_quat = local rotation DELTA from rest (≡ bone.rotation_quaternion)
+   - bone.pose_location      = local translation DELTA from rest (≡ bone.location)
+   - bone.pose_scale         = local scale (≡ bone.scale)
+   - bone.matrix_armature    = FK-computed 4×4 in armature space (≡ bone.matrix in pose mode)
+
+ Armature.world_matrix corresponds to arm.matrix_world.
+ """
+ from __future__ import annotations
+ import numpy as np
+ from typing import Dict, List, Optional, Tuple
+ from .math3d import (
+    quat_identity, quat_normalize, quat_mul,
+    quat_to_matrix4, matrix4_to_quat,
+    translation_matrix, scale_matrix, trs_to_matrix4, matrix4_to_trs,
+    vec3,
+ )
+
+
+ class PoseBone:
+    def __init__(
+        self,
+        name: str,
+        rest_matrix_local: np.ndarray,  # 4×4, in parent local space
+        parent: Optional["PoseBone"] = None,
+    ):
+        self.name = name
+        self.parent: Optional[PoseBone] = parent
+        self.children: List[PoseBone] = []
+        self.rest_matrix_local: np.ndarray = rest_matrix_local.copy()
+
+        # Pose state — start at rest (delta = identity)
+        self.pose_rotation_quat: np.ndarray = quat_identity()
+        self.pose_location: np.ndarray = vec3()
+        self.pose_scale: np.ndarray = np.ones(3)
+
+        # Cached FK result — call armature.update_fk() to refresh
+        self._matrix_armature: np.ndarray = np.eye(4)
+
+    # -----------------------------------------------------------------------
+    # Properties
+    # -----------------------------------------------------------------------
+
+    @property
+    def matrix_armature(self) -> np.ndarray:
+        """4×4 FK result in armature space. Refresh with armature.update_fk()."""
+        return self._matrix_armature
+
+    @property
+    def head(self) -> np.ndarray:
+        """Bone head position in armature space."""
+        return self._matrix_armature[:3, 3].copy()
+
+    @property
+    def tail(self) -> np.ndarray:
+        """
+        Approximate tail position (Y-axis in bone space, length 1).
+        Works for Y-along-bone convention (Blender / BVH default).
+        """
+        y_axis = self._matrix_armature[:3, :3] @ np.array([0.0, 1.0, 0.0])
+        return self._matrix_armature[:3, 3] + y_axis
+
+    # -----------------------------------------------------------------------
+    # FK
+    # -----------------------------------------------------------------------
+
+    def _compute_local_matrix(self) -> np.ndarray:
+        """rest_local @ T(pose_loc) @ R(pose_rot) @ S(pose_scale)."""
+        T = translation_matrix(self.pose_location)
+        R = quat_to_matrix4(self.pose_rotation_quat)
+        S = scale_matrix(self.pose_scale)
+        return self.rest_matrix_local @ T @ R @ S
+
+    def _fk(self, parent_matrix: np.ndarray) -> None:
+        self._matrix_armature = parent_matrix @ self._compute_local_matrix()
+        for child in self.children:
+            child._fk(self._matrix_armature)
+
+    # -----------------------------------------------------------------------
+    # Parent-chain helpers (Blender: bone.parent_recursive)
+    # -----------------------------------------------------------------------
+
+    @property
+    def parent_recursive(self) -> List["PoseBone"]:
+        chain: List[PoseBone] = []
+        cur = self.parent
+        while cur is not None:
+            chain.append(cur)
+            cur = cur.parent
+        return chain
+
+
+ class Armature:
+    """
+    Collection of PoseBones with a world transform.
+    Corresponds to a Blender armature object.
+    """
+
+    def __init__(self, name: str = "Armature"):
+        self.name = name
+        self.world_matrix: np.ndarray = np.eye(4)  # arm.matrix_world
+        self._bones: Dict[str, PoseBone] = {}
+        self._roots: List[PoseBone] = []
+
+    # -----------------------------------------------------------------------
+    # Construction helpers
+    # -----------------------------------------------------------------------
+
+    def add_bone(self, bone: PoseBone, parent_name: Optional[str] = None) -> PoseBone:
+        self._bones[bone.name] = bone
+        if parent_name and parent_name in self._bones:
+            parent = self._bones[parent_name]
+            bone.parent = parent
+            parent.children.append(bone)
+        elif bone.parent is None:
+            self._roots.append(bone)
+        return bone
+
+    @property
+    def pose_bones(self) -> Dict[str, PoseBone]:
+        return self._bones
+
+    def get_bone(self, name: str) -> PoseBone:
+        if name not in self._bones:
+            raise KeyError(f"Bone '{name}' not found in armature '{self.name}'")
+        return self._bones[name]
+
+    def has_bone(self, name: str) -> bool:
+        return name in self._bones
+
+    # -----------------------------------------------------------------------
+    # FK update
+    # -----------------------------------------------------------------------
+
+    def update_fk(self) -> None:
+        """Recompute all bone armature-space matrices via FK."""
+        for root in self._roots:
+            root._fk(np.eye(4))
+
+    # -----------------------------------------------------------------------
+    # Snapshot / restore (for calc-correction passes)
+    # -----------------------------------------------------------------------
+
+    def snapshot(self) -> Dict[str, Tuple[np.ndarray, np.ndarray, np.ndarray]]:
+        return {
+            name: (
+                bone.pose_rotation_quat.copy(),
+                bone.pose_location.copy(),
+                bone.pose_scale.copy(),
+            )
+            for name, bone in self._bones.items()
+        }
+
+    def restore(self, snap: Dict[str, Tuple[np.ndarray, np.ndarray, np.ndarray]]) -> None:
+        for name, (r, t, s) in snap.items():
+            if name in self._bones:
+                self._bones[name].pose_rotation_quat = r.copy()
+                self._bones[name].pose_location = t.copy()
+                self._bones[name].pose_scale = s.copy()
+        self.update_fk()
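The FK convention above (`world = parent_world @ rest_local @ T @ R @ S`) can be sanity-checked with a minimal two-bone sketch in plain numpy — rotation and scale deltas are left at identity here for brevity, so only the translation terms matter:

```python
import numpy as np

def translation(v):
    # 4×4 homogeneous translation matrix, as math3d.translation_matrix would build.
    m = np.eye(4)
    m[:3, 3] = v
    return m

# Two-bone chain: bone B rests 1 unit up (+Y) from bone A's head.
rest_a = translation([0.0, 0.0, 0.0])
rest_b = translation([0.0, 1.0, 0.0])

# Pose delta: bone B translated a further +0.5 along its local Y.
pose_b = translation([0.0, 0.5, 0.0])

# FK exactly as PoseBone._fk composes it (R and S omitted = identity):
world_a = np.eye(4) @ rest_a
world_b = world_a @ (rest_b @ pose_b)

head_b = world_b[:3, 3]  # bone B's head ends up at y = 1.5
```

Because pose values are deltas from rest, a freshly constructed armature with all-identity deltas reproduces the rest pose, matching Blender's pose-mode behaviour.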
Retarget/smpl.py ADDED
@@ -0,0 +1,184 @@
+ """
+ smpl.py
+ ───────────────────────────────────────────────────────────────────────────────
+ Parse HumanML3D [T, 263] feature vectors into structured SMPL motion data.
+
+ HumanML3D 263-dim layout per frame
+     [0]       root angular-velocity (Y-axis, rad/frame)
+     [1]       root height Y (metres)
+     [2:4]     root XZ velocity (local-frame, metres/frame)
+     [4:67]    joint local positions   joints 1-21 relative to root, 21×3 (unused here)
+     [67:193]  6D joint rotations      joints 1-21, 21×6
+     [193:259] joint velocities        joints 0-21, 22×3 (unused here)
+     [259:263] foot contact flags      (unused here)
+
+ Root rotation  = cumulative integral of dim[0] → Y-axis quaternion.
+ Root position  = dim[1] (height) + integrated XZ velocity.
+ Joint 1-21 rot = dims 67:193 as 6D continuous rotation representation
+     [Zhou et al. 2019] → Gram-Schmidt → 3×3 rotation matrix → quaternion.
+     These are LOCAL rotations relative to the SMPL parent joint's rest
+     frame, where the canonical T-pose is the zero (identity) rotation.
+ """
+ from __future__ import annotations
+ import numpy as np
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # 6D rotation helpers
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ def rot6d_to_matrix(r6d: np.ndarray) -> np.ndarray:
+    """
+    [..., 6] → [..., 3, 3]
+    Reconstructs a rotation matrix from two columns using Gram-Schmidt.
+    The two columns are [a1 = r6d[..., 0:3], a2 = r6d[..., 3:6]].
+    """
+    a1 = r6d[..., 0:3].astype(np.float64)
+    a2 = r6d[..., 3:6].astype(np.float64)
+    b1 = a1 / (np.linalg.norm(a1, axis=-1, keepdims=True) + 1e-12)
+    b2 = a2 - (b1 * a2).sum(axis=-1, keepdims=True) * b1
+    b2 = b2 / (np.linalg.norm(b2, axis=-1, keepdims=True) + 1e-12)
+    b3 = np.cross(b1, b2)
+    return np.stack([b1, b2, b3], axis=-1)  # columns → [..., 3, 3]
+
+
+ def matrix_to_quat(mat: np.ndarray) -> np.ndarray:
+    """
+    [..., 3, 3] → [..., 4] WXYZ quaternion, positive-W convention.
+    Uses scipy for numerical stability.
+    """
+    from scipy.spatial.transform import Rotation
+    shape = mat.shape[:-2]
+    flat = mat.reshape(-1, 3, 3).astype(np.float64)
+    xyzw = Rotation.from_matrix(flat).as_quat()  # scipy → XYZW
+    wxyz = xyzw[:, [3, 0, 1, 2]].astype(np.float32)
+    wxyz[wxyz[:, 0] < 0] *= -1  # positive-W
+    return wxyz.reshape(*shape, 4)
+
+
+ def rot6d_to_quat(r6d: np.ndarray) -> np.ndarray:
+    """[..., 6] → [..., 4] WXYZ. Convenience: 6D → matrix → quaternion."""
+    return matrix_to_quat(rot6d_to_matrix(r6d))
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # Root motion recovery
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ def _qrot_vec(q: np.ndarray, v: np.ndarray) -> np.ndarray:
+    """Rotate [N, 3] vectors by [N, 4] WXYZ quaternions (batch)."""
+    w, x, y, z = q[:, 0:1], q[:, 1:2], q[:, 2:3], q[:, 3:4]
+    vx, vy, vz = v[:, 0:1], v[:, 1:2], v[:, 2:3]
+    # Rodrigues-style: v + 2w*(q.xyz × v) + 2*(q.xyz × (q.xyz × v))
+    tx = 2 * (y * vz - z * vy)
+    ty = 2 * (z * vx - x * vz)
+    tz = 2 * (x * vy - y * vx)
+    return np.concatenate([
+        vx + w * tx + y * tz - z * ty,
+        vy + w * ty + z * tx - x * tz,
+        vz + w * tz + x * ty - y * tx,
+    ], axis=-1)
+
+
+ def recover_root_motion(data: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
+    """
+    Recover root world-space position and rotation from [T, 263] features.
+
+    Returns
+    -------
+    root_pos : [T, 3] world-space root position (Y = height above ground)
+    root_rot : [T, 4] WXYZ quaternion — Y-axis only (global facing direction)
+    """
+    T = data.shape[0]
+
+    # Facing direction: integrate Y-axis angular velocity
+    theta = np.cumsum(data[:, 0].astype(np.float32))
+    half = theta * 0.5
+    root_rot = np.zeros((T, 4), dtype=np.float32)
+    root_rot[:, 0] = np.cos(half)
+    root_rot[:, 2] = np.sin(half)
+
+    # XZ velocity encoded in root-local frame → world frame
+    vel_local = np.stack([
+        data[:, 2].astype(np.float32),
+        np.zeros(T, dtype=np.float32),
+        data[:, 3].astype(np.float32),
+    ], axis=-1)
+    vel_world = _qrot_vec(root_rot, vel_local)
+
+    root_pos = np.zeros((T, 3), dtype=np.float32)
+    root_pos[:, 0] = np.cumsum(vel_world[:, 0])
+    root_pos[:, 1] = data[:, 1]
+    root_pos[:, 2] = np.cumsum(vel_world[:, 2])
+
+    return root_pos, root_rot
+
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # SMPLMotion container
+ # ──────────────────────────────────────────────────────────────────────────────
+
+ class SMPLMotion:
+    """
+    Structured SMPL motion data parsed from a single HumanML3D clip.
+
+    Attributes
+    ----------
+    root_pos  : [T, 3]     world-space root position (metres)
+    root_rot  : [T, 4]     WXYZ root Y-axis rotation (global facing)
+    local_rot : [T, 21, 4] WXYZ local quaternions for joints 1-21
+                           T-pose = identity; relative to SMPL parent frame
+    fps       : float      capture frame rate (20 for HumanML3D)
+    """
+
+    def __init__(
+        self,
+        root_pos: np.ndarray,
+        root_rot: np.ndarray,
+        local_rot: np.ndarray,
+        fps: float = 20.0,
+    ):
+        self.root_pos = np.asarray(root_pos, dtype=np.float32)
+        self.root_rot = np.asarray(root_rot, dtype=np.float32)
+        self.local_rot = np.asarray(local_rot, dtype=np.float32)
+        self.fps = float(fps)
+
+    @property
+    def num_frames(self) -> int:
+        return self.root_pos.shape[0]
+
+    def slice(self, start: int = 0, end: int = -1) -> "SMPLMotion":
+        e = end if end > 0 else self.num_frames
+        return SMPLMotion(
+            self.root_pos[start:e],
+            self.root_rot[start:e],
+            self.local_rot[start:e],
+            self.fps,
+        )
+
+
+ def hml3d_to_smpl_motion(data: np.ndarray, fps: float = 20.0) -> SMPLMotion:
+    """
+    Convert HumanML3D [T, 263] feature array to a SMPLMotion.
+
+    Uses the actual 6D rotation data (dims 67:193) — NOT position-derived
+    rotations. This preserves twist and gives physically correct limb poses.
+
+    Parameters
+    ----------
+    data : [T, 263] raw HumanML3D features (e.g. from MoMask or dataset row)
+    fps  : float    frame rate (default 20 = HumanML3D native)
+    """
+    data = np.asarray(data, dtype=np.float32)
+    if data.ndim != 2 or data.shape[1] < 193:
+        raise ValueError(f"Expected [T, >=193] but got {data.shape}")
+
+    T = data.shape[0]
+
+    root_pos, root_rot = recover_root_motion(data)
+
+    # 6D rotations for joints 1-21: dims [67:193] → [T, 21, 6]
+    r6d = data[:, 67:193].reshape(T, 21, 6)
+    local_rot = rot6d_to_quat(r6d)  # [T, 21, 4] WXYZ
+
+    return SMPLMotion(root_pos, root_rot, local_rot, fps)
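A quick single-rotation sanity check of the Gram-Schmidt reconstruction used above — this standalone sketch mirrors the batched `rot6d_to_matrix` for one 6-vector, and the identity rotation (first two basis vectors) should round-trip to the identity matrix:

```python
import numpy as np

def rot6d_to_matrix_single(r6d):
    # Gram-Schmidt the two 3-vectors into an orthonormal frame
    # (the 6D continuous representation of Zhou et al. 2019).
    a1, a2 = r6d[0:3].astype(float), r6d[3:6].astype(float)
    b1 = a1 / np.linalg.norm(a1)
    b2 = a2 - (b1 @ a2) * b1       # remove a2's component along b1
    b2 = b2 / np.linalg.norm(b2)
    b3 = np.cross(b1, b2)          # right-handed third column
    return np.stack([b1, b2, b3], axis=-1)

# Identity rotation encodes as the first two basis vectors: (1,0,0, 0,1,0).
R = rot6d_to_matrix_single(np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0]))
# R equals the 3×3 identity matrix.
```

The representation is continuous (no quaternion sign ambiguity), which is why HumanML3D and most motion models predict it instead of raw quaternions or Euler angles.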
app.py CHANGED
@@ -6,6 +6,7 @@ import shutil
 import traceback
 import json
 import random

 from pathlib import Path

 # ── ZeroGPU: install packages that can't be built at Docker build time ─────────
@@ -130,8 +131,13 @@ _triposg_pipe = None
 _rmbg_net = None
 _rmbg_version = None
 _last_glb_path = None

 _init_seed = random.randint(0, 2**31 - 1)

 ARCFACE_256 = (np.array([[38.2946, 51.6963], [73.5318, 51.5014], [56.0252, 71.7366],
                          [41.5493, 92.3655], [70.7299, 92.2041]], dtype=np.float32)
                * (256 / 112) + (256 - 112 * (256 / 112)) / 2)
@@ -167,6 +173,104 @@ def _ensure_ckpts():

 # ── Model loaders ─────────────────────────────────────────────────────────────

 def load_triposg():
     global _triposg_pipe, _rmbg_net, _rmbg_version
     if _triposg_pipe is not None:
@@ -309,6 +413,197 @@ def load_triposg():
     return _triposg_pipe, _rmbg_net


 # ── Background removal helper ─────────────────────────────────────────────────

 def _remove_bg_rmbg(img_pil, threshold=0.5, erode_px=2):
@@ -343,8 +638,8 @@ def _remove_bg_rmbg(img_pil, threshold=0.5, erode_px=2):

     rgb = np.array(img_pil.convert("RGB"), dtype=np.float32) / 255.0
     alpha = mask[:, :, np.newaxis]
-    comp = (rgb * alpha + 0.5 * (1.0 - alpha) * 255).clip(0, 255).astype(np.uint8)
-    return Image.fromarray(comp)


 def preview_rembg(input_image, do_remove_bg, threshold, erode_px):
@@ -357,6 +652,188 @@ def preview_rembg(input_image, do_remove_bg, threshold, erode_px):
     return input_image


 # ── Stage 1: Shape generation ─────────────────────────────────────────────────

 @spaces.GPU(duration=180)
@@ -365,6 +842,28 @@ def generate_shape(input_image, remove_background, num_steps, guidance_scale,
     if input_image is None:
         return None, "Please upload an image."
     try:

         progress(0.1, desc="Loading TripoSG...")
         pipe, rmbg_net = load_triposg()

@@ -373,16 +872,17 @@ def generate_shape(input_image, remove_background, num_steps, guidance_scale,
         img.save(img_path)

         progress(0.5, desc="Generating shape (SDF diffusion)...")
-        from scripts.inference_triposg import run_triposg
-        mesh = run_triposg(
-            pipe=pipe,
-            image_input=img_path,
-            rmbg_net=rmbg_net if remove_background else None,
-            seed=int(seed),
-            num_inference_steps=int(num_steps),
-            guidance_scale=float(guidance_scale),
-            faces=int(face_count) if int(face_count) > 0 else -1,
-        )

         out_path = "/tmp/triposg_shape.glb"
         mesh.export(out_path)
@@ -609,7 +1109,7 @@ def gradio_rig(glb_state_path, export_fbx_flag, mdm_prompt, mdm_n_frames,
         animated = mdm_result.get("animated_glb")

         parts = ["Rigged: " + os.path.basename(rigged)]
-        if fbx: parts.append("FBX: " + os.path.basename(fbx))
         if animated: parts.append("Animation: " + os.path.basename(animated))

         torch.cuda.empty_cache()
@@ -633,6 +1133,7 @@ def gradio_enhance(glb_path, ref_img_np, do_normal, norm_res, norm_strength,
         from pipeline.enhance_surface import (
             run_stable_normal, run_depth_anything,
             bake_normal_into_glb, bake_depth_as_occlusion,

         )
         import pipeline.enhance_surface as _enh_mod

@@ -704,18 +1205,271 @@ def render_views(glb_file):
         return []


 # ── Full pipeline ─────────────────────────────────────────────────────────────

-def run_full_pipeline(input_image, num_steps, guidance, seed, face_count,
-                      variant, tex_seed, enhance_face,
                       export_fbx, mdm_prompt, mdm_n_frames, progress=gr.Progress()):
     progress(0.0, desc="Stage 1/3: Generating shape...")
-    glb, status = generate_shape(input_image, True, num_steps, guidance, seed, face_count)
     if not glb:
         return None, None, None, None, None, None, status

     progress(0.33, desc="Stage 2/3: Applying texture...")
-    glb, mv_img, status = apply_texture(glb, input_image, True, variant, tex_seed, enhance_face)

     if not glb:
         return None, None, None, None, None, None, status
@@ -727,17 +1481,61 @@ def run_full_pipeline(input_image, num_steps, guidance, seed, face_count,


 # ── UI ────────────────────────────────────────────────────────────────────────
- with gr.Blocks(title="Image2Model") as demo:
 gr.Markdown("# Image2Model — Portrait to Rigged 3D Mesh")
- glb_state = gr.State(None)

- with gr.Tabs():

 # ════════════════════════════════════════════════════════════════════
- with gr.Tab("Generate"):
 with gr.Row():
 with gr.Column(scale=1):
- input_image = gr.Image(label="Input Image", type="numpy")

 with gr.Accordion("Shape Settings", open=True):
 num_steps = gr.Slider(20, 100, value=50, step=5, label="Inference Steps")
@@ -756,9 +1554,11 @@ with gr.Blocks(title="Image2Model") as demo:
 shape_btn = gr.Button("Generate Shape", variant="primary", scale=2, interactive=False)
 texture_btn = gr.Button("Apply Texture", variant="secondary", scale=2)
 render_btn = gr.Button("Render Views", variant="secondary", scale=1)
- run_all_btn = gr.Button("▶ Run Full Pipeline", variant="primary", interactive=False)

 with gr.Column(scale=1):

 status = gr.Textbox(label="Status", lines=3, interactive=False)
 model_3d = gr.Model3D(label="3D Preview", clear_color=[0.9, 0.9, 0.9, 1.0])
 download_file = gr.File(label="Download GLB")
@@ -766,6 +1566,8 @@ with gr.Blocks(title="Image2Model") as demo:
766
 
767
  render_gallery = gr.Gallery(label="Rendered Views", columns=5, height=300)
768
 
 
 
769
  _pipeline_btns = [shape_btn, run_all_btn]
770
 
771
  input_image.upload(
@@ -777,9 +1579,14 @@ with gr.Blocks(title="Image2Model") as demo:
777
  inputs=[], outputs=_pipeline_btns,
778
  )
779
 
 
 
 
 
 
780
  shape_btn.click(
781
- fn=lambda img, ns, gs, sd, fc: generate_shape(img, True, ns, gs, sd, fc),
782
- inputs=[input_image, num_steps, guidance, seed, face_count],
783
  outputs=[glb_state, status],
784
  ).then(
785
  fn=lambda p: (p, p) if p else (None, None),
@@ -787,8 +1594,9 @@ with gr.Blocks(title="Image2Model") as demo:
787
  )
788
 
789
  texture_btn.click(
790
- fn=lambda glb, img, v, ts, ef: apply_texture(glb, img, True, v, ts, ef),
791
- inputs=[glb_state, input_image, variant, tex_seed, enhance_face_check],
 
792
  outputs=[glb_state, multiview_img, status],
793
  ).then(
794
  fn=lambda p: (p, p) if p else (None, None),
@@ -797,6 +1605,29 @@ with gr.Blocks(title="Image2Model") as demo:

             render_btn.click(fn=render_views, inputs=[download_file], outputs=[render_gallery])

         # ════════════════════════════════════════════════════════════════════
         with gr.Tab("Rig & Export"):
             with gr.Row():
@@ -844,6 +1675,9 @@ with gr.Blocks(title="Image2Model") as demo:
                 inputs=[glb_state, export_fbx_check, mdm_prompt_box, mdm_frames_slider],
                 outputs=[rig_glb_dl, rig_animated_dl, rig_fbx_dl, rig_status,
                          rig_model_3d, rigged_base_state, skel_glb_state],
             )

             show_skel_check.change(
@@ -852,6 +1686,103 @@ with gr.Blocks(title="Image2Model") as demo:
                 outputs=[rig_model_3d],
             )

         # ════════════════════════════════════════════════════════════════════
         with gr.Tab("Enhancement"):
             gr.Markdown("**Surface Enhancement** — bakes normal + depth maps into the GLB as PBR textures.")
@@ -868,6 +1799,7 @@ with gr.Blocks(title="Image2Model") as demo:
                     displacement_scale = gr.Slider(0.1, 3.0, value=1.0, step=0.1, label="Displacement Scale")

                     enhance_btn = gr.Button("Run Enhancement", variant="primary")

                 with gr.Column(scale=2):
                     enhance_status = gr.Textbox(label="Status", lines=5, interactive=False)
@@ -886,12 +1818,111 @@ with gr.Blocks(title="Image2Model") as demo:
                          enhanced_glb_dl, enhanced_model_3d, enhance_status],
             )

-        # ── Run All wiring ────────────────────────────────────────────────
         run_all_btn.click(
             fn=run_full_pipeline,
             inputs=[
-                input_image, num_steps, guidance, seed, face_count,
-                variant, tex_seed, enhance_face_check,
                 export_fbx_check, mdm_prompt_box, mdm_frames_slider,
             ],
             outputs=[glb_state, download_file, multiview_img,
@@ -901,6 +1932,23 @@ with gr.Blocks(title="Image2Model") as demo:
             inputs=[glb_state], outputs=[model_3d, download_file],
         )

 if __name__ == "__main__":
-    demo.launch(server_name="0.0.0.0", server_port=7860, theme=gr.themes.Soft())

 import traceback
 import json
 import random
+import threading
 from pathlib import Path

 # ── ZeroGPU: install packages that can't be built at Docker build time ─────────

 _rmbg_net = None
 _rmbg_version = None
 _last_glb_path = None
+_hyperswap_sess = None
+_gfpgan_restorer = None
+_firered_pipe = None
 _init_seed = random.randint(0, 2**31 - 1)

+_model_load_lock = threading.Lock()
+
 ARCFACE_256 = (np.array([[38.2946, 51.6963], [73.5318, 51.5014], [56.0252, 71.7366],
                          [41.5493, 92.3655], [70.7299, 92.2041]], dtype=np.float32)
                * (256 / 112) + (256 - 112 * (256 / 112)) / 2)
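Worth noting about the template above: the centering term `(256 - 112 * (256 / 112)) / 2` evaluates to exactly zero (112 · 256/112 = 256), so `ARCFACE_256` is a pure 256/112 scale of the canonical ArcFace 112×112 five-point landmarks. A quick standalone check:

```python
import numpy as np

# Canonical ArcFace 112x112 five-point landmarks (eyes, nose tip, mouth corners).
ARCFACE_112 = np.array([[38.2946, 51.6963], [73.5318, 51.5014], [56.0252, 71.7366],
                        [41.5493, 92.3655], [70.7299, 92.2041]], dtype=np.float32)

scale = 256 / 112
offset = (256 - 112 * scale) / 2        # 112 * (256/112) == 256, so this is 0.0
pts = ARCFACE_112 * scale + offset

print(offset)                            # the centering term contributes nothing
print(pts.min(), pts.max())              # all points stay inside the 256x256 crop
```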
 

 # ── Model loaders ─────────────────────────────────────────────────────────────

+def _load_rmbg():
+    """Load RMBG-2.0 from the 1038lab mirror."""
+    global _rmbg_net, _rmbg_version
+    if _rmbg_net is not None:
+        return
+    try:
+        from transformers import AutoModelForImageSegmentation
+        from torch.overrides import TorchFunctionMode
+
+        class _NoMetaMode(TorchFunctionMode):
+            """Intercept device='meta' tensor construction and redirect to CPU.
+
+            init_empty_weights() inside from_pretrained pushes a meta DeviceContext
+            ON TOP of any torch.device("cpu") wrapper, so meta wins. This mode is
+            pushed BELOW it; when the meta DeviceContext adds device='meta' and chains
+            down the stack, we see it here and flip it back to 'cpu'.
+            """
+            def __torch_function__(self, func, types, args=(), kwargs=None):
+                if kwargs is None:
+                    kwargs = {}
+                dev = kwargs.get("device")
+                if dev is not None:
+                    dev_str = dev.type if isinstance(dev, torch.device) else str(dev)
+                    if dev_str == "meta":
+                        kwargs["device"] = "cpu"
+                return func(*args, **kwargs)
+
+        # transformers 5.x _finalize_model_loading calls mark_tied_weights_as_initialized,
+        # which accesses all_tied_weights_keys. BiRefNetConfig inherits from the old
+        # PretrainedConfig alias, which skips the new PreTrainedModel.__init__ section
+        # that sets this attribute. Patch the method to be safe.
+        from transformers import PreTrainedModel as _PTM
+        _orig_mark_tied = _PTM.mark_tied_weights_as_initialized
+        def _safe_mark_tied(self, loading_info):
+            if not hasattr(self, "all_tied_weights_keys"):
+                self.all_tied_weights_keys = {}
+            return _orig_mark_tied(self, loading_info)
+        _PTM.mark_tied_weights_as_initialized = _safe_mark_tied
+        try:
+            with _NoMetaMode():
+                _rmbg_net = AutoModelForImageSegmentation.from_pretrained(
+                    "1038lab/RMBG-2.0", trust_remote_code=True, low_cpu_mem_usage=False,
+                )
+        finally:
+            _PTM.mark_tied_weights_as_initialized = _orig_mark_tied
+        _rmbg_net.to(DEVICE).eval()
+        _rmbg_version = "2.0"
+        print("RMBG-2.0 loaded.")
+    except Exception as e:
+        _rmbg_net = None
+        _rmbg_version = None
+        print(f"RMBG-2.0 failed: {e} — background removal disabled.")
+
+
+def load_rmbg_only():
+    """Load RMBG standalone without loading TripoSG."""
+    _load_rmbg()
+    return _rmbg_net
+
+
+def load_gfpgan():
+    global _gfpgan_restorer
+    if _gfpgan_restorer is not None:
+        return _gfpgan_restorer
+    try:
+        from gfpgan import GFPGANer
+        from basicsr.archs.rrdbnet_arch import RRDBNet
+        from realesrgan import RealESRGANer
+
+        model_path = str(CKPT_DIR / "GFPGANv1.4.pth")
+        if not os.path.exists(model_path):
+            print(f"[GFPGAN] Not found at {model_path}")
+            return None
+
+        realesrgan_path = str(CKPT_DIR / "RealESRGAN_x2plus.pth")
+        bg_upsampler = None
+        if os.path.exists(realesrgan_path):
+            bg_model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
+                               num_block=23, num_grow_ch=32, scale=2)
+            bg_upsampler = RealESRGANer(
+                scale=2, model_path=realesrgan_path, model=bg_model,
+                tile=400, tile_pad=10, pre_pad=0, half=True,
+            )
+            print("[GFPGAN] RealESRGAN x2plus bg_upsampler loaded")
+        else:
+            print("[GFPGAN] RealESRGAN_x2plus.pth not found, running without upsampler")
+
+        _gfpgan_restorer = GFPGANer(
+            model_path=model_path, upscale=2, arch="clean",
+            channel_multiplier=2, bg_upsampler=bg_upsampler,
+        )
+        print("[GFPGAN] Loaded GFPGANv1.4 (upscale=2 + RealESRGAN bg_upsampler)")
+        return _gfpgan_restorer
+    except Exception as e:
+        print(f"[GFPGAN] Load failed: {e}")
+        return None
+
+
 def load_triposg():
     global _triposg_pipe, _rmbg_net, _rmbg_version
     if _triposg_pipe is not None:
 
     return _triposg_pipe, _rmbg_net


+def load_firered():
+    """Lazy-load the FireRed image-edit pipeline using a GGUF-quantized transformer.
+
+    Transformer: loaded from GGUF via from_single_file (Q4_K_M, ~12 GB on disk).
+    Tries Arunk25/Qwen-Image-Edit-Rapid-AIO-GGUF first (fine-tuned, merged model);
+    falls back to unsloth/Qwen-Image-Edit-2511-GGUF (base model) if key mapping fails.
+
+    text_encoder: 4-bit NF4 on GPU (~5.6 GB).
+    GGUF transformer: dequantized on the fly, dispatched with an 18 GiB GPU budget.
+    Lightning scheduler: 4 steps, CFG 1.0 → ~1-2 min per inference.
+
+    GPU budget: ~18 GB transformer + ~5.6 GB text_encoder + ~0.3 GB VAE ≈ 24 GB.
+    """
+    global _firered_pipe
+    if _firered_pipe is not None:
+        return _firered_pipe
+
+    import math as _math
+    from diffusers import QwenImageEditPlusPipeline, FlowMatchEulerDiscreteScheduler, GGUFQuantizationConfig
+    from diffusers.models import QwenImageTransformer2DModel
+    from transformers import BitsAndBytesConfig, Qwen2_5_VLForConditionalGeneration
+    from accelerate import dispatch_model, infer_auto_device_map
+    from huggingface_hub import hf_hub_download
+
+    # Patch SDPA to cast K/V to match the Q dtype.
+    import torch.nn.functional as _F
+    _orig_sdpa = _F.scaled_dot_product_attention
+    def _dtype_safe_sdpa(query, key, value, *a, **kw):
+        if key.dtype != query.dtype:
+            key = key.to(query.dtype)
+        if value.dtype != query.dtype:
+            value = value.to(query.dtype)
+        return _orig_sdpa(query, key, value, *a, **kw)
+    _F.scaled_dot_product_attention = _dtype_safe_sdpa
+
+    torch.cuda.empty_cache()
+
+    # Load RMBG NOW — before dispatch_model creates meta tensors that poison later loads
+    _load_rmbg()
+
+    gguf_config = GGUFQuantizationConfig(compute_dtype=torch.bfloat16)
+
+    # ── Transformer: GGUF Q4_K_M — try the fine-tuned Rapid-AIO first, fall back to base ──
+    transformer = None
+
+    # Attempt 1: Arunk25 Rapid-AIO GGUF (fine-tuned, fully merged, ~12.4 GB)
+    try:
+        print("[FireRed] Downloading Arunk25/Qwen-Image-Edit-Rapid-AIO-GGUF Q4_K_M (~12 GB)...")
+        gguf_path = hf_hub_download(
+            repo_id="Arunk25/Qwen-Image-Edit-Rapid-AIO-GGUF",
+            filename="v23/Qwen-Rapid-AIO-NSFW-v23-Q4_K_M.gguf",
+        )
+        print("[FireRed] Loading Rapid-AIO transformer from GGUF...")
+        transformer = QwenImageTransformer2DModel.from_single_file(
+            gguf_path,
+            quantization_config=gguf_config,
+            torch_dtype=torch.bfloat16,
+            config="Qwen/Qwen-Image-Edit-2511",
+            subfolder="transformer",
+        )
+        print("[FireRed] Rapid-AIO GGUF transformer loaded OK.")
+    except Exception as e:
+        print(f"[FireRed] Rapid-AIO GGUF failed ({e}), falling back to unsloth base GGUF...")
+        transformer = None
+
+    # Attempt 2: unsloth base GGUF Q4_K_M (~12.3 GB)
+    if transformer is None:
+        print("[FireRed] Downloading unsloth/Qwen-Image-Edit-2511-GGUF Q4_K_M (~12 GB)...")
+        gguf_path = hf_hub_download(
+            repo_id="unsloth/Qwen-Image-Edit-2511-GGUF",
+            filename="qwen-image-edit-2511-Q4_K_M.gguf",
+        )
+        print("[FireRed] Loading base transformer from GGUF...")
+        transformer = QwenImageTransformer2DModel.from_single_file(
+            gguf_path,
+            quantization_config=gguf_config,
+            torch_dtype=torch.bfloat16,
+            config="Qwen/Qwen-Image-Edit-2511",
+            subfolder="transformer",
+        )
+        print("[FireRed] Base GGUF transformer loaded OK.")
+
+    print("[FireRed] Dispatching transformer (18 GiB GPU, rest CPU)...")
+    device_map = infer_auto_device_map(
+        transformer,
+        max_memory={0: "18GiB", "cpu": "90GiB"},
+        dtype=torch.bfloat16,
+    )
+    n_gpu = sum(1 for d in device_map.values() if str(d) in ("0", "cuda", "cuda:0"))
+    n_cpu = sum(1 for d in device_map.values() if str(d) == "cpu")
+    print(f"[FireRed] Device map: {n_gpu} modules on GPU, {n_cpu} on CPU")
+    transformer = dispatch_model(transformer, device_map=device_map)
+    used_mb = torch.cuda.memory_allocated() // (1024 ** 2)
+    print(f"[FireRed] Transformer dispatched — VRAM: {used_mb} MB")
+
+    # ── text_encoder: 4-bit NF4 on GPU (~5.6 GB) ──────────────────────────────
+    bnb_enc = BitsAndBytesConfig(
+        load_in_4bit=True,
+        bnb_4bit_quant_type="nf4",
+        bnb_4bit_compute_dtype=torch.bfloat16,
+        bnb_4bit_use_double_quant=True,
+    )
+    print("[FireRed] Loading text_encoder (4-bit NF4)...")
+    text_encoder = Qwen2_5_VLForConditionalGeneration.from_pretrained(
+        "Qwen/Qwen-Image-Edit-2511",
+        subfolder="text_encoder",
+        quantization_config=bnb_enc,
+        device_map="auto",
+    )
+    used_mb = torch.cuda.memory_allocated() // (1024 ** 2)
+    print(f"[FireRed] Text encoder loaded — VRAM: {used_mb} MB")
+
+    # ── Pipeline: VAE + scheduler + processor + tokenizer ─────────────────────
+    print("[FireRed] Loading pipeline...")
+    _firered_pipe = QwenImageEditPlusPipeline.from_pretrained(
+        "Qwen/Qwen-Image-Edit-2511",
+        transformer=transformer,
+        text_encoder=text_encoder,
+        torch_dtype=torch.bfloat16,
+    )
+    _firered_pipe.vae.to(DEVICE)
+
+    # Lightning scheduler — 4 steps, use_dynamic_shifting, matches the reference Space config
+    _firered_pipe.scheduler = FlowMatchEulerDiscreteScheduler.from_config({
+        "base_image_seq_len": 256,
+        "base_shift": _math.log(3),
+        "max_image_seq_len": 8192,
+        "max_shift": _math.log(3),
+        "num_train_timesteps": 1000,
+        "shift": 1.0,
+        "time_shift_type": "exponential",
+        "use_dynamic_shifting": True,
+    })
+
+    used_mb = torch.cuda.memory_allocated() // (1024 ** 2)
+    print(f"[FireRed] Pipeline ready — total VRAM: {used_mb} MB")
+    return _firered_pipe
+
+
+
+def _gallery_to_pil_list(gallery_value):
+    """Convert a Gradio Gallery value (items in several possible formats) to a list of PIL Images."""
+    pil_images = []
+    if not gallery_value:
+        return pil_images
+    for item in gallery_value:
+        try:
+            if isinstance(item, np.ndarray):
+                pil_images.append(Image.fromarray(item).convert("RGB"))
+                continue
+            if isinstance(item, Image.Image):
+                pil_images.append(item.convert("RGB"))
+                continue
+            # Gradio 6 Gallery returns dicts: {"image": FileData, "caption": ...}
+            if isinstance(item, dict):
+                img_data = item.get("image") or item
+                if isinstance(img_data, dict):
+                    path = img_data.get("path") or img_data.get("url") or img_data.get("name")
+                else:
+                    path = img_data
+            elif isinstance(item, (list, tuple)):
+                path = item[0]
+            else:
+                path = item
+            if path and os.path.exists(str(path)):
+                pil_images.append(Image.open(str(path)).convert("RGB"))
+        except Exception as e:
+            print(f"[FireRed] Could not load gallery image: {e}")
+    return pil_images
+
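The helper above has to cope with the several shapes a Gradio Gallery value can take (ndarray, PIL image, FileData dict, tuple, plain path). The dict/tuple unwrapping can be exercised in isolation — the payloads below are invented for illustration:

```python
def gallery_item_path(item):
    """Resolve a filesystem path from the non-image shapes a Gradio Gallery may return."""
    if isinstance(item, dict):
        # Gradio 6 style: {"image": {"path": ...}, "caption": ...}; older styles are flat
        img_data = item.get("image") or item
        if isinstance(img_data, dict):
            return img_data.get("path") or img_data.get("url") or img_data.get("name")
        return img_data
    if isinstance(item, (list, tuple)):
        return item[0]          # (path, caption) pair
    return item                 # already a plain path

print(gallery_item_path({"image": {"path": "/tmp/a.png"}, "caption": None}))  # /tmp/a.png
print(gallery_item_path(("/tmp/b.png", "cap")))                               # /tmp/b.png
print(gallery_item_path("/tmp/c.png"))                                        # /tmp/c.png
```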
+
+def _firered_resize(img):
+    """Resize to max 1024 px on the long side, preserving aspect ratio; align both dims to a multiple of 8."""
+    w, h = img.size
+    if max(w, h) > 1024:
+        if w > h:
+            nw, nh = 1024, int(1024 * h / w)
+        else:
+            nw, nh = int(1024 * w / h), 1024
+    else:
+        nw, nh = w, h
+    nw, nh = max(8, (nw // 8) * 8), max(8, (nh // 8) * 8)
+    if (nw, nh) != (w, h):
+        img = img.resize((nw, nh), Image.LANCZOS)
+    return img
+
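The sizing rule above (cap the long side at 1024, then snap both dimensions down to a multiple of 8) can be checked without PIL by extracting just the arithmetic:

```python
def firered_dims(w, h, max_side=1024):
    """Compute the (width, height) that _firered_resize would target."""
    if max(w, h) > max_side:
        if w > h:
            nw, nh = max_side, int(max_side * h / w)
        else:
            nw, nh = int(max_side * w / h), max_side
    else:
        nw, nh = w, h
    # Snap down to a multiple of 8 (never below 8)
    return max(8, (nw // 8) * 8), max(8, (nh // 8) * 8)

print(firered_dims(1920, 1080))  # (1024, 576)
print(firered_dims(1000, 700))   # (1000, 696) — only the 8-alignment applies
```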
+
+
+_FIRERED_NEGATIVE = (
+    "worst quality, low quality, bad anatomy, bad hands, text, error, "
+    "missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, "
+    "signature, watermark, username, blurry"
+)
+
+
 # ── Background removal helper ─────────────────────────────────────────────────

 def _remove_bg_rmbg(img_pil, threshold=0.5, erode_px=2):
 

     rgb = np.array(img_pil.convert("RGB"), dtype=np.float32) / 255.0
     alpha = mask[:, :, np.newaxis]
+    comp = (rgb * alpha + 0.5 * (1.0 - alpha)) * 255
+    return Image.fromarray(comp.clip(0, 255).astype(np.uint8))
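The two added lines composite the cut-out over a mid-gray background instead of keeping an alpha channel: per pixel, out = rgb·α + 0.5·(1−α), scaled back to 0-255. A scalar sanity check of the formula:

```python
import numpy as np

rgb = np.array([[[1.0, 1.0, 1.0]]], dtype=np.float32)    # one white foreground pixel
alpha = np.array([[[0.0]]], dtype=np.float32)            # alpha 0 = pure background
comp = ((rgb * alpha + 0.5 * (1.0 - alpha)) * 255).clip(0, 255).astype(np.uint8)
print(comp.ravel())   # background pixels become mid-gray (255 * 0.5, truncated to 127)
```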


 def preview_rembg(input_image, do_remove_bg, threshold, erode_px):

         return input_image


+# ── RealESRGAN helpers ─────────────────────────────────────────────────────────
+
+def _load_realesrgan(scale: int = 4):
+    """Load a RealESRGAN upsampler. Returns a RealESRGANer or None."""
+    try:
+        from basicsr.archs.rrdbnet_arch import RRDBNet
+        from realesrgan import RealESRGANer
+        if scale == 4:
+            model_path = str(CKPT_DIR / "RealESRGAN_x4plus.pth")
+            model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
+        else:
+            model_path = str(CKPT_DIR / "RealESRGAN_x2plus.pth")
+            model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=2)
+        if not os.path.exists(model_path):
+            print(f"[RealESRGAN] {model_path} not found")
+            return None
+        upsampler = RealESRGANer(
+            scale=scale, model_path=model_path, model=model,
+            tile=512, tile_pad=32, pre_pad=0, half=True,
+        )
+        print(f"[RealESRGAN] Loaded x{scale}plus")
+        return upsampler
+    except Exception as e:
+        print(f"[RealESRGAN] Load failed: {e}")
+        return None
+
+
+def _enhance_glb_texture(glb_path: str) -> bool:
+    """
+    Extract the base-color UV texture atlas from a GLB, upscale it with RealESRGAN x4,
+    downscale back to the original resolution (sharper detail), then repack in place.
+    Returns True if enhancement was applied.
+    """
+    import pygltflib
+
+    upsampler = _load_realesrgan(scale=4)
+    if upsampler is None:
+        upsampler = _load_realesrgan(scale=2)
+    if upsampler is None:
+        print("[enhance_glb] No RealESRGAN checkpoint available")
+        return False
+
+    glb = pygltflib.GLTF2().load(glb_path)
+    blob = bytearray(glb.binary_blob() or b"")
+
+    for mat in glb.materials:
+        bct = getattr(mat.pbrMetallicRoughness, "baseColorTexture", None)
+        if bct is None:
+            continue
+        tex = glb.textures[bct.index]
+        if tex.source is None:
+            continue
+        img_obj = glb.images[tex.source]
+        if img_obj.bufferView is None:
+            continue
+        bv = glb.bufferViews[img_obj.bufferView]
+        offset, length = bv.byteOffset or 0, bv.byteLength
+
+        img_arr = np.frombuffer(blob[offset:offset + length], dtype=np.uint8)
+        atlas_bgr = cv2.imdecode(img_arr, cv2.IMREAD_COLOR)
+        if atlas_bgr is None:
+            continue
+
+        orig_h, orig_w = atlas_bgr.shape[:2]
+        print(f"[enhance_glb] atlas {orig_w}x{orig_h}, upscaling with RealESRGAN…")
+
+        try:
+            upscaled, _ = upsampler.enhance(atlas_bgr, outscale=4)
+        except Exception as e:
+            print(f"[enhance_glb] RealESRGAN enhance failed: {e}")
+            continue
+
+        restored = cv2.resize(upscaled, (orig_w, orig_h), interpolation=cv2.INTER_LANCZOS4)
+
+        ok, new_bytes = cv2.imencode(".png", restored)
+        if not ok:
+            continue
+        new_bytes = new_bytes.tobytes()
+        new_len = len(new_bytes)
+
+        if new_len > length:
+            # New PNG is larger: splice it in and shift every later bufferView by the delta
+            before = bytes(blob[:offset])
+            after = bytes(blob[offset + length:])
+            blob = bytearray(before + new_bytes + after)
+            delta = new_len - length
+            bv.byteLength = new_len
+            for other_bv in glb.bufferViews:
+                if (other_bv.byteOffset or 0) > offset:
+                    other_bv.byteOffset += delta
+            glb.buffers[0].byteLength += delta
+        else:
+            # New PNG fits in the old slot: overwrite in place, shrink the view
+            blob[offset:offset + new_len] = new_bytes
+            bv.byteLength = new_len
+
+        glb.set_binary_blob(bytes(blob))
+        glb.save(glb_path)
+        print(f"[enhance_glb] GLB texture enhanced OK (was {length}B → {new_len}B)")
+        return True
+
+    print("[enhance_glb] No base-color texture found in GLB")
+    return False
+
+
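When the re-encoded PNG outgrows its slot, the function above splices the new bytes in and shifts every later bufferView by the growth delta. The offset bookkeeping can be modelled without pygltflib — the view list below is invented:

```python
def splice_view(views, idx, new_len):
    """views: list of [offset, length]. Grow view idx and shift later offsets, as the GLB repack does."""
    offset, length = views[idx]
    delta = new_len - length
    views[idx][1] = new_len
    if delta > 0:
        # Only views that start after the edited one move; earlier views are untouched
        for v in views:
            if v[0] > offset:
                v[0] += delta
    return views

views = [[0, 100], [100, 50], [150, 30]]
print(splice_view(views, 1, 80))   # [[0, 100], [100, 80], [180, 30]]
```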
+# ── FireRed GPU functions ──────────────────────────────────────────────────────
+
+@spaces.GPU(duration=600)
+def firered_generate(gallery_images, prompt, seed, randomize_seed, guidance_scale, steps, progress=gr.Progress()):
+    """Run FireRed image-edit inference on one or more reference images (max 3 natively)."""
+    pil_images = _gallery_to_pil_list(gallery_images)
+    if not pil_images:
+        return None, int(seed), "Please upload at least one image."
+    if not prompt or not prompt.strip():
+        return None, int(seed), "Please enter an edit prompt."
+    try:
+        import gc
+        progress(0.05, desc="Loading FireRed pipeline...")
+        pipe = load_firered()
+
+        if randomize_seed:
+            seed = random.randint(0, 2**31 - 1)
+
+        # FireRed natively handles 1-3 images; cap silently and warn
+        orig_n = len(pil_images)
+        if orig_n > 3:
+            print(f"[FireRed] {orig_n} images given, truncating to 3 (native limit).")
+            pil_images = pil_images[:3]
+
+        # Resize to max 1024px and align to a multiple of 8 (prevents padding bars)
+        pil_images = [_firered_resize(img) for img in pil_images]
+        height, width = pil_images[0].height, pil_images[0].width
+        print(f"[FireRed] Input size after resize: {width}x{height}")
+
+        generator = torch.Generator(device=DEVICE).manual_seed(int(seed))
+
+        progress(0.4, desc=f"Running FireRed edit ({len(pil_images)} image(s))...")
+        with torch.inference_mode():
+            result = pipe(
+                image=pil_images,
+                prompt=prompt.strip(),
+                negative_prompt=_FIRERED_NEGATIVE,
+                num_inference_steps=int(steps),
+                generator=generator,
+                true_cfg_scale=float(guidance_scale),
+                num_images_per_prompt=1,
+                height=height,
+                width=width,
+            ).images[0]
+
+        gc.collect()
+        torch.cuda.empty_cache()
+        progress(1.0, desc="Done!")
+        n = len(pil_images)
+        note = " (truncated to 3)" if orig_n > 3 else ""
+        return np.array(result), int(seed), f"Preview ready — {n} image(s) used{note}."
+    except Exception:
+        return None, int(seed), f"FireRed error:\n{traceback.format_exc()}"
+
+
+@spaces.GPU(duration=60)
+def firered_load_into_pipeline(firered_output, threshold, erode_px, progress=gr.Progress()):
+    """Load a FireRed output into the main pipeline with automatic background removal."""
+    if firered_output is None:
+        return None, None, "No FireRed output — generate an image first."
+    try:
+        progress(0.1, desc="Loading RMBG model...")
+        load_rmbg_only()
+
+        img = Image.fromarray(firered_output).convert("RGB")
+        if _rmbg_net is not None:
+            progress(0.5, desc="Removing background...")
+            composited = _remove_bg_rmbg(img, threshold=float(threshold), erode_px=int(erode_px))
+            result = np.array(composited)
+            msg = "Loaded into pipeline — background removed."
+        else:
+            result = firered_output
+            msg = "Loaded into pipeline (RMBG unavailable — background not removed)."
+
+        progress(1.0, desc="Done!")
+        return result, result, msg
+    except Exception:
+        return None, None, f"Error:\n{traceback.format_exc()}"
+
+
837
  # ── Stage 1: Shape generation ─────────────────────────────────────────────────
838
 
839
  @spaces.GPU(duration=180)
 
842
  if input_image is None:
843
  return None, "Please upload an image."
844
  try:
845
+ progress(0.05, desc="Freeing VRAM from FireRed (if loaded)...")
846
+ global _firered_pipe
847
+ if _firered_pipe is not None:
848
+ # dispatch_model attaches accelerate hooks — remove them before .to("cpu")
849
+ try:
850
+ from accelerate.hooks import remove_hook_from_submodules
851
+ remove_hook_from_submodules(_firered_pipe.transformer)
852
+ _firered_pipe.transformer.to("cpu")
853
+ except Exception as _e:
854
+ print(f"[TripoSG] Transformer CPU offload: {_e}")
855
+ try:
856
+ _firered_pipe.text_encoder.to("cpu")
857
+ except Exception:
858
+ pass
859
+ try:
860
+ _firered_pipe.vae.to("cpu")
861
+ except Exception:
862
+ pass
863
+ _firered_pipe = None
864
+ torch.cuda.empty_cache()
865
+ print("[TripoSG] FireRed offloaded — VRAM freed for shape generation.")
866
+
867
  progress(0.1, desc="Loading TripoSG...")
868
  pipe, rmbg_net = load_triposg()
869
 
 
         img.save(img_path)

         progress(0.5, desc="Generating shape (SDF diffusion)...")
+        with torch.autocast(device_type="cuda", dtype=torch.float16):
+            from scripts.inference_triposg import run_triposg
+            mesh = run_triposg(
+                pipe=pipe,
+                image_input=img_path,
+                rmbg_net=rmbg_net if remove_background else None,
+                seed=int(seed),
+                num_inference_steps=int(num_steps),
+                guidance_scale=float(guidance_scale),
+                faces=int(face_count) if int(face_count) > 0 else -1,
+            )

         out_path = "/tmp/triposg_shape.glb"
         mesh.export(out_path)

         animated = mdm_result.get("animated_glb")

         parts = ["Rigged: " + os.path.basename(rigged)]
+        if fbx: parts.append("FBX: " + os.path.basename(fbx))
         if animated: parts.append("Animation: " + os.path.basename(animated))

         torch.cuda.empty_cache()

     from pipeline.enhance_surface import (
         run_stable_normal, run_depth_anything,
         bake_normal_into_glb, bake_depth_as_occlusion,
+        unload_models,
     )
     import pipeline.enhance_surface as _enh_mod

         return []

 
+# ── HyperSwap views ───────────────────────────────────────────────────────────
+
+@spaces.GPU(duration=120)
+def hyperswap_views(embedding_json: str):
+    """
+    Stage 6 — run HyperSwap on the last rendered views.
+    embedding_json: JSON string of the 512-d ArcFace embedding list.
+    Returns a gallery of (swapped_image_path, view_name) tuples.
+    """
+    global _hyperswap_sess
+    try:
+        import onnxruntime as ort
+        from insightface.app import FaceAnalysis
+
+        embedding = np.array(json.loads(embedding_json), dtype=np.float32)
+        embedding /= np.linalg.norm(embedding)
+
+        # Load HyperSwap once
+        if _hyperswap_sess is None:
+            hs_path = str(CKPT_DIR / "hyperswap_1a_256.onnx")
+            _hyperswap_sess = ort.InferenceSession(hs_path, providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
+            print(f"[hyperswap_views] Loaded {hs_path}")
+
+        app = FaceAnalysis(name="buffalo_l", providers=["CPUExecutionProvider"])
+        app.prepare(ctx_id=0, det_size=(640, 640), det_thresh=0.1)
+
+        results = []
+        for view_path, name in zip(VIEW_PATHS, VIEW_NAMES):
+            if not os.path.exists(view_path):
+                print(f"[hyperswap_views] Missing {view_path}, skipping")
+                continue
+
+            bgr = cv2.imread(view_path)
+            faces = app.get(bgr)
+            if not faces:
+                print(f"[hyperswap_views] {name}: no face detected")
+                out_path = view_path  # return the original view
+            else:
+                face = faces[0]
+                M, _ = cv2.estimateAffinePartial2D(face.kps, ARCFACE_256,
+                                                   method=cv2.RANSAC, ransacReprojThreshold=100)
+                H, W = bgr.shape[:2]
+                aligned = cv2.warpAffine(bgr, M, (256, 256), flags=cv2.INTER_LINEAR)
+                t = ((aligned.astype(np.float32) / 255 - 0.5) / 0.5)[:, :, ::-1].copy().transpose(2, 0, 1)[None]
+                out, mask = _hyperswap_sess.run(None, {
+                    "source": embedding.reshape(1, -1),
+                    "target": t,
+                })
+                out_bgr = (((out[0].transpose(1, 2, 0) + 1) / 2 * 255)
+                           .clip(0, 255).astype(np.uint8))[:, :, ::-1].copy()
+                m = (mask[0, 0] * 255).clip(0, 255).astype(np.uint8)
+                Mi = cv2.invertAffineTransform(M)
+                of = cv2.warpAffine(out_bgr, Mi, (W, H), flags=cv2.INTER_LINEAR)
+                mf = cv2.warpAffine(m, Mi, (W, H), flags=cv2.INTER_LINEAR).astype(np.float32)[:, :, None] / 255
+                swapped = (of * mf + bgr * (1 - mf)).clip(0, 255).astype(np.uint8)
+
+                # GFPGAN face restoration
+                restorer = load_gfpgan()
+                if restorer is not None:
+                    b = face.bbox.astype(int)
+                    h2, w2 = swapped.shape[:2]
+                    pad = 0.35
+                    bw2, bh2 = b[2] - b[0], b[3] - b[1]
+                    cx1 = max(0, b[0] - int(bw2 * pad)); cy1 = max(0, b[1] - int(bh2 * pad))
+                    cx2 = min(w2, b[2] + int(bw2 * pad)); cy2 = min(h2, b[3] + int(bh2 * pad))
+                    crop = swapped[cy1:cy2, cx1:cx2]
+                    try:
+                        _, _, rest = restorer.enhance(
+                            crop, has_aligned=False, only_center_face=True,
+                            paste_back=True, weight=0.5)
+                        if rest is not None:
+                            ch, cw = cy2 - cy1, cx2 - cx1
+                            if rest.shape[:2] != (ch, cw):
+                                rest = cv2.resize(rest, (cw, ch), interpolation=cv2.INTER_LANCZOS4)
+                            swapped[cy1:cy2, cx1:cx2] = rest
+                    except Exception as _ge:
+                        print(f"[hyperswap_views] GFPGAN failed: {_ge}")
+
+                out_path = view_path.replace("render_", "swapped_")
+                cv2.imwrite(out_path, swapped)
+                print(f"[hyperswap_views] {name}: swapped+restored OK -> {out_path}")
+
+            results.append((out_path, name))
+
+        return results
+    except Exception:
+        err = traceback.format_exc()
+        print(f"hyperswap_views FAILED:\n{err}")
+        return []
+
+
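The paste-back above relies on `cv2.invertAffineTransform`; the same inverse can be computed with plain numpy by promoting the 2×3 matrix to 3×3, which makes the warp round-trip easy to sanity-check:

```python
import numpy as np

def invert_affine(M):
    """Invert a 2x3 affine matrix (the shape cv2.warpAffine expects)."""
    A = np.vstack([M, [0.0, 0.0, 1.0]])   # promote to a full 3x3 homogeneous matrix
    return np.linalg.inv(A)[:2]           # drop the homogeneous row again

M = np.array([[2.0, 0.0, 10.0],
              [0.0, 2.0, -4.0]])          # scale 2x, translate (10, -4)
Mi = invert_affine(M)
p = np.array([3.0, 5.0, 1.0])             # point (3, 5) in homogeneous form
q = M @ p                                 # forward warp -> (16, 6)
print(Mi @ np.array([q[0], q[1], 1.0]))   # inverse warp recovers (3, 5)
```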
+# ── Animate tab functions ─────────────────────────────────────────────────────
+
+def gradio_search_motions(query: str, progress=gr.Progress()):
+    """Stream TeoGchx/HumanML3D and return matching motions as radio choices."""
+    if not query.strip():
+        return (
+            gr.update(choices=[], visible=False),
+            [],
+            "Enter a motion description and click Search.",
+        )
+    try:
+        progress(0.1, desc="Connecting to HumanML3D dataset…")
+        sys.path.insert(0, str(HERE))
+        from Retarget.search import search_motions, format_choice_label
+        progress(0.3, desc="Streaming dataset…")
+        results = search_motions(query, top_k=8)
+        progress(1.0)
+        if not results:
+            return (
+                gr.update(choices=["No matches — try different keywords"], visible=True),
+                [],
+                f"No motions matched '{query}'. Try broader terms.",
+            )
+        choices = [format_choice_label(r) for r in results]
+        status = f"Found {len(results)} motions matching '{query}'"
+        return (
+            gr.update(choices=choices, value=choices[0], visible=True),
+            results,
+            status,
+        )
+    except Exception:
+        return (
+            gr.update(choices=[], visible=False),
+            [],
+            f"Search error:\n{traceback.format_exc()}",
+        )
+
+
+@spaces.GPU(duration=180)
+def gradio_animate(
+    rigged_glb_path,
+    selected_label: str,
+    motion_results: list,
+    fps: int,
+    max_frames: int,
+    progress=gr.Progress(),
+):
+    """Bake the selected HumanML3D motion onto the rigged GLB."""
+    try:
+        glb = rigged_glb_path or "/tmp/rig_out/rigged.glb"
+        if not os.path.exists(glb):
+            return None, "No rigged GLB — run the Rig step first.", None
+
+        if not motion_results or not selected_label:
+            return None, "No motion selected — run Search first.", None
+
+        # Resolve which result was selected
+        sys.path.insert(0, str(HERE))
+        from Retarget.search import format_choice_label
+        idx = 0
+        for i, r in enumerate(motion_results):
+            if format_choice_label(r) == selected_label:
+                idx = i
+                break
+
+        chosen = motion_results[idx]
+        motion = chosen["motion"]  # np.ndarray [T, 263]
+        caption = chosen["caption"]
+        T_total = motion.shape[0]
+        n_frames = min(max_frames, T_total) if max_frames > 0 else T_total
+
+        progress(0.2, desc="Parsing skeleton…")
+        from Retarget.animate import animate_glb_from_hml3d
+
+        out_path = "/tmp/animated_out/animated.glb"
+        os.makedirs("/tmp/animated_out", exist_ok=True)
+
+        progress(0.4, desc="Mapping bones to SMPL joints…")
+        animated = animate_glb_from_hml3d(
+            motion=motion,
+            rigged_glb=glb,
+            output_glb=out_path,
+            fps=int(fps),
+            num_frames=int(n_frames),
+        )
+        progress(1.0, desc="Done!")
+        status = (
+            f"Animated: {n_frames} frames @ {fps} fps\n"
+            f"Motion: {caption[:120]}"
+        )
+        return animated, status, animated
+
+    except Exception:
+        return None, f"Error:\n{traceback.format_exc()}", None
+
1394
+
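The label-to-result resolution in `gradio_animate` above (match the radio selection against formatted labels, fall back to the first hit) can be factored into a small pure helper. The name `select_motion` is illustrative, not part of the codebase; `label_fn` stands in for `Retarget.search.format_choice_label`:

```python
def select_motion(results, selected_label, label_fn):
    """Return the search result whose formatted label matches the radio choice.

    Mirrors the loop in gradio_animate: if nothing matches, fall back to the
    first result rather than failing.
    """
    for result in results:
        if label_fn(result) == selected_label:
            return result
    return results[0]
```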
1395
+ # ── PSHuman Face Transplant ────────────────────────────────────────────────────
1396
+
1397
+ def gradio_pshuman_face(
1398
+ input_image,
1399
+ rigged_glb_path,
1400
+ weight_threshold: float,
1401
+ retract_mm: float,
1402
+ pshuman_url: str,
1403
+ progress=gr.Progress(),
1404
+ ):
1405
+ """
1406
+ Full PSHuman face transplant pipeline:
1407
+ 1. Run PSHuman on input_image → colored OBJ face mesh
1408
+ 2. Run face_transplant.py → stitch face into rigged GLB
1409
+ 3. Return the combined GLB
1410
+
1411
+ PSHuman runs as a remote service (pshuman_url). On ZeroGPU, the service_url
1412
+ must point to an externally deployed PSHuman endpoint (the PSHUMAN_URL env var
1413
+ or a user-provided URL in the UI). A localhost URL will not work on ZeroGPU.
1414
+ """
1415
+ try:
1416
+ if input_image is None:
1417
+ return None, "Upload a portrait image first.", None
1418
+ rigged = rigged_glb_path
1419
+ if not rigged or not os.path.exists(str(rigged)):
1420
+ return None, "No rigged GLB found — run the Rig step first.", None
1421
+
1422
+ work_dir = tempfile.mkdtemp(prefix="pshuman_transplant_")
1423
+ img_path = os.path.join(work_dir, "portrait.png")
1424
+ if isinstance(input_image, np.ndarray):
1425
+ Image.fromarray(input_image).save(img_path)
1426
+ else:
1427
+ input_image.save(img_path)
1428
+
1429
+ # pipeline/ is already in sys.path via PIPELINE_DIR insertion at startup
1430
+ # ── Step 1: PSHuman inference ──────────────────────────────────────────
1431
+ progress(0.05, desc="Step 1/2: Running PSHuman (generates multi-view face)...")
1432
+ from pipeline.pshuman_client import generate_pshuman_mesh
1433
+ face_obj = os.path.join(work_dir, "pshuman_face.obj")
1434
+ generate_pshuman_mesh(
1435
+ image_path=img_path,
1436
+ output_path=face_obj,
1437
+ service_url=pshuman_url.strip() or "http://localhost:7862",
1438
+ )
1439
+
1440
+ # ── Step 2: Face transplant ────────────────────────────────────────────
1441
+ progress(0.7, desc="Step 2/2: Stitching PSHuman face into rigged GLB...")
1442
+ out_glb = os.path.join(work_dir, "rigged_pshuman_face.glb")
1443
+
1444
+ from pipeline.face_transplant import transplant_face
1445
+ transplant_face(
1446
+ body_glb_path=str(rigged),
1447
+ pshuman_mesh_path=face_obj,
1448
+ output_path=out_glb,
1449
+ weight_threshold=float(weight_threshold),
1450
+ retract_amount=float(retract_mm) / 1000.0,  # mm → metres
1451
+ )
1452
+
1453
+ progress(1.0, desc="Done!")
1454
+ return out_glb, "PSHuman face transplant complete.", out_glb
1455
+
1456
+ except Exception:
1457
+ return None, f"Error:\n{traceback.format_exc()}", None
1458
+
1459
+
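The service-URL fallback used in `gradio_pshuman_face` (UI textbox value, else a localhost default) can be sketched as a standalone helper. This is a hypothetical refactor: the name `resolve_service_url` does not exist in the codebase, and adding `PSHUMAN_URL` as a middle fallback is an assumption based on the UI default shown below:

```python
import os

def resolve_service_url(user_url, default="http://localhost:7862"):
    """Pick the PSHuman endpoint: UI value, then PSHUMAN_URL env var, then default.

    Hypothetical helper mirroring `pshuman_url.strip() or <default>` in the
    handler above; trailing slashes are trimmed for consistent URL joining.
    """
    for candidate in (user_url, os.environ.get("PSHUMAN_URL")):
        if candidate and candidate.strip():
            return candidate.strip().rstrip("/")
    return default
```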
1460
  # ── Full pipeline ─────────────────────────────────────────────────────────────
1461
 
1462
+ def run_full_pipeline(input_image, remove_background, num_steps, guidance, seed, face_count,
1463
+ variant, tex_seed, enhance_face, rembg_threshold, rembg_erode,
1464
  export_fbx, mdm_prompt, mdm_n_frames, progress=gr.Progress()):
1465
  progress(0.0, desc="Stage 1/3: Generating shape...")
1466
+ glb, status = generate_shape(input_image, remove_background, num_steps, guidance, seed, face_count)
1467
  if not glb:
1468
  return None, None, None, None, None, None, status
1469
 
1470
  progress(0.33, desc="Stage 2/3: Applying texture...")
1471
+ glb, mv_img, status = apply_texture(glb, input_image, remove_background, variant, tex_seed,
1472
+ enhance_face, rembg_threshold, rembg_erode)
1473
  if not glb:
1474
  return None, None, None, None, None, None, status
1475
 
 
1481
 
1482
 
1483
  # ── UI ────────────────────────────────────────────────────────────────────────
1484
+ with gr.Blocks(title="Image2Model", theme=gr.themes.Soft()) as demo:
1485
  gr.Markdown("# Image2Model — Portrait to Rigged 3D Mesh")
1486
+ glb_state = gr.State(None)
1487
+ rigged_glb_state = gr.State(None) # persists rigged GLB for Animate + PSHuman tabs
1488
+
1489
+ with gr.Tabs() as tabs:
1490
+
1491
+ # ════════════════════════════════════════════════════════════════════
1492
+ with gr.Tab("Edit", id=0):
1493
+ gr.Markdown(
1494
+ "### Image Edit — FireRed\n"
1495
+ "Upload one or more reference images, write an edit prompt, preview the result, "
1496
+ "then click **Load to Generate** to send it to the 3D pipeline."
1497
+ )
1498
+ with gr.Row():
1499
+ with gr.Column(scale=1):
1500
+ firered_gallery = gr.Gallery(
1501
+ label="Reference Images (1–3 images, drag & drop)",
1502
+ interactive=True,
1503
+ columns=3,
1504
+ height=220,
1505
+ object_fit="contain",
1506
+ )
1507
+ firered_prompt = gr.Textbox(
1508
+ label="Edit Prompt",
1509
+ placeholder="make the person wear a red jacket",
1510
+ lines=2,
1511
+ )
1512
+ with gr.Row():
1513
+ firered_seed = gr.Number(value=_init_seed, label="Seed", precision=0)
1514
+ firered_rand = gr.Checkbox(label="Random Seed", value=True)
1515
+ with gr.Row():
1516
+ firered_guidance = gr.Slider(1.0, 10.0, value=1.0, step=0.5,
1517
+ label="Guidance Scale")
1518
+ firered_steps = gr.Slider(1, 40, value=4, step=1,
1519
+ label="Inference Steps")
1520
+ firered_btn = gr.Button("Generate Preview", variant="secondary")
1521
+ firered_status = gr.Textbox(label="Status", lines=2, interactive=False)
1522
 
1523
+ with gr.Column(scale=1):
1524
+ firered_output_img = gr.Image(label="FireRed Output", type="numpy",
1525
+ interactive=False)
1526
+ load_to_generate_btn = gr.Button("Load to Generate", variant="primary")
1527
 
1528
  # ════════════════════════════════════════════════════════════════════
1529
+ with gr.Tab("Generate", id=1):
1530
  with gr.Row():
1531
  with gr.Column(scale=1):
1532
+ input_image = gr.Image(label="Input Image", type="numpy")
1533
+ remove_bg_check = gr.Checkbox(label="Remove Background", value=True)
1534
+ with gr.Row():
1535
+ rembg_threshold = gr.Slider(0.1, 0.95, value=0.5, step=0.05,
1536
+ label="BG Threshold (higher = stricter)")
1537
+ rembg_erode = gr.Slider(0, 8, value=2, step=1,
1538
+ label="Edge Erode (px)")
1539
 
1540
  with gr.Accordion("Shape Settings", open=True):
1541
  num_steps = gr.Slider(20, 100, value=50, step=5, label="Inference Steps")
 
1554
  shape_btn = gr.Button("Generate Shape", variant="primary", scale=2, interactive=False)
1555
  texture_btn = gr.Button("Apply Texture", variant="secondary", scale=2)
1556
  render_btn = gr.Button("Render Views", variant="secondary", scale=1)
1557
+ run_all_btn = gr.Button("▶ Run Full Pipeline (Shape + Texture + Rig)", variant="primary", interactive=False)
1558
 
1559
  with gr.Column(scale=1):
1560
+ rembg_preview = gr.Image(label="BG Removed Preview", type="numpy",
1561
+ interactive=False)
1562
  status = gr.Textbox(label="Status", lines=3, interactive=False)
1563
  model_3d = gr.Model3D(label="3D Preview", clear_color=[0.9, 0.9, 0.9, 1.0])
1564
  download_file = gr.File(label="Download GLB")
 
1566
 
1567
  render_gallery = gr.Gallery(label="Rendered Views", columns=5, height=300)
1568
 
1569
+ # ── wiring: Generate tab ──────────────────────────────────────────
1570
+ _rembg_inputs = [input_image, remove_bg_check, rembg_threshold, rembg_erode]
1571
  _pipeline_btns = [shape_btn, run_all_btn]
1572
 
1573
  input_image.upload(
 
1579
  inputs=[], outputs=_pipeline_btns,
1580
  )
1581
 
1582
+ input_image.upload(fn=preview_rembg, inputs=_rembg_inputs, outputs=[rembg_preview])
1583
+ remove_bg_check.change(fn=preview_rembg, inputs=_rembg_inputs, outputs=[rembg_preview])
1584
+ rembg_threshold.release(fn=preview_rembg, inputs=_rembg_inputs, outputs=[rembg_preview])
1585
+ rembg_erode.release(fn=preview_rembg, inputs=_rembg_inputs, outputs=[rembg_preview])
1586
+
1587
  shape_btn.click(
1588
+ fn=generate_shape,
1589
+ inputs=[input_image, remove_bg_check, num_steps, guidance, seed, face_count],
1590
  outputs=[glb_state, status],
1591
  ).then(
1592
  fn=lambda p: (p, p) if p else (None, None),
 
1594
  )
1595
 
1596
  texture_btn.click(
1597
+ fn=apply_texture,
1598
+ inputs=[glb_state, input_image, remove_bg_check, variant, tex_seed,
1599
+ enhance_face_check, rembg_threshold, rembg_erode],
1600
  outputs=[glb_state, multiview_img, status],
1601
  ).then(
1602
  fn=lambda p: (p, p) if p else (None, None),
 
1605
 
1606
  render_btn.click(fn=render_views, inputs=[download_file], outputs=[render_gallery])
1607
 
1608
+ # ── Edit tab wiring (after Generate so all components are defined) ──
1609
+ firered_btn.click(
1610
+ fn=firered_generate,
1611
+ inputs=[firered_gallery, firered_prompt, firered_seed, firered_rand,
1612
+ firered_guidance, firered_steps],
1613
+ outputs=[firered_output_img, firered_seed, firered_status],
1614
+ api_name="firered_generate",
1615
+ )
1616
+
1617
+ load_to_generate_btn.click(
1618
+ fn=firered_load_into_pipeline,
1619
+ inputs=[firered_output_img, rembg_threshold, rembg_erode],
1620
+ outputs=[input_image, rembg_preview, firered_status],
1621
+ ).then(
1622
+ fn=lambda img: (
1623
+ gr.update(interactive=img is not None),
1624
+ gr.update(interactive=img is not None),
1625
+ gr.update(selected=1),
1626
+ ),
1627
+ inputs=[input_image],
1628
+ outputs=[shape_btn, run_all_btn, tabs],
1629
+ )
1630
+
1631
  # ════════════════════════════════════════════════════════════════════
1632
  with gr.Tab("Rig & Export"):
1633
  with gr.Row():
 
1675
  inputs=[glb_state, export_fbx_check, mdm_prompt_box, mdm_frames_slider],
1676
  outputs=[rig_glb_dl, rig_animated_dl, rig_fbx_dl, rig_status,
1677
  rig_model_3d, rigged_base_state, skel_glb_state],
1678
+ ).then(
1679
+ fn=lambda p: p,
1680
+ inputs=[rigged_base_state], outputs=[rigged_glb_state],
1681
  )
1682
 
1683
  show_skel_check.change(
 
1686
  outputs=[rig_model_3d],
1687
  )
1688
 
1689
+ # ════════════════════════════════════════════════════════════════════
1690
+ with gr.Tab("Animate"):
1691
+ gr.Markdown(
1692
+ "### Motion Search & Animate\n"
1693
+ "Search the HumanML3D dataset for motions matching a description, "
1694
+ "then bake the selected motion onto your rigged GLB."
1695
+ )
1696
+ with gr.Row():
1697
+ with gr.Column(scale=1):
1698
+ motion_query = gr.Textbox(
1699
+ label="Motion Description",
1700
+ placeholder="a person walks forward slowly",
1701
+ lines=2,
1702
+ )
1703
+ search_btn = gr.Button("Search Motions", variant="secondary")
1704
+ motion_radio = gr.Radio(
1705
+ label="Select Motion", choices=[], visible=False,
1706
+ )
1707
+ motion_results_state = gr.State([])
1708
+
1709
+ gr.Markdown("### Animate Settings")
1710
+ animate_fps = gr.Slider(10, 60, value=30, step=5, label="FPS")
1711
+ animate_frames = gr.Slider(0, 600, value=0, step=30,
1712
+ label="Max Frames (0 = full motion)")
1713
+ animate_btn = gr.Button("Animate", variant="primary")
1714
+
1715
+ with gr.Column(scale=2):
1716
+ animate_status = gr.Textbox(label="Status", lines=4, interactive=False)
1717
+ animate_model_3d = gr.Model3D(label="Animated Preview",
1718
+ clear_color=[0.9, 0.9, 0.9, 1.0])
1719
+ animate_dl = gr.File(label="Download Animated GLB")
1720
+
1721
+ search_btn.click(
1722
+ fn=gradio_search_motions,
1723
+ inputs=[motion_query],
1724
+ outputs=[motion_radio, motion_results_state, animate_status],
1725
+ )
1726
+
1727
+ animate_btn.click(
1728
+ fn=gradio_animate,
1729
+ inputs=[rigged_glb_state, motion_radio, motion_results_state,
1730
+ animate_fps, animate_frames],
1731
+ outputs=[animate_dl, animate_status, animate_model_3d],
1732
+ )
1733
+
1734
+ # ════════════════════════════════════════════════════════════════════
1735
+ with gr.Tab("PSHuman Face"):
1736
+ gr.Markdown(
1737
+ "### PSHuman Face Transplant\n"
1738
+ "Generates a high-detail face mesh via PSHuman (multi-view diffusion), "
1739
+ "then transplants it into the rigged GLB.\n\n"
1740
+ "**Pipeline:** portrait → PSHuman (remote service) → colored OBJ → face_transplant → rigged GLB with HD face\n\n"
1741
+ "**Note:** On ZeroGPU, PSHuman must run as a remote service. "
1742
+ "Set the `PSHUMAN_URL` environment variable or enter the URL below."
1743
+ )
1744
+ with gr.Row():
1745
+ with gr.Column(scale=1):
1746
+ pshuman_img_input = gr.Image(
1747
+ label="Portrait image (same as used for Generate)",
1748
+ type="pil",
1749
+ )
1750
+ with gr.Accordion("Advanced settings", open=False):
1751
+ pshuman_weight_thresh = gr.Slider(
1752
+ minimum=0.1, maximum=0.9, value=0.35, step=0.05,
1753
+ label="Head bone weight threshold",
1754
+ info="Vertices with head-bone weight above this get replaced",
1755
+ )
1756
+ pshuman_retract_mm = gr.Slider(
1757
+ minimum=0.0, maximum=20.0, value=4.0, step=0.5,
1758
+ label="Face retract (mm)",
1759
+ info="How far to push original face verts inward to avoid z-fighting",
1760
+ )
1761
+ pshuman_service_url = gr.Textbox(
1762
+ label="PSHuman service URL",
1763
+ value=os.environ.get("PSHUMAN_URL", "http://localhost:7862"),
1764
+ info="pshuman_app.py Gradio endpoint (deployed separately)",
1765
+ )
1766
+ pshuman_btn = gr.Button("Generate HD Face", variant="primary")
1767
+
1768
+ with gr.Column(scale=2):
1769
+ pshuman_status = gr.Textbox(label="Status", lines=4, interactive=False)
1770
+ pshuman_model_3d = gr.Model3D(
1771
+ label="Preview", clear_color=[0.9, 0.9, 0.9, 1.0])
1772
+ pshuman_glb_dl = gr.File(label="Download GLB (with PSHuman face)")
1773
+
1774
+ pshuman_btn.click(
1775
+ fn=gradio_pshuman_face,
1776
+ inputs=[
1777
+ pshuman_img_input,
1778
+ rigged_glb_state,
1779
+ pshuman_weight_thresh,
1780
+ pshuman_retract_mm,
1781
+ pshuman_service_url,
1782
+ ],
1783
+ outputs=[pshuman_glb_dl, pshuman_status, pshuman_model_3d],
1784
+ )
1785
+
1786
  # ════════════════════════════════════════════════════════════════════
1787
  with gr.Tab("Enhancement"):
1788
  gr.Markdown("**Surface Enhancement** — bakes normal + depth maps into the GLB as PBR textures.")
 
1799
  displacement_scale = gr.Slider(0.1, 3.0, value=1.0, step=0.1, label="Displacement Scale")
1800
 
1801
  enhance_btn = gr.Button("Run Enhancement", variant="primary")
1802
+ unload_btn = gr.Button("Unload Models (free VRAM)", variant="secondary")
1803
 
1804
  with gr.Column(scale=2):
1805
  enhance_status = gr.Textbox(label="Status", lines=5, interactive=False)
 
1818
  enhanced_glb_dl, enhanced_model_3d, enhance_status],
1819
  )
1820
 
1821
+ def _unload_enhancement_models():
1822
+ try:
1823
+ from pipeline.enhance_surface import unload_models
1824
+ unload_models()
1825
+ return "Enhancement models unloaded — VRAM freed."
1826
+ except Exception as e:
1827
+ return f"Unload failed: {e}"
1828
+
1829
+ unload_btn.click(
1830
+ fn=_unload_enhancement_models,
1831
+ inputs=[], outputs=[enhance_status],
1832
+ )
1833
+
1834
+ # ════════════════════════════════════════════════════════════════════
1835
+ with gr.Tab("Settings"):
1836
+
1837
+ def get_vram_status():
1838
+ lines = []
1839
+ if torch.cuda.is_available():
1840
+ alloc = torch.cuda.memory_allocated() / 1024**3
1841
+ reserv = torch.cuda.memory_reserved() / 1024**3
1842
+ total = torch.cuda.get_device_properties(0).total_memory / 1024**3
1843
+ free = total - reserv
1844
+ lines.append(f"GPU: {torch.cuda.get_device_name(0)}")
1845
+ lines.append(f"VRAM total: {total:.1f} GB")
1846
+ lines.append(f"VRAM allocated: {alloc:.1f} GB")
1847
+ lines.append(f"VRAM reserved: {reserv:.1f} GB")
1848
+ lines.append(f"VRAM free: {free:.1f} GB")
1849
+ else:
1850
+ lines.append("No CUDA device available.")
1851
+ lines.append("")
1852
+ lines.append("Loaded models:")
1853
+ lines.append(f" TripoSG pipeline: {'loaded' if _triposg_pipe is not None else 'not loaded'}")
1854
+ lines.append(f" RMBG-{_rmbg_version or '?'}: {'loaded' if _rmbg_net is not None else 'not loaded'}")
1855
+ lines.append(f" FireRed: {'loaded' if _firered_pipe is not None else 'not loaded'}")
1856
+ try:
1857
+ import pipeline.enhance_surface as _enh_mod
1858
+ lines.append(f" StableNormal: {'loaded' if _enh_mod._normal_pipe is not None else 'not loaded'}")
1859
+ lines.append(f" Depth-Anything: {'loaded' if _enh_mod._depth_pipe is not None else 'not loaded'}")
1860
+ except Exception:
1861
+ lines.append(" StableNormal / Depth-Anything: (status unavailable)")
1862
+ return "\n".join(lines)
1863
+
1864
+ def _preload_triposg():
1865
+ try:
1866
+ load_triposg()
1867
+ return get_vram_status()
1868
+ except Exception:
1869
+ return f"Preload failed:\n{traceback.format_exc()}"
1870
+
1871
+ def _unload_triposg():
1872
+ global _triposg_pipe, _rmbg_net
1873
+ with _model_load_lock:
1874
+ if _triposg_pipe is not None:
1875
+ _triposg_pipe.to("cpu")
1876
+ del _triposg_pipe
1877
+ _triposg_pipe = None
1878
+ if _rmbg_net is not None:
1879
+ _rmbg_net.to("cpu")
1880
+ del _rmbg_net
1881
+ _rmbg_net = None
1882
+ torch.cuda.empty_cache()
1883
+ return get_vram_status()
1884
+
1885
+ def _unload_enhancement():
1886
+ try:
1887
+ from pipeline.enhance_surface import unload_models
1888
+ unload_models()
1889
+ except Exception:
1890
+ pass
1891
+ return get_vram_status()
1892
+
1893
+ def _unload_all():
1894
+ _unload_triposg()
1895
+ _unload_enhancement()
1896
+ return get_vram_status()
1897
+
1898
+ with gr.Row():
1899
+ with gr.Column(scale=1):
1900
+ gr.Markdown("### VRAM Management")
1901
+ preload_btn = gr.Button("Preload TripoSG + RMBG to VRAM", variant="primary")
1902
+ unload_triposg_btn = gr.Button("Unload TripoSG / RMBG")
1903
+ unload_enh_btn = gr.Button("Unload Enhancement Models (StableNormal / Depth)")
1904
+ unload_all_btn = gr.Button("Unload All Models", variant="stop")
1905
+ refresh_btn = gr.Button("Refresh Status")
1906
+
1907
+ with gr.Column(scale=1):
1908
+ gr.Markdown("### GPU Status")
1909
+ vram_status = gr.Textbox(
1910
+ label="", lines=12, interactive=False,
1911
+ value="Click Refresh to check VRAM status.",
1912
+ )
1913
+
1914
+ preload_btn.click(fn=_preload_triposg, inputs=[], outputs=[vram_status])
1915
+ unload_triposg_btn.click(fn=_unload_triposg, inputs=[], outputs=[vram_status])
1916
+ unload_enh_btn.click(fn=_unload_enhancement, inputs=[], outputs=[vram_status])
1917
+ unload_all_btn.click(fn=_unload_all, inputs=[], outputs=[vram_status])
1918
+ refresh_btn.click(fn=get_vram_status, inputs=[], outputs=[vram_status])
1919
+
1920
+ # ── Run All wiring (after all tabs so components are defined) ────────
1921
  run_all_btn.click(
1922
  fn=run_full_pipeline,
1923
  inputs=[
1924
+ input_image, remove_bg_check, num_steps, guidance, seed, face_count,
1925
+ variant, tex_seed, enhance_face_check, rembg_threshold, rembg_erode,
1926
  export_fbx_check, mdm_prompt_box, mdm_frames_slider,
1927
  ],
1928
  outputs=[glb_state, download_file, multiview_img,
 
1932
  inputs=[glb_state], outputs=[model_3d, download_file],
1933
  )
1934
 
1935
+ # ── Hidden API endpoints ──────────────────────────────────────────────────
1936
+ _api_render_gallery = gr.Gallery(visible=False)
1937
+ _api_swap_gallery = gr.Gallery(visible=False)
1938
+
1939
+ def _render_last():
1940
+ path = _last_glb_path or "/tmp/triposg_textured.glb"
1941
+ return render_views(path)
1942
+
1943
+ _hs_emb_input = gr.Textbox(visible=False)
1944
+
1945
+ gr.Button(visible=False).click(
1946
+ fn=_render_last, inputs=[], outputs=[_api_render_gallery], api_name="render_last")
1947
+ gr.Button(visible=False).click(
1948
+ fn=hyperswap_views, inputs=[_hs_emb_input], outputs=[_api_swap_gallery],
1949
+ api_name="hyperswap_views")
1950
+
1951
 
1952
  if __name__ == "__main__":
1953
+ demo.launch(server_name="0.0.0.0", server_port=7860,
1954
+ show_error=True, allowed_paths=["/tmp"])
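The GiB arithmetic in the Settings tab's `get_vram_status` can be exercised without a GPU by swapping the `torch.cuda.*` queries for plain byte counts. A pure-Python sketch (the function name `format_vram_status` is illustrative):

```python
def format_vram_status(alloc_bytes, reserved_bytes, total_bytes):
    """Render VRAM numbers the way the Settings tab does (GiB, one decimal).

    free = total - reserved, matching get_vram_status above; inputs are raw
    byte counts in place of torch.cuda.memory_allocated() and friends.
    """
    gib = 1024 ** 3
    free = total_bytes - reserved_bytes
    return "\n".join([
        f"VRAM total: {total_bytes / gib:.1f} GB",
        f"VRAM allocated: {alloc_bytes / gib:.1f} GB",
        f"VRAM reserved: {reserved_bytes / gib:.1f} GB",
        f"VRAM free: {free / gib:.1f} GB",
    ])
```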
pipeline/face_inswap_bake.py ADDED
@@ -0,0 +1,302 @@
1
+ """
2
+ face_inswap_bake.py — Proper face swap on rendered views, then UV-bake.
3
+
4
+ Pipeline:
5
+ 1. Render the mesh from multiple views (front + L/R 3-quarter)
6
+ 2. Run inswapper_128 to swap reference face onto each rendered view
7
+ 3. uv_render_attr() bakes each swapped render directly into UV texture
8
+ (render-space coords shared with UV lookup — no coordinate transforms)
9
+ 4. Composite multiple views (front takes priority, sides fill gaps)
10
+ 5. Save updated GLB
11
+
12
+ Usage:
13
+ python face_inswap_bake.py \
14
+ --body /tmp/triposg_textured.glb \
15
+ --face /tmp/triposg_face_ref.png \
16
+ --out /tmp/face_swapped.glb \
17
+ [--uv_size 4096] [--debug_dir /tmp]
18
+ """
19
+
20
+ import os, sys, argparse, warnings
21
+ warnings.filterwarnings('ignore')
22
+
23
+ import numpy as np
24
+ import cv2
25
+ import torch
26
+ import torch.nn.functional as F
27
+ from PIL import Image
28
+ import trimesh
29
+ from trimesh.visual.texture import TextureVisuals
30
+ from trimesh.visual.material import PBRMaterial
31
+
32
+ sys.path.insert(0, '/root/MV-Adapter')
33
+ from mvadapter.utils.mesh_utils import (
34
+ NVDiffRastContextWrapper, load_mesh, get_orthogonal_camera, render,
35
+ )
36
+ from mvadapter.utils.mesh_utils.uv import (
37
+ uv_precompute, uv_render_geometry, uv_render_attr,
38
+ )
39
+ from insightface.app import FaceAnalysis
40
+ import insightface
41
+ from gfpgan import GFPGANer
42
+
43
+
44
+ GFPGAN_PATH = '/root/MV-Adapter/checkpoints/GFPGANv1.4.pth'
45
+
46
+
47
+ # ── helpers ───────────────────────────────────────────────────────────────────
48
+
49
+ def _build_front_face_uv_mask(mesh_t, tex_H, tex_W, neck_frac=0.76):
50
+ """UV-space mask covering only front-facing head triangles (no back-of-head)."""
51
+ verts = np.array(mesh_t.vertices, dtype=np.float64)
52
+ faces = np.array(mesh_t.faces, dtype=np.int32)
53
+ uvs = np.array(mesh_t.visual.uv, dtype=np.float64)
54
+
55
+ y_min, y_max = verts[:, 1].min(), verts[:, 1].max()
56
+ neck_y = float(y_min + (y_max - y_min) * neck_frac)
57
+ head_idx = np.where(verts[:, 1] > neck_y)[0]
58
+ hv = verts[head_idx]
59
+
60
+ z_thresh = float(np.percentile(hv[:, 2], 40))
61
+ front = hv[:, 2] >= z_thresh
62
+ if front.sum() < 30:
63
+ front = np.ones(len(hv), bool)
64
+
65
+ face_vert_idx = head_idx[front]
66
+ face_vert_mask = np.zeros(len(verts), bool)
67
+ face_vert_mask[face_vert_idx] = True
68
+ face_tri_mask = face_vert_mask[faces].all(axis=1)
69
+ face_tris = faces[face_tri_mask]
70
+ print(f' Geometry mask: {face_tri_mask.sum()} front-face triangles '
71
+ f'(neck_y={neck_y:.3f}, z_thresh={z_thresh:.3f})')
72
+
73
+ geom_mask = np.zeros((tex_H, tex_W), dtype=np.float32)
74
+ pts_list = []
75
+ for tri in face_tris:
76
+ uv = uvs[tri]
77
+ px = uv[:, 0] * tex_W
78
+ py = (1.0 - uv[:, 1]) * tex_H
79
+ pts_list.append(np.column_stack([px, py]).astype(np.int32))
80
+ if pts_list:
81
+ cv2.fillPoly(geom_mask, pts_list, 1.0)
82
+
83
+ kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
84
+ geom_mask = cv2.dilate(geom_mask, kernel, iterations=2)
85
+ geom_mask = cv2.erode(geom_mask, kernel, iterations=1)
86
+ geom_mask = cv2.GaussianBlur(geom_mask, (31, 31), 8)
87
+ return geom_mask
88
+
89
+
90
+ def _detect_largest_face(img_bgr, app):
91
+ faces = app.get(img_bgr)
92
+ if not faces:
93
+ return None
94
+ return max(faces, key=lambda f: (f.bbox[2]-f.bbox[0])*(f.bbox[3]-f.bbox[1]))
95
+
96
+
97
+ def _render_view(ctx, mesh_mv, uv_pre, azimuth_deg, H, W, device):
98
+ """Render the mesh from a given azimuth; return (camera, uv_geom)."""
99
+ camera = get_orthogonal_camera(
100
+ elevation_deg=[0], distance=[1.8],
101
+ left=-0.55, right=0.55, bottom=-0.55, top=0.55,
102
+ azimuth_deg=[azimuth_deg], device=device,
103
+ )
104
+ uv_geom = uv_render_geometry(
105
+ ctx, mesh_mv, camera,
106
+ view_height=H, view_width=W,
107
+ uv_precompute_output=uv_pre,
108
+ compute_depth_grad=False,
109
+ )
110
+ return camera, uv_geom
111
+
112
+
113
+ def face_inswap_bake(body_glb, face_img_path, out_glb,
114
+ uv_size=4096, debug_dir=None):
115
+
116
+ device = 'cuda'
117
+ INSWAPPER_PATH = '/root/MV-Adapter/checkpoints/inswapper_128.onnx'
118
+
119
+ # ── Load GFPGAN enhancer ──────────────────────────────────────────────────
120
+ print('[fib] Loading GFPGANv1.4 ...')
121
+ enhancer = GFPGANer(
122
+ model_path=GFPGAN_PATH,
123
+ upscale=1,
124
+ arch='clean',
125
+ channel_multiplier=2,
126
+ bg_upsampler=None,
127
+ )
128
+
129
+ # ── Load mesh ─────────────────────────────────────────────────────────────
130
+ print(f'[fib] Loading mesh: {body_glb}')
131
+ ctx = NVDiffRastContextWrapper(device=device, context_type='cuda')
132
+ mesh_mv = load_mesh(body_glb, rescale=True, device=device)
133
+
134
+ scene_t = trimesh.load(body_glb)
135
+ if isinstance(scene_t, trimesh.Scene):
136
+ geom_name = list(scene_t.geometry.keys())[0]
137
+ mesh_t = scene_t.geometry[geom_name]
138
+ else:
139
+ mesh_t = scene_t
+ geom_name = None
140
+
141
+ orig_tex_np = np.array(mesh_t.visual.material.baseColorTexture, dtype=np.float32) / 255.0
142
+ uvs = np.array(mesh_t.visual.uv, dtype=np.float64)
143
+ tex_H, tex_W = orig_tex_np.shape[:2]
144
+ print(f' Texture: {tex_W}×{tex_H}')
145
+
146
+ # Build geometry mask (front-face head triangles only) at UV resolution
147
+ print('[fib] Building front-face geometry UV mask ...')
148
+ geom_uv_mask = _build_front_face_uv_mask(mesh_t, uv_size, uv_size)
149
+
150
+ # Render dimensions (match triposg_app.py)
151
+ H_r, W_r = 1024, 768
152
+
153
+ # ── Precompute UV geometry ─────────────────────────────────────────────────
154
+ print(f'[fib] Precomputing UV geometry ({uv_size}×{uv_size}) ...')
155
+ uv_pre = uv_precompute(ctx, mesh_mv, height=uv_size, width=uv_size)
156
+
157
+ # ── Load face swap model + face detector ──────────────────────────────────
158
+ print('[fib] Loading inswapper_128 ...')
159
+ swapper = insightface.model_zoo.get_model(
160
+ INSWAPPER_PATH, download=False,
161
+ providers=['CUDAExecutionProvider', 'CPUExecutionProvider'],
162
+ )
163
+
164
+ app = FaceAnalysis(providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
165
+ app.prepare(ctx_id=0, det_size=(640, 640))
166
+
167
+ ref_bgr = cv2.imread(face_img_path)
168
+ ref_face = _detect_largest_face(ref_bgr, app)
169
+ if ref_face is None:
170
+ raise RuntimeError(f'No face detected in reference: {face_img_path}')
171
+ print(f' Reference face detected: bbox={ref_face.bbox.astype(int).tolist()}')
172
+
173
+ # ── Process each view ─────────────────────────────────────────────────────
174
+ # Views: front (azimuth=-90), slight left (-60), slight right (-120)
175
+ # Azimuth convention from MV-Adapter: -90 = front-facing
176
+ views = [
177
+ ('front', -90, 1.0), # (name, azimuth_deg, priority_weight)
178
+ ('threequarter_r', -60, 0.7),
179
+ ('threequarter_l', -120, 0.7),
180
+ ]
181
+
182
+ # Accumulators for weighted UV compositing
183
+ uv_colour_acc = np.zeros((uv_size, uv_size, 3), dtype=np.float32)
184
+ uv_weight_acc = np.zeros((uv_size, uv_size), dtype=np.float32)
185
+
186
+ for view_name, azimuth, weight in views:
187
+ print(f'\n[fib] View: {view_name} (azimuth={azimuth}°)')
188
+
189
+ # Create camera + UV geometry for this view
190
+ camera, uv_geom = _render_view(ctx, mesh_mv, uv_pre, azimuth, H_r, W_r, device)
191
+
192
+ # Render textured mesh from this view
193
+ render_out = render(ctx, mesh_mv, camera, height=H_r, width=W_r,
194
+ render_attr=True, render_depth=False, render_normal=False,
195
+ attr_background=0.0)
196
+ # render_out.attr: (1, H, W, 3) float in [0,1]
197
+ rendered_np = (render_out.attr[0].cpu().numpy() * 255).clip(0, 255).astype(np.uint8)
198
+ rendered_bgr = cv2.cvtColor(rendered_np, cv2.COLOR_RGB2BGR)
199
+
200
+ if debug_dir:
201
+ cv2.imwrite(os.path.join(debug_dir, f'fib_render_{view_name}.png'), rendered_bgr)
202
+
203
+ # Detect face in this rendered view
204
+ tgt_face = _detect_largest_face(rendered_bgr, app)
205
+ if tgt_face is None:
206
+ print(f' No face in {view_name} render — skipping')
207
+ continue
208
+ print(f' Target face: bbox={tgt_face.bbox.astype(int).tolist()}')
209
+
210
+ # Swap face
211
+ swapped_bgr = swapper.get(rendered_bgr.copy(), tgt_face, ref_face, paste_back=True)
212
+
213
+ # Enhance face detail with GFPGAN
214
+ _, _, enhanced_bgr = enhancer.enhance(
215
+ swapped_bgr, has_aligned=False, only_center_face=False, paste_back=True)
216
+ if enhanced_bgr is not None:
217
+ swapped_bgr = enhanced_bgr
218
+ print(' GFPGAN enhanced')
219
+
220
+ if debug_dir:
221
+ cv2.imwrite(os.path.join(debug_dir, f'fib_swapped_{view_name}.png'), swapped_bgr)
222
+
223
+ swapped_rgb = cv2.cvtColor(swapped_bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
224
+
225
+ # Build render-space face hull mask
226
+ kps = tgt_face.kps
227
+ hull_pts = cv2.convexHull(kps.astype(np.float32)).squeeze(1)
228
+ hull_cx, hull_cy = hull_pts.mean(axis=0)
229
+ hull_exp = (hull_pts - [hull_cx, hull_cy]) * 3.5 + [hull_cx, hull_cy]
230
+ face_mask = np.zeros((H_r, W_r), dtype=np.float32)
231
+ cv2.fillPoly(face_mask, [hull_exp.astype(np.int32)], 1.0)
232
+ face_mask = cv2.GaussianBlur(face_mask, (61, 61), 20)
233
+
234
+ # Bake swapped render into UV space
235
+ swapped_t = torch.tensor(swapped_rgb, device=device).unsqueeze(0) # (1,H,W,3)
236
+ mask_t = torch.tensor(face_mask[None], device=device)
237
+
238
+ uv_out = uv_render_attr(
239
+ images=swapped_t,
240
+ masks=mask_t,
241
+ uv_render_geometry_output=uv_geom,
242
+ )
243
+ uv_img = uv_out.uv_attr_proj[0].cpu().numpy() # (uv, uv, 3)
244
+ uv_mask = uv_out.uv_mask_proj[0].cpu().numpy() # (uv, uv)
245
+
246
+ # Kill back-of-head UV islands
247
+ uv_mask = uv_mask * geom_uv_mask
248
+
249
+ # Weighted accumulate
250
+ w = uv_mask * weight
251
+ uv_colour_acc += uv_img * w[..., None]
252
+ uv_weight_acc += w
253
+ print(f' Painted texels: {(uv_mask > 0.05).sum()}')
254
+
255
+ # ── Composite ──────────────────────────────────────────────────────────────
256
+ print('\n[fib] Compositing views ...')
257
+ valid = uv_weight_acc > 0.01
258
+ # resample the original texture to UV resolution so both np.where
+ # branches share the same (uv_size, uv_size, 3) shape
+ fallback_tex = cv2.resize(orig_tex_np, (uv_size, uv_size),
+ interpolation=cv2.INTER_LINEAR)
+ uv_final = np.where(valid[..., None],
+ uv_colour_acc / np.maximum(uv_weight_acc[..., None], 1e-6),
+ fallback_tex)
261
+
262
+ # Resize to texture resolution if needed
263
+ if uv_size != tex_H or uv_size != tex_W:
264
+ uv_final_rs = cv2.resize(uv_final, (tex_W, tex_H), interpolation=cv2.INTER_LINEAR)
265
+ weight_rs = cv2.resize(uv_weight_acc, (tex_W, tex_H), interpolation=cv2.INTER_LINEAR)
266
+ else:
267
+ uv_final_rs = uv_final
268
+ weight_rs = uv_weight_acc
269
+
270
+ # Blend with original texture: use face-swap result where painted, orig elsewhere
271
+ alpha = np.clip(weight_rs, 0, 1)[..., None]
272
+ new_tex = uv_final_rs * alpha + orig_tex_np * (1.0 - alpha)
273
+ print(f' Total painted texels (tex res): {(weight_rs > 0.05).sum()}')
274
+
275
+ if debug_dir:
276
+ Image.fromarray((uv_final_rs * 255).clip(0,255).astype(np.uint8)).save(
277
+ os.path.join(debug_dir, 'fib_uv_composite.png'))
278
+
279
+ # ── Save GLB ──────────────────────────────────────────────────────────────
280
+ new_pil = Image.fromarray((new_tex * 255).clip(0, 255).astype(np.uint8))
281
+ mesh_t.visual = TextureVisuals(uv=uvs, material=PBRMaterial(baseColorTexture=new_pil))
282
+
283
+ if geom_name and isinstance(scene_t, trimesh.Scene):
284
+ scene_t.geometry[geom_name] = mesh_t
285
+ scene_t.export(out_glb)
286
+ else:
287
+ mesh_t.export(out_glb)
288
+
289
+ print(f'[fib] Saved: {out_glb} ({os.path.getsize(out_glb)//1024} KB)')
290
+ return out_glb
291
+
292
+
293
+ if __name__ == '__main__':
294
+ ap = argparse.ArgumentParser()
295
+ ap.add_argument('--body', required=True)
296
+ ap.add_argument('--face', required=True)
297
+ ap.add_argument('--out', required=True)
298
+ ap.add_argument('--uv_size', type=int, default=4096)
299
+ ap.add_argument('--debug_dir', default=None)
300
+ args = ap.parse_args()
301
+ face_inswap_bake(args.body, args.face, args.out,
302
+ uv_size=args.uv_size, debug_dir=args.debug_dir)
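The per-view loop and composite step above reduce to a masked weighted average of UV projections, with a fallback where no view painted anything. A minimal numpy sketch of just that composite logic (toy 2×2 texture, no renderer; `composite_views` is an illustrative helper, not part of this commit):

```python
import numpy as np

def composite_views(uv_imgs, uv_masks, weights, fallback):
    """Weighted average of per-view UV projections; fallback where nothing painted."""
    colour_acc = np.zeros_like(uv_imgs[0])
    weight_acc = np.zeros(uv_imgs[0].shape[:2], dtype=np.float32)
    for img, mask, wgt in zip(uv_imgs, uv_masks, weights):
        w = mask * wgt                      # per-texel confidence for this view
        colour_acc += img * w[..., None]
        weight_acc += w
    valid = weight_acc > 0.01
    out = np.where(valid[..., None],
                   colour_acc / np.maximum(weight_acc[..., None], 1e-6),
                   fallback)                # untouched texels keep the original
    return out, weight_acc

# two toy views: view 0 paints white, view 1 paints black
imgs = [np.ones((2, 2, 3)), np.zeros((2, 2, 3))]
masks = [np.array([[1., 0.], [1., 1.]]), np.array([[1., 0.], [0., 1.]])]
out, acc = composite_views(imgs, masks, [1.0, 1.0], fallback=np.full((2, 2, 3), 0.5))
# texel (0,0): seen by both views → average; texel (0,1): unseen → fallback
```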
pipeline/face_project.py ADDED
@@ -0,0 +1,305 @@
+ """
+ face_project.py — Project reference face image onto TripoSG mesh UV texture.
+
+ Keeps geometry 100% intact. Paints the face-region UV triangles using
+ barycentric rasterization — never interpolates across UV island boundaries.
+
+ Usage:
+     python face_project.py --body /tmp/triposg_textured.glb \
+                            --face /tmp/triposg_face_ref.png \
+                            --out /tmp/face_projected.glb \
+                            [--blend 0.9] [--neck_frac 0.84] [--debug_tex /tmp/tex.png]
+ """
+
+ import os, argparse, warnings
+ warnings.filterwarnings('ignore')
+
+ import numpy as np
+ import cv2
+ from PIL import Image
+ import trimesh
+ from trimesh.visual.texture import TextureVisuals
+ from trimesh.visual.material import PBRMaterial
+
+
+ # ── Face alignment ─────────────────────────────────────────────────────────────
+
+ def _aligned_face_bgr(face_img_bgr, target_size=512):
+     """Detect + align face via InsightFace 5-pt warp; falls back to square crop."""
+     try:
+         from insightface.app import FaceAnalysis
+         from insightface.utils import face_align
+         app = FaceAnalysis(providers=['CPUExecutionProvider'])
+         app.prepare(ctx_id=0, det_size=(640, 640))
+         faces = app.get(face_img_bgr)
+         if faces:
+             faces.sort(
+                 key=lambda f: (f.bbox[2]-f.bbox[0]) * (f.bbox[3]-f.bbox[1]),
+                 reverse=True)
+             aligned = face_align.norm_crop(face_img_bgr, faces[0].kps,
+                                            image_size=target_size)
+             print(f'  InsightFace aligned: {aligned.shape}')
+             return aligned
+     except Exception as e:
+         print(f'  InsightFace unavailable ({e}), using centre-crop')
+     h, w = face_img_bgr.shape[:2]
+     side = min(h, w)
+     y0, x0 = (h - side) // 2, (w - side) // 2
+     return cv2.resize(face_img_bgr[y0:y0+side, x0:x0+side], (target_size, target_size))
+
+
+ # ── Triangle rasterizer ────────────────────────────────────────────────────────
+
+ def _rasterize_triangles(face_tri_uvs_px, face_tri_img_xy,
+                          face_img_rgb, tex, blend,
+                          max_uv_span=300):
+     """
+     Paint face_img_rgb colour into tex at UV locations, triangle by triangle.
+
+     face_tri_uvs_px : (M, 3, 2) UV pixel coords of M triangles
+     face_tri_img_xy : (M, 3, 2) projected image coords of M triangles
+     face_img_rgb    : (H, W, 3) reference face image
+     tex             : (texH, texW, 3) float32 texture (modified in-place)
+     blend           : float 0–1
+     max_uv_span     : skip triangles whose UV bounding box exceeds this (UV seams)
+     """
+     H_f, W_f = face_img_rgb.shape[:2]
+     tex_H, tex_W = tex.shape[:2]
+     painted = 0
+
+     for fi in range(len(face_tri_uvs_px)):
+         uv = face_tri_uvs_px[fi]   # (3, 2) in texture pixel coords
+         img = face_tri_img_xy[fi]  # (3, 2) in face-image pixel coords
+
+         # Skip UV-seam triangles (vertices far apart in UV space)
+         if (uv[:, 0].max() - uv[:, 0].min() > max_uv_span or
+                 uv[:, 1].max() - uv[:, 1].min() > max_uv_span):
+             continue
+
+         # Bounding box in texture space
+         u_lo = max(0, int(uv[:, 0].min()))
+         u_hi = min(tex_W, int(uv[:, 0].max()) + 2)
+         v_lo = max(0, int(uv[:, 1].min()))
+         v_hi = min(tex_H, int(uv[:, 1].max()) + 2)
+         if u_hi <= u_lo or v_hi <= v_lo:
+             continue
+
+         # Grid of texel centres in this bounding box
+         gu, gv = np.meshgrid(np.arange(u_lo, u_hi), np.arange(v_lo, v_hi))
+         pts = np.column_stack([gu.ravel().astype(np.float32),
+                                gv.ravel().astype(np.float32)])  # (K, 2)
+
+         # Barycentric coordinates (in UV pixel space)
+         A = uv[0].astype(np.float64)
+         AB = (uv[1] - uv[0]).astype(np.float64)
+         AC = (uv[2] - uv[0]).astype(np.float64)
+         denom = AB[0] * AC[1] - AB[1] * AC[0]
+         if abs(denom) < 0.5:
+             continue
+         P = pts.astype(np.float64) - A
+         b1 = (P[:, 0] * AC[1] - P[:, 1] * AC[0]) / denom
+         b2 = (P[:, 1] * AB[0] - P[:, 0] * AB[1]) / denom
+         b0 = 1.0 - b1 - b2
+
+         inside = (b0 >= 0) & (b1 >= 0) & (b2 >= 0)
+         if not inside.any():
+             continue
+
+         # Interpolate reference-face image coordinates
+         ix_f = (b0[inside] * img[0, 0] +
+                 b1[inside] * img[1, 0] +
+                 b2[inside] * img[2, 0])
+         iy_f = (b0[inside] * img[0, 1] +
+                 b1[inside] * img[1, 1] +
+                 b2[inside] * img[2, 1])
+
+         valid = ((ix_f >= 0) & (ix_f < W_f) & (iy_f >= 0) & (iy_f < H_f))
+         if not valid.any():
+             continue
+
+         ix = np.clip(ix_f[valid].astype(int), 0, W_f - 1)
+         iy = np.clip(iy_f[valid].astype(int), 0, H_f - 1)
+         colours = face_img_rgb[iy, ix].astype(np.float32)  # (P, 3)
+
+         tu = pts[inside][valid, 0].astype(int)
+         tv = pts[inside][valid, 1].astype(int)
+         in_tex = (tu >= 0) & (tu < tex_W) & (tv >= 0) & (tv < tex_H)
+
+         tex[tv[in_tex], tu[in_tex]] = (
+             blend * colours[in_tex] +
+             (1.0 - blend) * tex[tv[in_tex], tu[in_tex]]
+         )
+         painted += int(in_tex.sum())
+
+     return painted
+
+
+ # ── Main ───────────────────────────────────────────────────────────────────────
+
+ def project_face(body_glb, face_img_path, out_glb,
+                  blend=0.90, neck_frac=0.84, debug_tex=None):
+     """
+     Project reference face onto TripoSG UV texture via per-triangle rasterization.
+     """
+
+     # ── Load mesh ─────────────────────────────────────────────────────────────
+     print(f'[face_project] Loading {body_glb}')
+     scene = trimesh.load(body_glb)
+     if isinstance(scene, trimesh.Scene):
+         geom_name = list(scene.geometry.keys())[0]
+         mesh = scene.geometry[geom_name]
+     else:
+         mesh = scene
+         geom_name = None
+
+     verts = np.array(mesh.vertices, dtype=np.float64)            # (N, 3)
+     faces = np.array(mesh.faces, dtype=np.int32)                 # (F, 3)
+     uvs = np.array(mesh.visual.uv, dtype=np.float64)             # (N, 2)
+     mat = mesh.visual.material
+     orig_tex = np.array(mat.baseColorTexture, dtype=np.float32)  # (H, W, 3) RGB
+     tex_H, tex_W = orig_tex.shape[:2]
+     print(f'  {len(verts)} verts | {len(faces)} faces | texture {orig_tex.shape}')
+
+     # ── Identify face-region vertices ─────────────────────────────────────────
+     y_min, y_max = verts[:, 1].min(), verts[:, 1].max()
+     neck_y = float(y_min + (y_max - y_min) * neck_frac)
+
+     head_mask = verts[:, 1] > neck_y
+     head_idx = np.where(head_mask)[0]
+     hv = verts[head_idx]
+
+     # Front half only (z >= median — face faces +Z)
+     z_med = float(np.median(hv[:, 2]))
+     front = hv[:, 2] >= z_med
+     if front.sum() < 30:
+         front = np.ones(len(hv), bool)
+
+     face_vert_idx = head_idx[front]  # indices into the full vertex array
+
+     # Build boolean mask for fast triangle selection
+     face_vert_mask = np.zeros(len(verts), bool)
+     face_vert_mask[face_vert_idx] = True
+
+     # Select triangles where ALL 3 vertices are in the face region
+     face_tri_mask = face_vert_mask[faces].all(axis=1)
+     face_tris = faces[face_tri_mask]  # (M, 3)
+     print(f'  neck_y={neck_y:.4f} | head={len(head_idx)} '
+           f'| face-front={front.sum()} | face triangles={len(face_tris)}')
+
+     # ── Load and align reference face ─────────────────────────────────────────
+     print(f'[face_project] Reference face: {face_img_path}')
+     raw_bgr = cv2.imread(face_img_path)
+     aligned_bgr = _aligned_face_bgr(raw_bgr, target_size=512)
+     aligned_rgb = cv2.cvtColor(aligned_bgr, cv2.COLOR_BGR2RGB).astype(np.float32)
+     H_f, W_f = aligned_rgb.shape[:2]
+
+     # ── Compute face projection axes from actual face normal ─────────────────
+     fv = verts[face_vert_idx]
+
+     # Average normal of the front-facing face triangles defines projection dir
+     face_tri_normals = np.array(mesh.face_normals)[face_tri_mask]
+     face_fwd = face_tri_normals.mean(axis=0)
+     face_fwd /= np.linalg.norm(face_fwd)
+
+     # Build orthonormal right/up axes in the face plane
+     world_up = np.array([0., 1., 0.])
+     face_right = np.cross(face_fwd, world_up)
+     face_right /= np.linalg.norm(face_right)
+     face_up = np.cross(face_right, face_fwd)
+     face_up /= np.linalg.norm(face_up)
+     print(f'  Face normal: {face_fwd.round(3)}')
+
+     # Project face vertices onto local (right, up) plane
+     fv_centroid = fv.mean(axis=0)
+     fv_c = fv - fv_centroid
+     lx = fv_c @ face_right
+     ly = fv_c @ face_up
+     x_span = float(lx.max() - lx.min())
+     y_span = float(ly.max() - ly.min())
+
+     # InsightFace norm_crop places eyes at ~37% from top of the 512px image.
+     # In 3D the eyes are ~78% up from neck → 28% above centroid.
+     # Shift the vertical origin up by 0.112*y_span so eye level → 37% in image.
+     cy_shift = 0.112 * y_span
+     pad = 0.10  # tighter crop so face features fill more of the image
+
+     def vert_to_img(v):
+         """Project 3D vertex to reference-face image using the face normal."""
+         c = v - fv_centroid  # (N, 3)
+         lx = c @ face_right
+         ly = c @ face_up
+         pu = lx / (x_span * (1 + 2*pad)) + 0.5
+         pv = -(ly - cy_shift) / (y_span * (1 + 2*pad)) + 0.5
+         return np.column_stack([pu * W_f, pv * H_f])  # (N, 2)
+
+     def vert_to_uv_px(v_idx):
+         """Convert vertex UV coords to texture pixel coordinates."""
+         uv = uvs[v_idx]
+         # trimesh loads GLB UV with (0,0)=bottom-left; flip V for image row
+         col = uv[:, 0] * tex_W
+         row = (1.0 - uv[:, 1]) * tex_H
+         return np.column_stack([col, row])  # (N, 2)
+
+     # Pre-compute image + UV pixel coords for every vertex
+     all_img_px = vert_to_img(verts)                   # (N, 2)
+     all_uv_px = vert_to_uv_px(np.arange(len(verts)))  # (N, 2)
+
+     # Gather per-triangle arrays
+     face_tri_uvs_px = all_uv_px[face_tris]   # (M, 3, 2)
+     face_tri_img_xy = all_img_px[face_tris]  # (M, 3, 2)
+
+     print(f'  UV pixel range: u={face_tri_uvs_px[:,:,0].min():.0f}→'
+           f'{face_tri_uvs_px[:,:,0].max():.0f} '
+           f'v={face_tri_uvs_px[:,:,1].min():.0f}→'
+           f'{face_tri_uvs_px[:,:,1].max():.0f}')
+     print(f'  Image coord range: x={face_tri_img_xy[:,:,0].min():.1f}→'
+           f'{face_tri_img_xy[:,:,0].max():.1f} '
+           f'y={face_tri_img_xy[:,:,1].min():.1f}→'
+           f'{face_tri_img_xy[:,:,1].max():.1f}')
+
+     # ── Rasterize face triangles into UV texture ──────────────────────────────
+     print(f'[face_project] Rasterizing {len(face_tris)} triangles into texture...')
+     new_tex = orig_tex.copy()
+     painted = _rasterize_triangles(
+         face_tri_uvs_px, face_tri_img_xy,
+         aligned_rgb, new_tex, blend,
+         max_uv_span=300
+     )
+     print(f'  Painted {painted} texels across {len(face_tris)} triangles')
+
+     # ── Save debug texture if requested ──────────────────────────────────────
+     if debug_tex:
+         dbg = np.clip(new_tex, 0, 255).astype(np.uint8)
+         Image.fromarray(dbg).save(debug_tex)
+         print(f'  Debug texture: {debug_tex}')
+
+     # ── Write modified texture back to mesh ───────────────────────────────────
+     new_pil = Image.fromarray(np.clip(new_tex, 0, 255).astype(np.uint8))
+     new_mat = PBRMaterial(baseColorTexture=new_pil)
+     mesh.visual = TextureVisuals(uv=uvs, material=new_mat)
+
+     os.makedirs(os.path.dirname(os.path.abspath(out_glb)), exist_ok=True)
+     if geom_name and isinstance(scene, trimesh.Scene):
+         scene.geometry[geom_name] = mesh
+         scene.export(out_glb)
+     else:
+         mesh.export(out_glb)
+
+     print(f'[face_project] Saved: {out_glb} ({os.path.getsize(out_glb)//1024} KB)')
+     return out_glb
+
+
+ # ── CLI ────────────────────────────────────────────────────────────────────────
+
+ if __name__ == '__main__':
+     ap = argparse.ArgumentParser()
+     ap.add_argument('--body', required=True)
+     ap.add_argument('--face', required=True)
+     ap.add_argument('--out', required=True)
+     ap.add_argument('--blend', type=float, default=0.90)
+     ap.add_argument('--neck_frac', type=float, default=0.84)
+     ap.add_argument('--debug_tex', default=None)
+     args = ap.parse_args()
+     project_face(args.body, args.face, args.out,
+                  blend=args.blend, neck_frac=args.neck_frac,
+                  debug_tex=args.debug_tex)
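The rasterizer in face_project.py decides texel membership with edge-vector barycentric coordinates. A standalone sketch of that inside-test, using the same `denom`/`b0`/`b1`/`b2` math as `_rasterize_triangles` (the `barycentric` helper name and toy triangle are illustrative, not part of this commit):

```python
import numpy as np

def barycentric(tri, pts):
    """Barycentric coords of 2-D points w.r.t. triangle tri (3, 2),
    same edge-vector formulation as _rasterize_triangles."""
    A = tri[0].astype(np.float64)
    AB = (tri[1] - tri[0]).astype(np.float64)
    AC = (tri[2] - tri[0]).astype(np.float64)
    denom = AB[0] * AC[1] - AB[1] * AC[0]   # 2x signed triangle area
    P = pts.astype(np.float64) - A
    b1 = (P[:, 0] * AC[1] - P[:, 1] * AC[0]) / denom
    b2 = (P[:, 1] * AB[0] - P[:, 0] * AB[1]) / denom
    b0 = 1.0 - b1 - b2
    return np.column_stack([b0, b1, b2])

# right triangle with legs of length 4
tri = np.array([[0., 0.], [4., 0.], [0., 4.]])
pts = np.array([[1., 1.], [5., 5.]])
b = barycentric(tri, pts)
inside = (b >= 0).all(axis=1)   # all three weights non-negative → point in triangle
```

The same weights also drive the sampling step: interpolating the three image-space corner coordinates with `b0, b1, b2` gives the source pixel for each texel.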
pipeline/face_swap_render.py ADDED
@@ -0,0 +1,293 @@
+ """
+ face_swap_render.py — Paint reference face onto TripoSG UV texture using
+ MV-Adapter's UV-baking pipeline.
+
+ Pipeline:
+   1. Load mesh with same params as triposg_app.py render stage
+   2. Create orthographic camera matching render_front.png (azimuth=-90)
+   3. Detect face landmarks in render_front.png + reference photo via InsightFace
+   4. norm_crop reference → canonical 512×512 frontal face
+   5. Estimate 4-DOF similarity (canonical → render) and warpAffine
+      → produces face_on_render.png: reference face at correct render-space coords
+   6. uv_render_attr(images=face_on_render) → projects render image into UV space
+      No inverse transform, no scale mismatch — the render-space coordinate system
+      is shared between the camera projection and the UV lookup.
+   7. Blend projected face into original texture with geometry mask guard.
+   8. Save updated GLB
+
+ Usage:
+     python face_swap_render.py \
+         --body /tmp/triposg_textured.glb \
+         --face /tmp/triposg_face_ref.png \
+         --render /tmp/render_front.png \
+         --out /tmp/face_swapped.glb \
+         [--blend 0.93] [--uv_size 4096] [--debug_dir /tmp]
+ """
+
+ import os, sys, argparse, warnings
+ warnings.filterwarnings('ignore')
+
+ import numpy as np
+ import cv2
+ import torch
+ import torch.nn.functional as F
+ from PIL import Image
+ import trimesh
+ from trimesh.visual.texture import TextureVisuals
+ from trimesh.visual.material import PBRMaterial
+ from insightface.utils import face_align as insightface_align
+
+ sys.path.insert(0, '/root/MV-Adapter')
+ from mvadapter.utils.mesh_utils import (
+     NVDiffRastContextWrapper, load_mesh, get_orthogonal_camera,
+ )
+ from mvadapter.utils.mesh_utils.uv import (
+     uv_precompute, uv_render_geometry, uv_render_attr,
+ )
+ from insightface.app import FaceAnalysis
+
+
+ def _detect_largest_face(img_bgr, app):
+     faces = app.get(img_bgr)
+     if not faces:
+         return None
+     faces.sort(key=lambda f: (f.bbox[2]-f.bbox[0])*(f.bbox[3]-f.bbox[1]), reverse=True)
+     return faces[0]
+
+
+ def _build_front_face_uv_mask(mesh_t, tex_H, tex_W, neck_frac=0.84):
+     """
+     Build a UV-space mask covering only the front-facing face triangles.
+     Excludes back-of-head, hair, and ears (lateral vertices).
+     """
+     verts = np.array(mesh_t.vertices, dtype=np.float64)
+     faces = np.array(mesh_t.faces, dtype=np.int32)
+     uvs = np.array(mesh_t.visual.uv, dtype=np.float64)
+
+     # Head vertices above neck
+     y_min, y_max = verts[:, 1].min(), verts[:, 1].max()
+     neck_y = float(y_min + (y_max - y_min) * neck_frac)
+     head_idx = np.where(verts[:, 1] > neck_y)[0]
+     hv = verts[head_idx]
+
+     # Front half: z >= 40th percentile — generous to include jaw/cheek toward ears
+     # No lateral exclusion — it splits UV islands through the eyes/mouth → duplicates
+     z_thresh = float(np.percentile(hv[:, 2], 40))
+     front = hv[:, 2] >= z_thresh
+     if front.sum() < 30:
+         front = np.ones(len(hv), bool)
+
+     face_vert_idx = head_idx[front]
+     face_vert_mask = np.zeros(len(verts), bool)
+     face_vert_mask[face_vert_idx] = True
+
+     face_tri_mask = face_vert_mask[faces].all(axis=1)
+     face_tris = faces[face_tri_mask]
+     print(f'  Geometry mask: {face_tri_mask.sum()} front-face triangles selected '
+           f'(neck_y={neck_y:.3f}, z_thresh={z_thresh:.3f})')
+
+     # Rasterize into UV-space mask (trimesh UV: y=0 is bottom-left → flip V)
+     geom_mask = np.zeros((tex_H, tex_W), dtype=np.float32)
+     pts_list = []
+     for tri in face_tris:
+         uv = uvs[tri]  # (3, 2)
+         px = uv[:, 0] * tex_W
+         py = (1.0 - uv[:, 1]) * tex_H
+         pts_list.append(np.column_stack([px, py]).astype(np.int32))
+     if pts_list:
+         cv2.fillPoly(geom_mask, pts_list, 1.0)
+
+     kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
+     geom_mask = cv2.dilate(geom_mask, kernel, iterations=2)  # close intra-tri gaps
+     geom_mask = cv2.erode(geom_mask, kernel, iterations=1)   # retreat from island edges
+     geom_mask = cv2.GaussianBlur(geom_mask, (31, 31), 8)     # soft transition
+     return geom_mask
+
+
+ def face_swap_render(body_glb, face_img_path, render_img_path, out_glb,
+                      blend=0.93, uv_size=4096, neck_frac=0.76, debug_dir=None):
+
+     device = 'cuda'
+
+     # ── Step 1: Load mesh ─────────────────────────────────────────────────────
+     print(f'[fsr] Loading mesh: {body_glb}')
+     ctx = NVDiffRastContextWrapper(device=device, context_type='cuda')
+     mesh_mv = load_mesh(body_glb, rescale=True, device=device)
+
+     scene_t = trimesh.load(body_glb)
+     if isinstance(scene_t, trimesh.Scene):
+         geom_name = list(scene_t.geometry.keys())[0]
+         mesh_t = scene_t.geometry[geom_name]
+     else:
+         mesh_t = scene_t; geom_name = None
+
+     orig_tex = np.array(mesh_t.visual.material.baseColorTexture, dtype=np.float32) / 255.0
+     uvs = np.array(mesh_t.visual.uv, dtype=np.float64)
+     tex_H, tex_W = orig_tex.shape[:2]
+     print(f'  UV size: {tex_W}×{tex_H}')
+
+     # ── Step 1b: Geometry mask (front-face UV islands only) ───────────────────
+     print('[fsr] Building geometry front-face UV mask ...')
+     geom_uv_mask = _build_front_face_uv_mask(mesh_t, tex_H, tex_W, neck_frac)
+
+     # ── Step 2: Orthographic camera matching render_front.png ─────────────────
+     render_img = cv2.imread(render_img_path)
+     H_r, W_r = render_img.shape[:2]
+     print(f'  Render size: {W_r}×{H_r}')
+
+     camera = get_orthogonal_camera(
+         elevation_deg=[0], distance=[1.8],
+         left=-0.55, right=0.55, bottom=-0.55, top=0.55,
+         azimuth_deg=[-90], device=device,
+     )
+
+     print(f'[fsr] Precomputing UV geometry ({uv_size}×{uv_size}) ...')
+     uv_pre = uv_precompute(ctx, mesh_mv, height=uv_size, width=uv_size)
+     uv_geom = uv_render_geometry(
+         ctx, mesh_mv, camera,
+         view_height=H_r, view_width=W_r,
+         uv_precompute_output=uv_pre,
+         compute_depth_grad=False,
+     )
+
+     # ── Step 3: Face landmark detection ───────────────────────────────────────
+     print('[fsr] Detecting face landmarks ...')
+     app = FaceAnalysis(providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
+     app.prepare(ctx_id=0, det_size=(640, 640))
+
+     ref_bgr = cv2.imread(face_img_path)
+     render_face = _detect_largest_face(render_img, app)
+     if render_face is None:
+         raise RuntimeError(f'No face detected in render: {render_img_path}')
+     ref_face = _detect_largest_face(ref_bgr, app)
+     if ref_face is None:
+         raise RuntimeError(f'No face detected in reference: {face_img_path}')
+
+     render_kps = render_face.kps  # (5, 2)
+     ref_kps = ref_face.kps
+     print(f'  render kps: x={render_kps[:,0].min():.0f}-{render_kps[:,0].max():.0f}'
+           f' y={render_kps[:,1].min():.0f}-{render_kps[:,1].max():.0f}')
+
+     # ── Step 4: norm_crop → canonical 512×512 frontal face ───────────────────
+     CANONICAL_SIZE = 512
+     aligned_bgr = insightface_align.norm_crop(ref_bgr, ref_kps, image_size=CANONICAL_SIZE)
+
+     # Fixed ARCFACE 5-point positions scaled to CANONICAL_SIZE
+     ARCFACE_112 = np.array([
+         [38.2946, 51.6963],
+         [73.5318, 51.5014],
+         [56.0252, 71.7366],
+         [41.5493, 92.3655],
+         [70.7299, 92.2041],
+     ], dtype=np.float32)
+     canonical_kps = ARCFACE_112 * (CANONICAL_SIZE / 112.0)
+
+     # ── Step 5: Forward warp: canonical → render space ────────────────────────
+     # 4-DOF similarity (scale + rotation + translation) with all 5 kps.
+     # FORWARD direction: canonical_kps → render_kps so that warpAffine places
+     # the face at exactly the render-space coordinates, downsampling cleanly.
+     fwd_M, inliers = cv2.estimateAffinePartial2D(
+         canonical_kps.astype(np.float32),
+         render_kps.astype(np.float32),
+         method=cv2.LMEDS,
+     )
+     print(f'  Forward warp M:\n{fwd_M}')
+
+     face_on_render_bgr = cv2.warpAffine(
+         aligned_bgr, fwd_M, (W_r, H_r),
+         flags=cv2.INTER_LANCZOS4,
+         borderMode=cv2.BORDER_CONSTANT, borderValue=0,
+     )
+     face_on_render_rgb = cv2.cvtColor(face_on_render_bgr,
+                                       cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
+
+     # ── Step 6: Render-space face hull mask ───────────────────────────────────
+     # Only paint UV texels that correspond to pixels inside the face region.
+     hull_pts = cv2.convexHull(render_kps.astype(np.float32)).squeeze(1)
+     hull_cx, hull_cy = hull_pts.mean(axis=0)
+     hull_expanded = (hull_pts - [hull_cx, hull_cy]) * 4.0 + [hull_cx, hull_cy]
+     face_mask_render = np.zeros((H_r, W_r), dtype=np.float32)
+     cv2.fillPoly(face_mask_render, [hull_expanded.astype(np.int32)], 1.0)
+     # Restrict to where the warped face actually has content
+     face_content = (face_on_render_bgr.mean(axis=2) > 3.0 / 255.0).astype(np.float32)
+     face_mask_render = face_mask_render * face_content
+     face_mask_render = cv2.GaussianBlur(face_mask_render, (51, 51), 15)
+
+     # ── Step 7: Project face-on-render into UV space ──────────────────────────
+     # uv_render_attr uses uv_pos_ndc as a lookup: for each UV texel, sample the
+     # render-space image at that texel's render NDC position.
+     # Since face_on_render is already in render-space coords, this is exact.
+     print('[fsr] Projecting face into UV space via uv_render_attr ...')
+     face_t = torch.tensor(face_on_render_rgb, device=device).unsqueeze(0)  # (1, H, W, 3)
+     mask_t = torch.tensor(face_mask_render[None], device=device)
+
+     uv_attr_out = uv_render_attr(
+         images=face_t,
+         masks=mask_t,
+         uv_render_geometry_output=uv_geom,
+     )
+     uv_face_img = uv_attr_out.uv_attr_proj[0].cpu().numpy()   # (uv, uv, 3)
+     uv_face_mask = uv_attr_out.uv_mask_proj[0].cpu().numpy()  # (uv, uv)
+
+     # Rescale to tex resolution if needed
+     if uv_size != tex_H or uv_size != tex_W:
+         uv_face_img_rs = cv2.resize(uv_face_img, (tex_W, tex_H), interpolation=cv2.INTER_LINEAR)
+         uv_face_mask_rs = cv2.resize(uv_face_mask, (tex_W, tex_H), interpolation=cv2.INTER_LINEAR)
+     else:
+         uv_face_img_rs = uv_face_img
+         uv_face_mask_rs = uv_face_mask
+
+     # ── Step 7b: Apply geometry mask — kill back-of-head / ear UV islands ────
+     uv_face_mask_rs = uv_face_mask_rs * geom_uv_mask
+
+     # Final blend alpha — use full blend=1.0 inside the face region so no
+     # original texture leaks through and creates duplicate features
+     alpha = np.clip(uv_face_mask_rs, 0, 1)[..., None]
+     painted_px = int((alpha[..., 0] > 0.01).sum())
+     print(f'  Painted texels: {painted_px}')
+
+     if debug_dir:
+         cv2.imwrite(os.path.join(debug_dir, 'fsr_aligned_ref.png'), aligned_bgr)
+         cv2.imwrite(os.path.join(debug_dir, 'fsr_face_on_render.png'), face_on_render_bgr)
+         cv2.imwrite(os.path.join(debug_dir, 'fsr_face_mask_render.png'),
+                     (face_mask_render * 255).astype(np.uint8))
+         cv2.imwrite(os.path.join(debug_dir, 'fsr_geom_mask.png'),
+                     (geom_uv_mask * 255).astype(np.uint8))
+         cv2.imwrite(os.path.join(debug_dir, 'fsr_uv_mask.png'),
+                     (uv_face_mask_rs * 255).astype(np.uint8))
+         Image.fromarray((uv_face_img_rs * 255).clip(0, 255).astype(np.uint8)).save(
+             os.path.join(debug_dir, 'fsr_uv_face.png'))
+         print(f'  Debug files saved to {debug_dir}')
+
+     # ── Step 8: Blend into original texture ───────────────────────────────────
+     print(f'[fsr] Blending (blend={blend}) ...')
+     new_tex = uv_face_img_rs * alpha + orig_tex * (1.0 - alpha)
+
+     # ── Step 9: Save GLB ──────────────────────────────────────────────────────
+     new_pil = Image.fromarray((new_tex * 255).clip(0, 255).astype(np.uint8))
+     mesh_t.visual = TextureVisuals(uv=uvs, material=PBRMaterial(baseColorTexture=new_pil))
+
+     if geom_name and isinstance(scene_t, trimesh.Scene):
+         scene_t.geometry[geom_name] = mesh_t
+         scene_t.export(out_glb)
+     else:
+         mesh_t.export(out_glb)
+
+     print(f'[fsr] Saved: {out_glb} ({os.path.getsize(out_glb)//1024} KB)')
+     return out_glb
+
+
+ if __name__ == '__main__':
+     ap = argparse.ArgumentParser()
+     ap.add_argument('--body', required=True)
+     ap.add_argument('--face', required=True)
+     ap.add_argument('--render', required=True, help='Front render (e.g. render_front.png)')
+     ap.add_argument('--out', required=True)
+     ap.add_argument('--blend', type=float, default=0.93)
+     ap.add_argument('--uv_size', type=int, default=4096)
+     ap.add_argument('--neck_frac', type=float, default=0.76)
+     ap.add_argument('--debug_dir', default=None)
+     args = ap.parse_args()
+     face_swap_render(args.body, args.face, args.render, args.out,
+                      blend=args.blend, uv_size=args.uv_size,
+                      neck_frac=args.neck_frac, debug_dir=args.debug_dir)
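Step 5 above fits a 4-DOF similarity (uniform scale, rotation, translation) between two 5-point landmark sets via `cv2.estimateAffinePartial2D`. The same fit can be sketched in pure numpy using the complex-number least-squares form (the `similarity_2d` helper below is an illustrative stand-in, without OpenCV's robust LMEDS loop):

```python
import numpy as np

def similarity_2d(src, dst):
    """Least-squares 2-D similarity mapping src → dst as a 2×3 affine,
    analogous in shape to cv2.estimateAffinePartial2D's output."""
    src_mu, dst_mu = src.mean(axis=0), dst.mean(axis=0)
    # Treat (x, y) as x + iy; a similarity is multiplication by one complex scalar
    zs = (src - src_mu) @ np.array([1, 1j])
    zd = (dst - dst_mu) @ np.array([1, 1j])
    s = (np.conj(zs) @ zd) / (np.conj(zs) @ zs)   # scale · e^{iθ}
    a, b = s.real, s.imag
    M = np.array([[a, -b], [b, a]])               # rotation+scale block
    t = dst_mu - M @ src_mu                       # translation column
    return np.hstack([M, t[:, None]])             # 2×3, rows [a -b tx; b a ty]

# toy correspondence: pure scale 2 plus translation (3, -1)
src = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
dst = src * 2.0 + np.array([3.0, -1.0])
A = similarity_2d(src, dst)
```

With noise-free inputs the recovered matrix is exact; with real landmarks it is the least-squares fit over all five points, which is why the pipeline uses all kps rather than a minimal pair.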
pipeline/face_transplant.py ADDED
@@ -0,0 +1,667 @@
1
+ """
2
+ face_transplant.py
3
+ ==================
4
+ Replace the face/head region of a rigged UniRig GLB with a higher-detail
5
+ PSHuman mesh, while preserving the skeleton, rig, and skinning weights.
6
+
7
+ Algorithm
8
+ ---------
9
+ 1. Parse rigged GLB → vertices, faces, UVs, JOINTS_0, WEIGHTS_0, bone list
10
+ 2. Identify head vertices → any vert whose dominant bone is in HEAD_BONES
11
+ 3. Load PSHuman mesh (OBJ or GLB, no rig)
12
+ 4. Align PSHuman head to UniRig head bounding box (scale + translate)
13
+ 5. Transfer skinning weights to PSHuman verts via K-nearest-neighbour from
14
+ UniRig head verts (scipy KDTree, weighted average)
15
+ 6. Retract UniRig face verts slightly inward so PSHuman sits on top cleanly
16
+ 7. Rebuild the GLB with two mesh primitives:
17
+ - Primitive 0 : UniRig body (face verts retracted)
18
+ - Primitive 1 : PSHuman face (new, with transferred weights)
19
+ 8. Write output GLB
20
+
21
+ Usage
22
+ -----
23
+ python -m pipeline.face_transplant \\
24
+ --body rigged_body.glb \\
25
+ --face pshuman_output.obj \\
26
+ --output rigged_body_with_pshuman_face.glb
27
+
28
+ Optionally supply --head-bones as comma-separated bone-name substrings
29
+ (default: head,Head,skull). Any bone whose name contains one of these
30
+ substrings is treated as a head bone.
31
+
32
+ Requires: pygltflib numpy scipy trimesh (pip install each)
33
+ """
34
+ from __future__ import annotations
35
+
36
+ import argparse
37
+ import base64
38
+ import struct
39
+ import json
40
+ from pathlib import Path
41
+ from typing import Dict, List, Optional, Tuple
42
+ import numpy as np
43
+ from scipy.spatial import KDTree
44
+ import trimesh
45
+
46
+
47
+ # ---------------------------------------------------------------------------
48
+ # GLB low-level helpers (subset of Retarget/io/gltf_io.py re-used here)
49
+ # ---------------------------------------------------------------------------
50
+
51
+ try:
52
+ import pygltflib
53
+ except ImportError:
54
+ raise ImportError("pip install pygltflib")
55
+
56
+
+ def _read_accessor_raw(gltf: pygltflib.GLTF2, accessor_idx: int) -> np.ndarray:
+     acc = gltf.accessors[accessor_idx]
+     bv = gltf.bufferViews[acc.bufferView]
+     buf = gltf.buffers[bv.buffer]
+
+     if buf.uri and buf.uri.startswith("data:"):
+         _, b64 = buf.uri.split(",", 1)
+         raw = base64.b64decode(b64)
+     elif buf.uri:
+         base_dir = Path(gltf._path).parent if getattr(gltf, "_path", None) else Path(".")
+         raw = (base_dir / buf.uri).read_bytes()
+     else:
+         raw = bytes(gltf.binary_blob())
+
+     type_nc = {"SCALAR": 1, "VEC2": 2, "VEC3": 3, "VEC4": 4, "MAT4": 16}
+     fmt_map = {5120: "b", 5121: "B", 5122: "h", 5123: "H", 5125: "I", 5126: "f"}
+
+     n_comp = type_nc[acc.type]
+     fmt = fmt_map[acc.componentType]
+     item_sz = struct.calcsize(fmt) * n_comp
+     stride = bv.byteStride or item_sz
+     start = bv.byteOffset + (acc.byteOffset or 0)
+
+     items = []
+     for i in range(acc.count):
+         offset = start + i * stride
+         vals = struct.unpack_from(f"{n_comp}{fmt}", raw, offset)
+         items.append(vals)
+
+     arr = np.array(items)
+     if arr.ndim == 2 and arr.shape[1] == 1:
+         arr = arr[:, 0]
+     return arr
+
+
+ def _accessor_dtype(gltf: pygltflib.GLTF2, accessor_idx: int):
+     fmt_map = {5120: np.int8, 5121: np.uint8, 5122: np.int16, 5123: np.uint16,
+                5125: np.uint32, 5126: np.float32}
+     return fmt_map[gltf.accessors[accessor_idx].componentType]
+
+
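The accessor reader above steps through the buffer one element at a time with `struct.unpack_from`, using the buffer view's `byteStride` to handle interleaved vertex data. A minimal self-contained sketch of that pattern, with a made-up interleaved layout (3-float position followed by 2-float UV, i.e. a 20-byte stride):

```python
import struct

import numpy as np

# Hypothetical interleaved buffer: per vertex [position VEC3 float32][uv VEC2 float32].
# Positions start at byte offset 0, UVs at byte offset 12, stride is 20 bytes.
verts = [(0.0, 1.0, 2.0), (3.0, 4.0, 5.0)]
uvs = [(0.1, 0.2), (0.3, 0.4)]
raw = b"".join(struct.pack("3f2f", *p, *t) for p, t in zip(verts, uvs))

def read_interleaved(raw, count, n_comp, fmt, stride, start):
    # Mirrors _read_accessor_raw: one struct.unpack_from per element, stepping by stride.
    return np.array([struct.unpack_from(f"{n_comp}{fmt}", raw, start + i * stride)
                     for i in range(count)])

positions = read_interleaved(raw, count=2, n_comp=3, fmt="f", stride=20, start=0)
texcoords = read_interleaved(raw, count=2, n_comp=2, fmt="f", stride=20, start=12)
```

With a tightly packed (non-interleaved) buffer, the stride simply equals the element size, which is why `_read_accessor_raw` falls back to `item_sz` when `byteStride` is unset.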
+ # ---------------------------------------------------------------------------
+ # Mesh extraction
+ # ---------------------------------------------------------------------------
+
+ class GLBMesh:
+     """
+     All data from the first skin's first mesh primitive in a GLB.
+     """
+     def __init__(self, path: str):
+         self.path = path
+         gltf = pygltflib.GLTF2().load(path)
+         gltf._path = path
+         self.gltf = gltf
+
+         if not gltf.skins:
+             raise ValueError("No skin found in GLB — is this a rigged file?")
+         self.skin = gltf.skins[0]
+         self.joint_names: List[str] = [gltf.nodes[j].name or f"joint_{k}"
+                                        for k, j in enumerate(self.skin.joints)]
+
+         # Find a mesh node that uses this skin
+         self.mesh_prim, self.mesh_node_idx = self._find_skinned_prim()
+
+         attrs = self.mesh_prim.attributes
+         self.verts = _read_accessor_raw(gltf, attrs.POSITION).astype(np.float32)
+         self.normals = (_read_accessor_raw(gltf, attrs.NORMAL).astype(np.float32)
+                         if attrs.NORMAL is not None else None)
+         self.uvs = (_read_accessor_raw(gltf, attrs.TEXCOORD_0).astype(np.float32)
+                     if attrs.TEXCOORD_0 is not None else None)
+         self.faces = _read_accessor_raw(gltf, self.mesh_prim.indices).astype(np.int32).reshape(-1, 3)
+
+         # Skinning — may be JOINTS_0 / WEIGHTS_0 (uint8/uint16 + float)
+         self.joints4 = None
+         self.weights4 = None
+         if attrs.JOINTS_0 is not None:
+             self.joints4 = _read_accessor_raw(gltf, attrs.JOINTS_0).astype(np.int32)
+             self.weights4 = _read_accessor_raw(gltf, attrs.WEIGHTS_0).astype(np.float32)
+
+         # Material index (carry over to output)
+         self.material_idx = self.mesh_prim.material
+
+     def _find_skinned_prim(self):
+         # Find a mesh node that references this skin
+         for ni, node in enumerate(self.gltf.nodes):
+             if node.skin == 0 and node.mesh is not None:
+                 mesh = self.gltf.meshes[node.mesh]
+                 return mesh.primitives[0], ni
+         # Fallback: first mesh node
+         for ni, node in enumerate(self.gltf.nodes):
+             if node.mesh is not None:
+                 mesh = self.gltf.meshes[node.mesh]
+                 return mesh.primitives[0], ni
+         raise ValueError("No mesh primitive found")
+
+     def head_bone_indices(self, substrings=("head", "Head", "skull", "Skull", "neck", "Neck")) -> List[int]:
+         """Return joint indices (into self.joint_names) matching any substring.
+
+         Falls back to a positional heuristic (highest-Y dominant bone) when no
+         bone names match (e.g. generic bone_0/bone_1 naming from UniRig)."""
+         result = []
+         for i, name in enumerate(self.joint_names):
+             if any(s in name for s in substrings):
+                 result.append(i)
+         if not result and self.joints4 is not None and self.weights4 is not None:
+             # Positional fallback: pick the bone whose dominant vertices have the highest avg Y.
+             n_bones = len(self.joint_names)
+             bone_y_sum = np.zeros(n_bones)
+             bone_y_cnt = np.zeros(n_bones, dtype=np.int32)
+             for vi in range(len(self.verts)):
+                 dom = int(self.joints4[vi, np.argmax(self.weights4[vi])])
+                 bone_y_sum[dom] += self.verts[vi, 1]
+                 bone_y_cnt[dom] += 1
+             with np.errstate(invalid='ignore'):
+                 bone_y_avg = np.where(bone_y_cnt > 0, bone_y_sum / bone_y_cnt, -np.inf)
+             top = int(np.argmax(bone_y_avg))
+             print(f"[face_transplant] No named head bones; positional fallback: "
+                   f"bone {top} ({self.joint_names[top]}, avg_y={bone_y_avg[top]:.3f})")
+             result = [top]
+         return result
+
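The positional fallback in `head_bone_indices` can be exercised on toy data: assign each vertex to its dominant bone (the influence with the largest weight), average vertex Y per bone, and pick the highest bone. A small sketch with hypothetical made-up joints and weights, using `np.bincount` in place of the per-vertex loop:

```python
import numpy as np

# Toy rig: 5 vertices, 3 bones; joints4/weights4 follow the glTF 4-influence layout.
verts = np.array([[0, 0.1, 0], [0, 0.2, 0], [0, 1.8, 0], [0, 1.9, 0], [0, 0.9, 0]],
                 dtype=np.float32)
joints4 = np.array([[0, 1, 0, 0], [0, 1, 0, 0], [2, 1, 0, 0], [2, 0, 0, 0], [1, 0, 0, 0]])
weights4 = np.array([[0.9, 0.1, 0, 0], [0.8, 0.2, 0, 0], [0.7, 0.3, 0, 0],
                     [1.0, 0, 0, 0], [0.6, 0.4, 0, 0]], dtype=np.float32)

n_bones = 3
# Dominant bone per vertex = the joint of the largest influence weight.
dom = joints4[np.arange(len(verts)), np.argmax(weights4, axis=1)]
bone_y_sum = np.bincount(dom, weights=verts[:, 1], minlength=n_bones)
bone_y_cnt = np.bincount(dom, minlength=n_bones)
bone_y_avg = np.where(bone_y_cnt > 0, bone_y_sum / np.maximum(bone_y_cnt, 1), -np.inf)
head_guess = int(np.argmax(bone_y_avg))  # bone 2: its vertices sit highest (avg y = 1.85)
```

Here bone 2 dominates the two highest vertices, so it wins the heuristic, which is the behaviour the method relies on when UniRig emits unnamed bones.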
+
+ # ---------------------------------------------------------------------------
+ # Face-region identification
+ # ---------------------------------------------------------------------------
+
+ def find_face_verts(glb_mesh: GLBMesh, head_joint_indices: List[int],
+                     weight_threshold: float = 0.35) -> np.ndarray:
+     """
+     Return a boolean mask of face/head vertices: any vertex whose total
+     weight on the head joints exceeds weight_threshold.
+     """
+     if glb_mesh.joints4 is None:
+         raise ValueError("Mesh has no skinning weights — cannot identify face region")
+
+     n = len(glb_mesh.verts)
+     mask = np.zeros(n, dtype=bool)
+     head_set = set(head_joint_indices)
+
+     for vi in range(n):
+         total_head_w = 0.0
+         for c in range(4):
+             j = glb_mesh.joints4[vi, c]
+             w = glb_mesh.weights4[vi, c]
+             if j in head_set:
+                 total_head_w += w
+         if total_head_w >= weight_threshold:
+             mask[vi] = True
+
+     return mask
+
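The per-vertex loop above is easy to vectorise: mark which of the four influences point at head joints, zero out the rest, and sum per row. A sketch of that equivalent formulation (the helper name and toy arrays are illustrative, not part of the module):

```python
import numpy as np

def face_mask_vectorised(joints4, weights4, head_joint_indices, weight_threshold=0.35):
    # (N, 4) bool: which influence slots reference a head joint.
    # Note: slot padding typically uses joint 0 with weight 0, so a zero weight
    # contributes nothing even if joint 0 happens to be a head joint.
    is_head = np.isin(joints4, list(head_joint_indices))
    head_w = np.where(is_head, weights4, 0.0).sum(axis=1)  # (N,) total head weight
    return head_w >= weight_threshold

joints4 = np.array([[3, 0, 0, 0], [3, 5, 0, 0], [1, 2, 0, 0]])
weights4 = np.array([[0.2, 0.8, 0, 0], [0.3, 0.3, 0.4, 0], [0.9, 0.1, 0, 0]])
mask = face_mask_vectorised(joints4, weights4, {3, 5})  # [False, True, False]
```

Only the middle vertex accumulates 0.3 + 0.3 = 0.6 of head weight and clears the 0.35 threshold; the same cut-off drives `find_face_verts`.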
+
+ # ---------------------------------------------------------------------------
+ # PSHuman mesh loading + alignment
+ # ---------------------------------------------------------------------------
+
+ def _crop_to_head(mesh: trimesh.Trimesh, head_fraction: float = 0.22) -> trimesh.Trimesh:
+     """
+     Keep only the top head_fraction of the PSHuman body mesh by Y coordinate.
+     PSHuman produces a full-body mesh; we only want the head/face portion.
+     """
+     y = mesh.vertices[:, 1]
+     threshold = y.max() - (y.max() - y.min()) * head_fraction
+     vert_keep = y >= threshold
+     face_keep = vert_keep[mesh.faces].all(axis=1)
+     kept_faces = mesh.faces[face_keep]
+     used = np.unique(kept_faces)
+     remap = np.full(len(mesh.vertices), -1, dtype=np.int32)
+     remap[used] = np.arange(len(used))
+     new_verts = mesh.vertices[used].astype(np.float32)
+     new_faces = remap[kept_faces]
+     result = trimesh.Trimesh(vertices=new_verts, faces=new_faces, process=False)
+     if hasattr(mesh.visual, 'uv') and mesh.visual.uv is not None:
+         result.visual = trimesh.visual.TextureVisuals(uv=np.array(mesh.visual.uv)[used])
+     print(f"[face_transplant] PSHuman head crop ({head_fraction*100:.0f}%): "
+           f"{len(mesh.vertices)} → {len(new_verts)} verts (Y ≥ {threshold:.3f})")
+     return result
+
+
+ def load_and_align_pshuman(pshuman_path: str, target_verts: np.ndarray) -> trimesh.Trimesh:
+     """
+     Load a PSHuman mesh (OBJ/GLB/PLY), crop it to the head region, then scale
+     and translate it to fit the bounding box of target_verts (UniRig head verts).
+     """
+     mesh: trimesh.Trimesh = trimesh.load(pshuman_path, force="mesh", process=False)
+     print(f"[face_transplant] PSHuman mesh: {len(mesh.vertices)} verts, {len(mesh.faces)} faces")
+
+     # PSHuman is full-body — crop to just the head before aligning
+     mesh = _crop_to_head(mesh)
+
+     # Target bbox from the UniRig head region
+     tgt_min = target_verts.min(axis=0)
+     tgt_max = target_verts.max(axis=0)
+     tgt_ctr = (tgt_min + tgt_max) * 0.5
+     tgt_ext = tgt_max - tgt_min
+
+     src_min = mesh.vertices.min(axis=0).astype(np.float32)
+     src_max = mesh.vertices.max(axis=0).astype(np.float32)
+     src_ctr = (src_min + src_max) * 0.5
+     src_ext = src_max - src_min
+
+     # Uniform scale: match the largest axis of the target
+     dominant = np.argmax(tgt_ext)
+     scale = float(tgt_ext[dominant]) / float(src_ext[dominant] + 1e-9)
+
+     verts = mesh.vertices.astype(np.float32).copy()
+     verts = (verts - src_ctr) * scale + tgt_ctr
+
+     mesh.vertices = verts
+     print(f"[face_transplant] PSHuman aligned: scale={scale:.4f}, center={tgt_ctr}")
+     return mesh
+
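The alignment step is plain bounding-box fitting: one uniform scale chosen on the target's dominant axis, then a recentre onto the target's bbox centre. A standalone sketch of that transform on made-up point sets (`bbox_align` is an illustrative helper, not part of the module):

```python
import numpy as np

def bbox_align(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    # Uniform scale matched on the target's largest axis, then recentre,
    # mirroring the math in load_and_align_pshuman.
    src_min, src_max = src.min(axis=0), src.max(axis=0)
    tgt_min, tgt_max = tgt.min(axis=0), tgt.max(axis=0)
    src_ctr, tgt_ctr = (src_min + src_max) * 0.5, (tgt_min + tgt_max) * 0.5
    dominant = np.argmax(tgt_max - tgt_min)
    scale = (tgt_max - tgt_min)[dominant] / ((src_max - src_min)[dominant] + 1e-9)
    return (src - src_ctr) * scale + tgt_ctr

src = np.array([[0.0, 0.0, 0.0], [2.0, 4.0, 2.0]])        # 2x4x2 box at the origin
tgt = np.array([[10.0, 10.0, 10.0], [11.0, 12.0, 11.0]])  # 1x2x1 box near (10.5, 11, 10.5)
out = bbox_align(src, tgt)  # the corners land exactly on the target box
```

Because the scale is uniform, proportions of the source head are preserved; only its size and position change, which is exactly what you want before weight transfer.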
+
+ # ---------------------------------------------------------------------------
+ # Weight transfer via KDTree
+ # ---------------------------------------------------------------------------
+
+ def transfer_weights(
+     donor_verts: np.ndarray,      # (M, 3) UniRig face verts
+     donor_joints: np.ndarray,     # (M, 4) joint indices
+     donor_weights: np.ndarray,    # (M, 4) float32
+     recipient_verts: np.ndarray,  # (N, 3) PSHuman face verts
+     k: int = 5,
+ ) -> Tuple[np.ndarray, np.ndarray]:
+     """
+     K-nearest-neighbour weight transfer with inverse-distance weighting.
+     Returns (joints4, weights4) for recipient_verts.
+     """
+     tree = KDTree(donor_verts)
+     dists, idxs = tree.query(recipient_verts, k=k)  # (N, k)
+
+     N = len(recipient_verts)
+     n_joints_total = int(donor_joints.max()) + 1
+
+     # Build a dense per-recipient weight vector
+     dense = np.zeros((N, n_joints_total), dtype=np.float64)
+     for ki in range(k):
+         w_dist = 1.0 / (dists[:, ki] + 1e-8)  # inverse-distance
+         for vi in range(N):
+             di = idxs[vi, ki]
+             for c in range(4):
+                 j = donor_joints[di, c]
+                 w = donor_weights[di, c]
+                 dense[vi, j] += w * w_dist[vi]
+
+     # Re-normalise rows
+     row_sum = dense.sum(axis=1, keepdims=True) + 1e-12
+     dense /= row_sum
+
+     # Pack back into 4-bone format (top-4 by weight)
+     out_joints = np.zeros((N, 4), dtype=np.uint16)
+     out_weights = np.zeros((N, 4), dtype=np.float32)
+     for vi in range(N):
+         top4 = np.argsort(dense[vi])[-4:][::-1]
+         total = dense[vi, top4].sum() + 1e-12
+         for c, j in enumerate(top4):
+             out_joints[vi, c] = j
+             out_weights[vi, c] = dense[vi, j] / total
+
+     return out_joints, out_weights
+
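The core of `transfer_weights` is a `scipy.spatial.KDTree` lookup combined with inverse-distance weighting. A toy sketch of the same scheme, with a hypothetical four-vertex donor cloud where each donor is rigidly bound to a single bone:

```python
import numpy as np
from scipy.spatial import KDTree

# Toy donors on a line, each with weight 1.0 on its own bone (bone i at x = i).
donor_verts = np.array([[0.0, 0, 0], [1.0, 0, 0], [2.0, 0, 0], [3.0, 0, 0]])
donor_bone = np.array([0, 1, 2, 3])            # first (and only) influence per donor
recipient = np.array([[1.1, 0, 0]])            # closest to donor 1, so bone 1 should win

tree = KDTree(donor_verts)
dists, idxs = tree.query(recipient, k=2)       # two nearest donors: verts 1 and 2
w = 1.0 / (dists + 1e-8)                       # inverse-distance weighting, as above

dense = np.zeros((1, 4))
for ki in range(2):
    di = idxs[0, ki]
    dense[0, donor_bone[di]] += 1.0 * w[0, ki] # donor weight is 1.0 on its bone
dense /= dense.sum()                           # re-normalise, as in transfer_weights

top_bone = int(dense[0].argmax())              # bone 1 dominates (dist 0.1 vs 0.9)
```

The 1e-8 epsilon matters: a recipient sitting exactly on a donor would otherwise divide by zero, and with the epsilon it simply inherits that donor's weights almost verbatim.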
+
+ # ---------------------------------------------------------------------------
+ # GLB rebuild
+ # ---------------------------------------------------------------------------
+
+ def _pack_buffer_view(data_bytes: bytes, target: list, byte_offset: int,
+                       byte_stride: Optional[int] = None) -> Tuple[pygltflib.BufferView, int]:
+     """
+     Build a buffer view for data_bytes at byte_offset; return (buffer_view, new_offset).
+     """
+     bv = pygltflib.BufferView(
+         buffer=0,
+         byteOffset=byte_offset,
+         byteLength=len(data_bytes),
+     )
+     if byte_stride:
+         bv.byteStride = byte_stride
+     return bv, byte_offset + len(data_bytes)
+
+
+ def _make_accessor(component_type: int, type_str: str, count: int,
+                    bv_idx: int, min_vals=None, max_vals=None) -> pygltflib.Accessor:
+     acc = pygltflib.Accessor(
+         bufferView=bv_idx,
+         byteOffset=0,
+         componentType=component_type,
+         count=count,
+         type=type_str,
+     )
+     if min_vals is not None:
+         acc.min = [float(v) for v in min_vals]
+     if max_vals is not None:
+         acc.max = [float(v) for v in max_vals]
+     return acc
+
+
+ FLOAT32 = pygltflib.FLOAT           # 5126
+ UINT16 = pygltflib.UNSIGNED_SHORT   # 5123
+ UINT32 = pygltflib.UNSIGNED_INT     # 5125
+ UBYTE = pygltflib.UNSIGNED_BYTE     # 5121
+
+
+ def transplant_face(
+     body_glb_path: str,
+     pshuman_mesh_path: str,
+     output_path: str,
+     head_bone_substrings: Tuple[str, ...] = ("head", "Head", "skull", "Skull"),
+     weight_threshold: float = 0.35,
+     retract_amount: float = 0.004,  # metres — how far to push face verts inward
+     knn: int = 5,
+ ):
+     """
+     Main entry point.
+
+     Parameters
+     ----------
+     body_glb_path : rigged UniRig GLB
+     pshuman_mesh_path : PSHuman output mesh (OBJ / GLB / PLY)
+     output_path : result GLB path
+     head_bone_substrings : bone name fragments that identify head joints
+     weight_threshold : head-weight sum above which a vertex is "face"
+     retract_amount : metres to push face verts inward to avoid z-fighting
+     knn : neighbours for weight transfer
+     """
+     print(f"[face_transplant] Loading rigged GLB: {body_glb_path}")
+     glb = GLBMesh(body_glb_path)
+     print(f"  Verts: {len(glb.verts)}  Faces: {len(glb.faces)}")
+     print(f"  Bones ({len(glb.joint_names)}): {', '.join(glb.joint_names[:8])} ...")
+
+     # 1. Identify head joints
+     head_ji = glb.head_bone_indices(substrings=head_bone_substrings)
+     if not head_ji:
+         raise RuntimeError(
+             f"No head bones found with substrings {head_bone_substrings}.\n"
+             f"Available bones: {glb.joint_names}"
+         )
+     print(f"  Head joints ({len(head_ji)}): {[glb.joint_names[i] for i in head_ji]}")
+
+     # 2. Find face/head vertices
+     face_mask = find_face_verts(glb, head_ji, weight_threshold=weight_threshold)
+     print(f"  Face verts: {face_mask.sum()} / {len(glb.verts)}")
+
+     min_face_verts = max(3, min(10, len(glb.verts) // 4))
+     if face_mask.sum() < min_face_verts:
+         raise RuntimeError(
+             f"Only {face_mask.sum()} face vertices found (need >= {min_face_verts}) — "
+             f"try lowering --weight-threshold (current: {weight_threshold})"
+         )
+
+     # 3. Load + align PSHuman mesh
+     face_verts_ur = glb.verts[face_mask]
+     ps_mesh = load_and_align_pshuman(pshuman_mesh_path, face_verts_ur)
+     ps_verts = np.array(ps_mesh.vertices, dtype=np.float32)
+     ps_faces = np.array(ps_mesh.faces, dtype=np.int32)
+     ps_uvs = None
+     if hasattr(ps_mesh.visual, "uv") and ps_mesh.visual.uv is not None:
+         ps_uvs = np.array(ps_mesh.visual.uv, dtype=np.float32)
+
+     # 4. Transfer weights: donor = UniRig face verts, recipient = PSHuman verts
+     print("[face_transplant] Transferring skinning weights via KNN ...")
+     ps_joints, ps_weights = transfer_weights(
+         donor_verts=glb.verts[face_mask].astype(np.float64),
+         donor_joints=glb.joints4[face_mask],
+         donor_weights=glb.weights4[face_mask],
+         recipient_verts=ps_verts.astype(np.float64),
+         k=knn,
+     )
+     print(f"  Done. Head joint coverage in PSHuman: "
+           f"{(np.isin(ps_joints[:, 0], head_ji)).mean() * 100:.1f}% primary bone is head")
+
+     # 5. Retract UniRig face verts inward (push along −normal)
+     body_verts = glb.verts.copy()
+     if glb.normals is not None:
+         body_verts[face_mask] -= glb.normals[face_mask] * retract_amount
+     else:
+         # No normals: push toward the face centroid instead
+         centroid = body_verts[face_mask].mean(axis=0)
+         dirs = centroid - body_verts[face_mask]
+         norms = np.linalg.norm(dirs, axis=1, keepdims=True) + 1e-9
+         body_verts[face_mask] += (dirs / norms) * retract_amount
+
+     # 6. Rebuild GLB
+     print("[face_transplant] Rebuilding GLB ...")
+     _write_transplanted_glb(
+         source_gltf=glb,
+         body_verts=body_verts,
+         ps_verts=ps_verts,
+         ps_faces=ps_faces,
+         ps_uvs=ps_uvs,
+         ps_joints=ps_joints,
+         ps_weights=ps_weights,
+         output_path=output_path,
+     )
+     print(f"[face_transplant] Saved -> {output_path}")
+
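Step 5 exists purely to avoid z-fighting: the original UniRig face stays in the file underneath the transplanted PSHuman face, so it is pushed a few millimetres inward along the inverted vertex normal. A minimal numpy sketch of that retraction on made-up vertices:

```python
import numpy as np

verts = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
normals = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])  # unit normals
face_mask = np.array([True, False])                      # only vert 0 is "face"
retract_amount = 0.004                                   # metres, same default as above

out = verts.copy()
out[face_mask] -= normals[face_mask] * retract_amount    # masked verts move inward only
```

The masked vertex slides 4 mm opposite its normal while the unmasked one is untouched; skinning weights are unchanged, so the retracted shell still deforms with the head bone.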
+
+ # ---------------------------------------------------------------------------
+ # GLB writer
+ # ---------------------------------------------------------------------------
+
+ def _write_transplanted_glb(
+     source_gltf: GLBMesh,
+     body_verts: np.ndarray,
+     ps_verts: np.ndarray,
+     ps_faces: np.ndarray,
+     ps_uvs: Optional[np.ndarray],
+     ps_joints: np.ndarray,
+     ps_weights: np.ndarray,
+     output_path: str,
+ ):
+     """
+     Copy the source GLB structure, replace mesh primitive 0 vertex data,
+     and append a new primitive for the PSHuman face.
+     """
+     gltf = pygltflib.GLTF2().load(source_gltf.path)
+     gltf._path = source_gltf.path
+
+     # ------------------------------------------------------------------
+     # Preserve embedded images as data URIs BEFORE we wipe buffer views.
+     # The binary blob rebuild below only contains geometry; any image data
+     # referenced via bufferView would otherwise be lost.
+     # ------------------------------------------------------------------
+     try:
+         blob = bytes(gltf.binary_blob())
+     except Exception:
+         blob = b""
+     for img in gltf.images:
+         if img.bufferView is not None and img.uri is None and blob:
+             bv = gltf.bufferViews[img.bufferView]
+             img_bytes = blob[bv.byteOffset: bv.byteOffset + bv.byteLength]
+             mime = img.mimeType or "image/png"
+             img.uri = "data:{};base64,{}".format(mime, base64.b64encode(img_bytes).decode())
+             img.bufferView = None
+
+     # ------------------------------------------------------------------
+     # We will rebuild the entire binary buffer from scratch.
+     # Collect all data chunks; track buffer views + accessors.
+     # ------------------------------------------------------------------
+     chunks: List[bytes] = []
+     bviews: List[pygltflib.BufferView] = []
+     accors: List[pygltflib.Accessor] = []
+     byte_offset = 0
+
+     def add_chunk(data: bytes, component_type: int, type_str: str, count: int,
+                   min_v=None, max_v=None, stride: int = None) -> int:
+         """Append data, create a buffer view + accessor, return the accessor index."""
+         nonlocal byte_offset
+         bv = pygltflib.BufferView(buffer=0, byteOffset=byte_offset, byteLength=len(data))
+         if stride:
+             bv.byteStride = stride
+         bviews.append(bv)
+         bv_idx = len(bviews) - 1
+
+         acc = pygltflib.Accessor(
+             bufferView=bv_idx,
+             byteOffset=0,
+             componentType=component_type,
+             count=count,
+             type=type_str,
+         )
+         if min_v is not None:
+             acc.min = [float(x) for x in np.atleast_1d(min_v)]
+         if max_v is not None:
+             acc.max = [float(x) for x in np.atleast_1d(max_v)]
+         accors.append(acc)
+         acc_idx = len(accors) - 1
+
+         chunks.append(data)
+         byte_offset += len(data)
+         return acc_idx
+
+     # ------------------------------------------------------------------
+     # Primitive 0 — UniRig body (retracted face verts)
+     # ------------------------------------------------------------------
+     body_v = body_verts.astype(np.float32)
+     body_i = source_gltf.faces.astype(np.uint32).flatten()
+     body_n = (source_gltf.normals.astype(np.float32)
+               if source_gltf.normals is not None else None)
+     body_uv = (source_gltf.uvs.astype(np.float32)
+                if source_gltf.uvs is not None else None)
+     body_j = source_gltf.joints4.astype(np.uint16)
+     body_w = source_gltf.weights4.astype(np.float32)
+
+     # indices
+     bi_idx = add_chunk(body_i.tobytes(), UINT32, "SCALAR", len(body_i),
+                        min_v=[int(body_i.min())], max_v=[int(body_i.max())])
+     # positions
+     bv_idx = add_chunk(body_v.tobytes(), FLOAT32, "VEC3", len(body_v),
+                        min_v=body_v.min(axis=0), max_v=body_v.max(axis=0))
+     body_attrs = pygltflib.Attributes(POSITION=bv_idx)
+     if body_n is not None:
+         body_attrs.NORMAL = add_chunk(body_n.tobytes(), FLOAT32, "VEC3", len(body_n))
+     if body_uv is not None:
+         body_attrs.TEXCOORD_0 = add_chunk(body_uv.tobytes(), FLOAT32, "VEC2", len(body_uv))
+     if body_j is not None:
+         body_attrs.JOINTS_0 = add_chunk(body_j.tobytes(), UINT16, "VEC4", len(body_j))
+         body_attrs.WEIGHTS_0 = add_chunk(body_w.tobytes(), FLOAT32, "VEC4", len(body_w))
+
+     prim0 = pygltflib.Primitive(
+         attributes=body_attrs,
+         indices=bi_idx,
+         material=source_gltf.material_idx,
+         mode=4,  # TRIANGLES
+     )
+
+     # ------------------------------------------------------------------
+     # Primitive 1 — PSHuman face
+     # ------------------------------------------------------------------
+     ps_v = ps_verts.astype(np.float32)
+     ps_i = ps_faces.astype(np.uint32).flatten()
+     ps_j4 = ps_joints.astype(np.uint16)
+     ps_w4 = ps_weights.astype(np.float32)
+
+     # PSHuman material — reuse the body material for now (same texture look).
+     # If PSHuman has its own texture, you'd add a new material here.
+     face_mat_idx = source_gltf.material_idx
+
+     fi_idx = add_chunk(ps_i.tobytes(), UINT32, "SCALAR", len(ps_i),
+                        min_v=[int(ps_i.min())], max_v=[int(ps_i.max())])
+     fv_idx = add_chunk(ps_v.tobytes(), FLOAT32, "VEC3", len(ps_v),
+                        min_v=ps_v.min(axis=0), max_v=ps_v.max(axis=0))
+     face_attrs = pygltflib.Attributes(POSITION=fv_idx)
+     if ps_uvs is not None:
+         face_attrs.TEXCOORD_0 = add_chunk(ps_uvs.tobytes(), FLOAT32, "VEC2", len(ps_uvs))
+     face_attrs.JOINTS_0 = add_chunk(ps_j4.tobytes(), UINT16, "VEC4", len(ps_j4))
+     face_attrs.WEIGHTS_0 = add_chunk(ps_w4.tobytes(), FLOAT32, "VEC4", len(ps_w4))
+
+     prim1 = pygltflib.Primitive(
+         attributes=face_attrs,
+         indices=fi_idx,
+         material=face_mat_idx,
+         mode=4,
+     )
+
+     # ------------------------------------------------------------------
+     # Patch the glTF structure
+     # ------------------------------------------------------------------
+     # Find or create the mesh that uses our skin
+     mesh_node = gltf.nodes[source_gltf.mesh_node_idx]
+     old_mesh_idx = mesh_node.mesh
+
+     new_mesh = pygltflib.Mesh(
+         name="body_with_pshuman_face",
+         primitives=[prim0, prim1],
+     )
+
+     # Replace or append
+     if old_mesh_idx is not None and old_mesh_idx < len(gltf.meshes):
+         gltf.meshes[old_mesh_idx] = new_mesh
+         target_mesh_idx = old_mesh_idx
+     else:
+         gltf.meshes.append(new_mesh)
+         target_mesh_idx = len(gltf.meshes) - 1
+
+     mesh_node.mesh = target_mesh_idx
+
+     # Replace buffer views and accessors
+     gltf.bufferViews = bviews
+     gltf.accessors = accors
+
+     # Rewrite the buffer, padded to 4-byte alignment
+     combined = b"".join(chunks)
+     if len(combined) % 4:
+         combined += b"\x00" * (4 - len(combined) % 4)
+
+     gltf.buffers = [pygltflib.Buffer(byteLength=len(combined))]
+     gltf.set_binary_blob(combined)
+
+     # Drop stale animations (they referenced the old accessor indices);
+     # the user can re-add animation later if needed.
+     gltf.animations = []
+
+     gltf.save(output_path)
+
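The zero-padding before `set_binary_blob` is there because the GLB container expects 4-byte alignment of the binary chunk. The padding rule in isolation, as a tiny stdlib sketch (`pad4` is an illustrative helper name):

```python
import struct

def pad4(data: bytes) -> bytes:
    # Pad with zero bytes up to the next 4-byte boundary, as the writer does.
    if len(data) % 4:
        data += b"\x00" * (4 - len(data) % 4)
    return data

indices = struct.pack("3H", 0, 1, 2)  # 6 bytes of uint16 triangle indices
aligned = pad4(indices)               # grows to 8 bytes
already = pad4(b"abcd")               # 4 bytes: unchanged
```

Skipping this step can produce a GLB that some loaders reject, since the spec-level chunk length must be a multiple of 4.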
+
+ # ---------------------------------------------------------------------------
+ # CLI
+ # ---------------------------------------------------------------------------
+
+ def main():
+     parser = argparse.ArgumentParser(description="Transplant PSHuman face into UniRig GLB")
+     parser.add_argument("--body", required=True, help="Rigged UniRig GLB")
+     parser.add_argument("--face", required=True, help="PSHuman mesh (OBJ/GLB/PLY)")
+     parser.add_argument("--output", required=True, help="Output GLB path")
+     parser.add_argument("--head-bones", default="head,Head,skull,Skull",
+                         help="Comma-separated bone name substrings for head detection")
+     parser.add_argument("--weight-threshold", type=float, default=0.35,
+                         help="Minimum head-bone weight sum to classify a vert as face")
+     parser.add_argument("--retract", type=float, default=0.004,
+                         help="Metres to retract UniRig face verts inward (default 0.004)")
+     parser.add_argument("--knn", type=int, default=5,
+                         help="K nearest neighbours for weight transfer")
+     args = parser.parse_args()
+
+     subs = tuple(s.strip() for s in args.head_bones.split(","))
+     transplant_face(
+         body_glb_path=args.body,
+         pshuman_mesh_path=args.face,
+         output_path=args.output,
+         head_bone_substrings=subs,
+         weight_threshold=args.weight_threshold,
+         retract_amount=args.retract,
+         knn=args.knn,
+     )
+
+
+ if __name__ == "__main__":
+     main()
pipeline/head_replace.py ADDED
@@ -0,0 +1,762 @@
+ """
+ head_replace.py — Replace the TripoSG head with a DECA-reconstructed head at mesh level.
+
+ Requires: trimesh, numpy, scipy, cv2, torch (+ face-alignment via DECA deps)
+ Optional: pymeshlab (for mesh clean-up)
+
+ Usage (standalone):
+     python head_replace.py --body /tmp/triposg_textured.glb \
+                            --face /path/to/face.jpg \
+                            --out /tmp/head_replaced.glb
+
+ Produces a combined GLB with DECA head geometry + TripoSG body.
+ """
+
+ import os, sys, argparse, warnings
+ warnings.filterwarnings('ignore')
+
+ import numpy as np
+ import cv2
+ from PIL import Image
+
+ # ──────────────────────────────────────────────────────────────────
+ # Patch DECA before importing it to avoid the pytorch3d dependency
+ # ──────────────────────────────────────────────────────────────────
+ DECA_ROOT = '/root/DECA'
+ sys.path.insert(0, DECA_ROOT)
+
+ # Stub out the rasterizer so DECA doesn't try to import pytorch3d
+ import types
+ _fake_renderer = types.ModuleType('decalib.utils.renderer')
+ _fake_renderer.set_rasterizer = lambda t='pytorch3d': None
+
+ class _FakeRender:
+     """No-op renderer — we only need the mesh, not rendered images."""
+     def __init__(self, *a, **kw): pass
+     def to(self, *a, **kw): return self
+     def __call__(self, *a, **kw): return {'images': None, 'alpha_images': None,
+                                           'normal_images': None, 'grid': None,
+                                           'transformed_normals': None, 'normals': None}
+     def render_shape(self, *a, **kw): return None, None, None, None
+     def world2uv(self, *a, **kw): return None
+     def add_SHlight(self, *a, **kw): return None
+
+ _fake_renderer.SRenderY = _FakeRender
+ sys.modules['decalib.utils.renderer'] = _fake_renderer
+
+ # Patch deca.py: fall back to the fake renderer when the real one is unavailable
+ from decalib import deca as _deca_mod
+ _orig_setup = _deca_mod.DECA._setup_renderer
+
+ def _patched_setup(self, model_cfg):
+     try:
+         _orig_setup(self, model_cfg)
+     except Exception as e:
+         print(f'[head_replace] Renderer disabled ({e})')
+         self.render = _FakeRender()
+         # Still load the mask / displacement data we need for UV baking
+         from skimage.io import imread
+         import torch, torch.nn.functional as F
+         try:
+             mask = imread(model_cfg.face_eye_mask_path).astype(np.float32) / 255.
+             mask = torch.from_numpy(mask[:, :, 0])[None, None, :, :].contiguous()
+             self.uv_face_eye_mask = F.interpolate(mask, [model_cfg.uv_size, model_cfg.uv_size])
+             mask2 = imread(model_cfg.face_mask_path).astype(np.float32) / 255.
+             mask2 = torch.from_numpy(mask2[:, :, 0])[None, None, :, :].contiguous()
+             self.uv_face_mask = F.interpolate(mask2, [model_cfg.uv_size, model_cfg.uv_size])
+         except Exception:
+             pass
+         try:
+             fixed_dis = np.load(model_cfg.fixed_displacement_path)
+             self.fixed_uv_dis = torch.tensor(fixed_dis).float()
+         except Exception:
+             pass
+         try:
+             mean_tex_np = imread(model_cfg.mean_tex_path).astype(np.float32) / 255.
+             mean_tex = torch.from_numpy(mean_tex_np.transpose(2, 0, 1))[None]
+             self.mean_texture = F.interpolate(mean_tex, [model_cfg.uv_size, model_cfg.uv_size])
+         except Exception:
+             pass
+         try:
+             self.dense_template = np.load(model_cfg.dense_template_path,
+                                           allow_pickle=True, encoding='latin1').item()
+         except Exception:
+             pass
+
+ _deca_mod.DECA._setup_renderer = _patched_setup
+
+
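The pytorch3d workaround hinges on one standard trick: registering a `types.ModuleType` stub in `sys.modules` before anything imports the real name, so later `import` statements resolve to the stub. A minimal, self-contained sketch with a hypothetical module name:

```python
import sys
import types

# Create a fake module and register it under a name nothing has imported yet.
# "heavy_rasteriser" is hypothetical, standing in for decalib.utils.renderer.
fake = types.ModuleType("heavy_rasteriser")
fake.set_rasterizer = lambda backend="pytorch3d": None  # no-op replacement
sys.modules["heavy_rasteriser"] = fake

import heavy_rasteriser  # resolves to the stub; no real package is needed
result = heavy_rasteriser.set_rasterizer()
```

The ordering is the whole point: the `sys.modules` assignment must run before DECA's own `from decalib.utils.renderer import ...`, which is why the patch block sits at the top of the module.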
+ # ──────────────────────────────────────────────────────────────────
+ # FLAME mesh: parse head_template.obj for the UV map
+ # ──────────────────────────────────────────────────────────────────
+ def _load_flame_template(obj_path=os.path.join(DECA_ROOT, 'data', 'head_template.obj')):
+     """Return (verts, faces, uv_verts, uv_faces) from head_template.obj."""
+     verts, uv_verts = [], []
+     faces_v, faces_uv = [], []
+     with open(obj_path) as fh:
+         for line in fh:
+             t = line.split()
+             if not t:
+                 continue
+             if t[0] == 'v':
+                 verts.append([float(t[1]), float(t[2]), float(t[3])])
+             elif t[0] == 'vt':
+                 uv_verts.append([float(t[1]), float(t[2])])
+             elif t[0] == 'f':
+                 vi, uvi = [], []
+                 for tok in t[1:]:
+                     parts = tok.split('/')
+                     vi.append(int(parts[0]) - 1)
+                     uvi.append(int(parts[1]) - 1 if len(parts) > 1 and parts[1] else 0)
+                 if len(vi) == 3:
+                     faces_v.append(vi)
+                     faces_uv.append(uvi)
+     return (np.array(verts, dtype=np.float32),
+             np.array(faces_v, dtype=np.int32),
+             np.array(uv_verts, dtype=np.float32),
+             np.array(faces_uv, dtype=np.int32))
+
+
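OBJ face tokens come in four shapes (`v`, `v/vt`, `v//vn`, `v/vt/vn`) with 1-based indices, and the loader above converts them to 0-based vertex and UV indices. The token logic in isolation (`parse_face_token` is an illustrative helper name):

```python
def parse_face_token(tok: str):
    # OBJ face corner token: v, v/vt, v//vn or v/vt/vn, all indices 1-based.
    parts = tok.split('/')
    vi = int(parts[0]) - 1
    # Missing or empty vt slot (the v and v//vn forms) falls back to UV index 0,
    # matching the behaviour of _load_flame_template.
    uvi = int(parts[1]) - 1 if len(parts) > 1 and parts[1] else 0
    return vi, uvi

a = parse_face_token("5/3")   # → (4, 2)
b = parse_face_token("7")     # → (6, 0)
c = parse_face_token("2//4")  # → (1, 0): empty vt between the slashes
```

The fallback to UV 0 is lossy but harmless for head_template.obj, where every face carries explicit `vt` indices.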
119
+ # ──────────────────────────────────────────────────────────────────
120
+ # UV texture baking (software rasteriser, no pytorch3d needed)
121
+ # ──────────────────────────────────────────────────────────────────
122
+ def _bake_uv_texture(verts3d, faces_v, uv_verts, faces_uv, cam, face_img_bgr, tex_size=256):
123
+ """
124
+ Project face_img_bgr onto the FLAME UV map using orthographic camera.
125
+ verts3d : (N,3) FLAME vertices in world space
126
+ cam : (3,) = [scale, tx, ty] orthographic camera
127
+ Returns : (tex_size, tex_size, 3) uint8 texture (BGR)
128
+ """
129
+ H, W = face_img_bgr.shape[:2]
130
+     scale, tx, ty = float(cam[0]), float(cam[1]), float(cam[2])
+
+     # Orthographic project: DECA formula = (vert_2D + [tx,ty]) * scale, then flip y
+     proj = np.zeros((len(verts3d), 2), dtype=np.float32)
+     proj[:, 0] = (verts3d[:, 0] + tx) * scale
+     proj[:, 1] = -((verts3d[:, 1] + ty) * scale)  # y-flip matches DECA convention
+
+     # Map to pixel coords: image spans proj ∈ [-1,1] → pixel [0, WH]
+     img_pts = (proj + 1.0) * 0.5 * np.array([W, H], dtype=np.float32)  # (N, 2)
+
+     # UV pixel coords
+     uv_px = uv_verts * tex_size  # (K, 2)
+
+     # Vectorised rasteriser in UV space:
+     # for each face, scatter samples from img_pts into uv_px coords,
+     # using scipy.interpolate.griddata as a fast splat.
+     from scipy.interpolate import griddata
+
+     # Front-facing mask (z > threshold) — only bake visible faces
+     z_face = verts3d[faces_v, 2].mean(axis=1)  # (M,) mean z per face
+     front_mask = z_face >= -0.02  # keep front and side faces
+
+     # For each face corner, record a (uv_px, img_pts) sample
+     corners_uv = uv_px[faces_uv[front_mask]]    # (K, 3, 2)
+     corners_img = img_pts[faces_v[front_mask]]  # (K, 3, 2)
+
+     # Flatten to (K*3, 2)
+     src_uv = corners_uv.reshape(-1, 2)    # UV pixel destination
+     src_img = corners_img.reshape(-1, 2)  # image pixel source
+
+     # Remove out-of-bounds image samples
+     valid = ((src_img[:, 0] >= 0) & (src_img[:, 0] < W) &
+              (src_img[:, 1] >= 0) & (src_img[:, 1] < H))
+     src_uv = src_uv[valid]
+     src_img = src_img[valid]
+
+     # Sample the face image at src_img positions
+     ix = np.clip(src_img[:, 0].astype(int), 0, W - 1)
+     iy = np.clip(src_img[:, 1].astype(int), 0, H - 1)
+     colours = face_img_bgr[iy, ix].astype(np.float32)  # (P, 3)
+
+     # Clip UV destinations to texture bounds
+     uv_dest = np.clip(src_uv, 0, tex_size - 1 - 1e-6).astype(np.float32)
+
+     # Build query grid for griddata interpolation
+     grid_u, grid_v = np.meshgrid(np.arange(tex_size), np.arange(tex_size))
+     grid_pts = np.column_stack([grid_u.ravel(), grid_v.ravel()])
+
+     # Interpolate each colour channel
+     tex_baked = np.zeros((tex_size * tex_size, 3), dtype=np.float32)
+     for ch in range(3):
+         ch_vals = griddata(uv_dest, colours[:, ch], grid_pts,
+                            method='linear', fill_value=np.nan)
+         tex_baked[:, ch] = ch_vals
+     tex_baked = tex_baked.reshape(tex_size, tex_size, 3)
+     face_baked_mask = ~np.isnan(tex_baked[:, :, 0])
+
+     # Base texture: mean_texture (skin-tone fallback for unsampled regions)
+     mean_tex_path = os.path.join(DECA_ROOT, 'data', 'mean_texture.jpg')
+     if os.path.exists(mean_tex_path):
+         mt = cv2.resize(cv2.imread(mean_tex_path), (tex_size, tex_size)).astype(np.float32)
+     else:
+         mt = np.full((tex_size, tex_size, 3), 180.0, dtype=np.float32)
+
+     # Blend: baked face over mean texture
+     result = mt.copy()
+     result[face_baked_mask] = np.clip(tex_baked[face_baked_mask], 0, 255)
+     return result.astype(np.uint8)
+
+
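The weak-perspective projection above is compact enough to sanity-check in isolation. A minimal standalone sketch of the same NDC-to-pixel mapping (the camera values here are made up for illustration, not actual DECA output):

```python
import numpy as np

def ortho_project(verts3d, cam, W, H):
    """Weak-perspective projection: (coord + translation) * scale, y flipped,
    then NDC [-1, 1] mapped to pixel coordinates [0, W] x [0, H]."""
    scale, tx, ty = float(cam[0]), float(cam[1]), float(cam[2])
    proj = np.empty((len(verts3d), 2), dtype=np.float32)
    proj[:, 0] = (verts3d[:, 0] + tx) * scale
    proj[:, 1] = -((verts3d[:, 1] + ty) * scale)  # y-down image convention
    return (proj + 1.0) * 0.5 * np.array([W, H], dtype=np.float32)

# A vertex at the camera centre lands in the middle of the image.
pts = ortho_project(np.zeros((1, 3)), cam=[1.0, 0.0, 0.0], W=224, H=224)
```

With `scale=1` and zero translation, the origin maps to the image centre, which is a quick way to catch sign or flip mistakes.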
+ # ──────────────────────────────────────────────────────────────────
+ # DECA inference
+ # ──────────────────────────────────────────────────────────────────
+ def run_deca(face_img_path, device='cuda'):
+     """
+     Run DECA on face_img_path.
+     Returns (verts_np, cam_np, faces_v, uv_verts, faces_uv, tex_img_bgr)
+     """
+     import torch
+     from decalib.deca import DECA
+     from decalib.utils import config as cfg_module
+     from decalib.datasets import datasets
+
+     cfg = cfg_module.get_cfg_defaults()
+     cfg.model.use_tex = False
+
+     print('[DECA] Loading model...')
+     deca = DECA(config=cfg, device=device)
+     deca.eval()
+
+     print('[DECA] Preprocessing image...')
+     testdata = datasets.TestData(face_img_path)
+     img_tensor = testdata[0]['image'].to(device)[None, ...]
+
+     print('[DECA] Encoding...')
+     with torch.no_grad():
+         codedict = deca.encode(img_tensor, use_detail=False)
+         verts, _, _ = deca.flame(
+             shape_params=codedict['shape'],
+             expression_params=codedict['exp'],
+             pose_params=codedict['pose']
+         )
+
+     verts_np = verts[0].cpu().numpy()          # (5023, 3)
+     cam_np = codedict['cam'][0].cpu().numpy()  # (3,)
+     print(f'[DECA] Mesh: {verts_np.shape}, cam={cam_np}')
+
+     # Load FLAME UV map
+     _, faces_v, uv_verts, faces_uv = _load_flame_template()
+
+     # Get the face image for texture baking (use the cropped/aligned 224x224)
+     img_np = (img_tensor[0].cpu().numpy().transpose(1, 2, 0) * 255).astype(np.uint8)
+     img_bgr = cv2.cvtColor(img_np, cv2.COLOR_RGB2BGR)
+
+     print('[DECA] Baking UV texture...')
+     tex_bgr = _bake_uv_texture(verts_np, faces_v, uv_verts, faces_uv, cam_np, img_bgr, tex_size=256)
+
+     return verts_np, cam_np, faces_v, uv_verts, faces_uv, tex_bgr
+
+
+ # ──────────────────────────────────────────────────────────────────
+ # Mesh helpers
+ # ──────────────────────────────────────────────────────────────────
+ def _find_neck_height(mesh):
+     """
+     Find the best neck cut height in a body mesh.
+     Strategy: scan a band near the top of the mesh and find the local
+     minimum of the cross-section's 10th-percentile radius (the neck is
+     narrower than the head).
+     Returns the y-value of the cut plane.
+     """
+     verts = mesh.vertices
+     y_min, y_max = verts[:, 1].min(), verts[:, 1].max()
+     y_range = y_max - y_min
+
+     # Scan [80%, 87%] to find the neck-base narrowing below the face.
+     # The range [83%, 91%] was picking the crown taper instead of the neck.
+     y_start = y_min + y_range * 0.80
+     y_end = y_min + y_range * 0.87
+     steps = 20
+     ys = np.linspace(y_start, y_end, steps)
+     band = y_range * 0.015
+
+     r10_vals = []
+     for y in ys:
+         pts = verts[(verts[:, 1] >= y - band) & (verts[:, 1] <= y + band)]
+         if len(pts) < 6:
+             r10_vals.append(1.0)
+             continue
+         xz = pts[:, [0, 2]]
+         cx, cz = xz.mean(0)
+         radii = np.sqrt((xz[:, 0] - cx)**2 + (xz[:, 1] - cz)**2)
+         r10_vals.append(float(np.percentile(radii, 10)))
+
+     from scipy.ndimage import uniform_filter1d
+     r10 = uniform_filter1d(np.array(r10_vals), size=3)
+     neck_idx = int(np.argmin(r10[2:-2])) + 2
+     neck_y = float(ys[neck_idx])
+     frac = (neck_y - y_min) / y_range
+     print(f'[neck] Cut height: {neck_y:.4f} (y_range {y_min:.3f}–{y_max:.3f}, frac={frac:.2f})')
+     return neck_y
+
+
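The selection rule — smooth the radius profile, then take the argmin away from the band edges — can be exercised on a synthetic profile. A sketch with made-up values and a plain 3-tap moving average standing in for `scipy.ndimage.uniform_filter1d`:

```python
import numpy as np

# Synthetic 10th-percentile-radius profile along the scan band: a Gaussian
# "neck dip" centred at index 12 (hypothetical values, not a real mesh).
ys = np.linspace(0.80, 0.87, 20)
r10 = 0.3 - 0.2 * np.exp(-0.5 * ((np.arange(20) - 12) / 2.0) ** 2)

# 3-tap moving average, then argmin excluding two samples at each edge —
# the same selection rule used above.
smooth = np.convolve(r10, np.ones(3) / 3, mode='same')
neck_idx = int(np.argmin(smooth[2:-2])) + 2
neck_y = float(ys[neck_idx])
```

Excluding the edge samples matters here: `mode='same'` zero-pads, so the two outermost smoothed values are artificially low.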
+ def _weld_mesh(mesh):
+     """
+     Merge duplicate vertices (UV-split mesh → geometric mesh).
+     Returns a new trimesh with welded vertices.
+     """
+     import trimesh
+     from scipy.spatial import cKDTree
+     verts = mesh.vertices
+     tree = cKDTree(verts)
+     # Build mapping: each vertex → canonical representative (naive union-find)
+     N = len(verts)
+     mapping = np.arange(N, dtype=np.int64)
+     pairs = tree.query_pairs(r=1e-5)
+     for a, b in pairs:
+         root_a = int(mapping[a])
+         root_b = int(mapping[b])
+         while mapping[root_a] != root_a:
+             root_a = int(mapping[root_a])
+         while mapping[root_b] != root_b:
+             root_b = int(mapping[root_b])
+         if root_a != root_b:
+             mapping[root_b] = root_a
+     # Flatten chains
+     for i in range(N):
+         root = int(mapping[i])
+         while mapping[root] != root:
+             root = int(mapping[root])
+         mapping[i] = root
+     # Compact the mapping
+     unique_ids = np.unique(mapping)
+     compact = np.full(N, -1, dtype=np.int64)
+     compact[unique_ids] = np.arange(len(unique_ids))
+     new_faces = compact[mapping[mesh.faces]]
+     new_verts = verts[unique_ids]
+     return trimesh.Trimesh(vertices=new_verts, faces=new_faces, process=False)
+
+
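The weld is a union-find over proximity pairs followed by reindexing. A self-contained toy version of the same idea, using a brute-force O(N²) pair search instead of the cKDTree (toy triangles, not real mesh data):

```python
import numpy as np

def weld(verts, faces, tol=1e-5):
    """Merge vertices closer than tol via naive union-find, reindex faces."""
    N = len(verts)
    mapping = np.arange(N)
    for a in range(N):
        for b in range(a + 1, N):
            if np.linalg.norm(verts[a] - verts[b]) < tol:
                ra, rb = mapping[a], mapping[b]
                while mapping[ra] != ra:
                    ra = mapping[ra]
                while mapping[rb] != rb:
                    rb = mapping[rb]
                if ra != rb:
                    mapping[rb] = ra   # union the two roots
    for i in range(N):                 # flatten chains
        r = mapping[i]
        while mapping[r] != r:
            r = mapping[r]
        mapping[i] = r
    uniq = np.unique(mapping)
    compact = np.full(N, -1)
    compact[uniq] = np.arange(len(uniq))
    return verts[uniq], compact[mapping[faces]]

# Two triangles sharing an edge, stored UV-split: 6 verts, 2 of them duplicated.
v = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0],
              [1, 0, 0], [0, 1, 0], [1, 1, 0]], float)
f = np.array([[0, 1, 2], [3, 5, 4]])
wv, wf = weld(v, f)
```

The welded result has 4 geometric vertices, and both faces reference the shared edge through the same indices.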
+ def _cut_mesh_below(mesh, y_cut):
+     """Keep only faces where all vertices are at or below y_cut. Preserves UV/texture."""
+     import trimesh
+     from trimesh.visual.texture import TextureVisuals
+     v_mask = mesh.vertices[:, 1] <= y_cut
+     f_keep = np.all(v_mask[mesh.faces], axis=1)
+     faces_kept = mesh.faces[f_keep]
+     used_verts = np.unique(faces_kept)
+     old_to_new = np.full(len(mesh.vertices), -1, dtype=np.int64)
+     old_to_new[used_verts] = np.arange(len(used_verts))
+     new_faces = old_to_new[faces_kept]
+     new_verts = mesh.vertices[used_verts]
+     new_mesh = trimesh.Trimesh(vertices=new_verts, faces=new_faces, process=False)
+     # Preserve UV + texture if present
+     if hasattr(mesh.visual, 'uv') and mesh.visual.uv is not None:
+         new_mesh.visual = TextureVisuals(
+             uv=mesh.visual.uv[used_verts],
+             material=mesh.visual.material)
+     return new_mesh
+
+
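The face-masking and reindexing pattern used above is worth seeing on minimal data. A numpy-only sketch (toy strip of two triangles, hypothetical cut height):

```python
import numpy as np

def cut_below(verts, faces, y_cut):
    """Keep faces whose three vertices all satisfy y <= y_cut, then drop
    unused vertices and reindex the faces — the same masking as above."""
    v_mask = verts[:, 1] <= y_cut
    faces_kept = faces[np.all(v_mask[faces], axis=1)]
    used = np.unique(faces_kept)
    remap = np.full(len(verts), -1, dtype=np.int64)
    remap[used] = np.arange(len(used))
    return verts[used], remap[faces_kept]

# A strip of two triangles; only the lower one survives a cut at y = 0.6.
v = np.array([[0, 0, 0], [1, 0, 0], [0, 0.5, 0], [0, 1, 0]], float)
f = np.array([[0, 1, 2], [1, 3, 2]])
nv, nf = cut_below(v, f, 0.6)
```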
+ def _extract_neck_ring_geometric(mesh, neck_y, n_pts=64, band_frac=0.02):
+     """
+     Extract a neck ring using topological boundary edges near neck_y.
+     Falls back to angle-sorted vertices if the topology is non-manifold.
+     Works on welded (geometric) meshes.
+     """
+     verts = mesh.vertices
+     y_range = verts[:, 1].max() - verts[:, 1].min()
+     band = y_range * band_frac
+
+     # --- Try topological boundary near neck_y first ---
+     edges = np.sort(mesh.edges, axis=1)
+     u, c2 = np.unique(edges, axis=0, return_counts=True)
+     be = u[c2 == 1]  # boundary edges
+
+     # Keep boundary edges where BOTH endpoints are near neck_y
+     v_near = np.abs(verts[:, 1] - neck_y) <= band * 2
+     neck_be = be[v_near[be[:, 0]] & v_near[be[:, 1]]]
+
+     if len(neck_be) >= 8:
+         # Build adjacency and walk each loop
+         adj = {}
+         for e in neck_be:
+             adj.setdefault(int(e[0]), []).append(int(e[1]))
+             adj.setdefault(int(e[1]), []).append(int(e[0]))
+         # Find the largest connected loop
+         visited = set()
+         loops = []
+         for start in adj:
+             if start in visited:
+                 continue
+             loop = [start]
+             visited.add(start)
+             prev, cur = -1, start
+             for _ in range(len(neck_be) + 1):
+                 nbrs = [v for v in adj.get(cur, []) if v != prev]
+                 if not nbrs:
+                     break
+                 nxt = nbrs[0]
+                 if nxt == start or nxt in visited:
+                     break
+                 visited.add(nxt)
+                 prev, cur = cur, nxt
+                 loop.append(cur)
+             loops.append(loop)
+         if loops:
+             best = max(loops, key=len)
+             if len(best) >= 8:
+                 # Snap all ring points to neck_y (smooth the cut plane)
+                 ring_pts = verts[best].copy()
+                 ring_pts[:, 1] = neck_y
+                 return _resample_loop(ring_pts, n_pts)
+
+     # --- Fallback: use inner-cluster (neck column) vertices in the band ---
+     mask = (verts[:, 1] >= neck_y - band) & (verts[:, 1] <= neck_y + band)
+     pts = verts[mask]
+     if len(pts) < 8:
+         raise ValueError(f'Too few vertices near neck_y={neck_y:.4f}: {len(pts)}')
+
+     # Keep only inner-ring vertices (below the 35th-percentile radius from the
+     # centroid). This excludes the outer face/head surface and keeps only the
+     # neck column.
+     xz = pts[:, [0, 2]]
+     cx, cz = xz.mean(0)
+     radii = np.sqrt((xz[:, 0] - cx)**2 + (xz[:, 1] - cz)**2)
+     thresh = np.percentile(radii, 35)
+     inner_mask = radii <= thresh
+     if inner_mask.sum() >= 8:
+         pts = pts[inner_mask]
+         # Recompute the centroid on inner pts
+         cx, cz = pts[:, [0, 2]].mean(0)
+
+     # Sort by angle in the XZ plane
+     angles = np.arctan2(pts[:, 2] - cz, pts[:, 0] - cx)
+     pts_sorted = pts[np.argsort(angles)].copy()
+     pts_sorted[:, 1] = neck_y  # snap to cut plane
+     return _resample_loop(pts_sorted, n_pts)
+
+
+ def _extract_boundary_loop(mesh):
+     """
+     Extract the boundary edge loop (ordered) from a welded mesh.
+     Returns (N, 3) ordered vertex positions.
+     """
+     # Find boundary edges (edges used by exactly one face)
+     edges = np.sort(mesh.edges, axis=1)
+     unique, counts = np.unique(edges, axis=0, return_counts=True)
+     boundary_edges = unique[counts == 1]
+
+     if len(boundary_edges) == 0:
+         raise ValueError('No boundary edges found — mesh may be closed')
+
+     # Build adjacency for boundary edges
+     adj = {}
+     for e in boundary_edges:
+         adj.setdefault(int(e[0]), []).append(int(e[1]))
+         adj.setdefault(int(e[1]), []).append(int(e[0]))
+
+     # Walk every connected loop, then keep the longest
+     visited = set()
+     loops = []
+     for start_v in adj:
+         if start_v in visited:
+             continue
+         loop = [start_v]
+         visited.add(start_v)
+         prev = -1
+         cur = start_v
+         for _ in range(len(boundary_edges) + 1):
+             nbrs = [v for v in adj.get(cur, []) if v != prev]
+             if not nbrs:
+                 break
+             nxt = nbrs[0]
+             if nxt == start_v:
+                 break
+             if nxt in visited:
+                 break
+             visited.add(nxt)
+             prev = cur
+             cur = nxt
+             loop.append(cur)
+         loops.append(loop)
+
+     # Use the longest loop
+     best = max(loops, key=len)
+     return mesh.vertices[best]
+
+
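The "edge used by exactly one face" test that drives both ring extractors is easy to verify on a toy mesh. A numpy-only sketch (two triangles sharing one edge):

```python
import numpy as np

def boundary_edges(faces):
    """Edges referenced by exactly one face — the openness test used above."""
    e = np.concatenate([faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]])
    e = np.sort(e, axis=1)
    uniq, counts = np.unique(e, axis=0, return_counts=True)
    return uniq[counts == 1]

# Two triangles sharing edge (1, 2): the shared edge is interior,
# the four outer edges form the boundary.
f = np.array([[0, 1, 2], [1, 3, 2]])
be = boundary_edges(f)
```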
+ def _resample_loop(loop_pts, N):
+     """Resample an ordered set of 3D points to exactly N evenly-spaced points."""
+     from scipy.interpolate import interp1d
+     # Arc-length parameterisation over the closed loop: t[i] is the cumulative
+     # length up to loop_closed[i], so parameter and point arrays stay aligned.
+     loop_closed = np.vstack([loop_pts, loop_pts[0]])
+     seg_lens = np.linalg.norm(np.diff(loop_closed, axis=0), axis=1)
+     t = np.insert(np.cumsum(seg_lens), 0, 0.0)
+     t /= t[-1]
+     interp = interp1d(t, loop_closed, axis=0)
+     t_new = np.linspace(0, 1, N, endpoint=False)
+     return interp(t_new)
+
+
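Arc-length resampling of a closed loop can be sketched without scipy, using per-axis `np.interp` in place of `interp1d` (toy square loop, illustrative only):

```python
import numpy as np

def resample_loop(pts, n):
    """Resample a closed polyline to n points evenly spaced by arc length."""
    closed = np.vstack([pts, pts[0]])
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(seg)])
    t /= t[-1]
    t_new = np.linspace(0.0, 1.0, n, endpoint=False)
    return np.column_stack([np.interp(t_new, t, closed[:, k])
                            for k in range(pts.shape[1])])

# Resample a unit square (4 corners) to 8 points: corners plus edge midpoints.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
loop8 = resample_loop(square, 8)
```

Point 1 of the output lands at the midpoint of the first edge, confirming the spacing is uniform in arc length rather than in index.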
+ def _bridge_loops(loop_a, loop_b):
+     """
+     Create a triangle strip bridging two ordered loops of equal length N.
+     loop_a, loop_b: (N, 3) vertex positions
+     Returns (verts, faces) — just the bridge strip as a trimesh-ready array.
+     """
+     N = len(loop_a)
+     verts = np.vstack([loop_a, loop_b])  # (2N, 3) — a: 0..N-1, b: N..2N-1
+     faces = []
+     for i in range(N):
+         j = (i + 1) % N
+         ai, aj = i, j
+         bi, bj = i + N, j + N
+         faces.append([ai, aj, bi])
+         faces.append([aj, bj, bi])
+     return verts, np.array(faces, dtype=np.int32)
+
+
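Each quad between corresponding ring segments is split into two triangles, so bridging two N-point rings always yields 2N vertices and 2N faces. A self-contained check (toy 4-point rings):

```python
import numpy as np

def bridge_loops(loop_a, loop_b):
    """Triangle strip between two equal-length rings — same quad split as above."""
    n = len(loop_a)
    verts = np.vstack([loop_a, loop_b])
    faces = []
    for i in range(n):
        j = (i + 1) % n
        faces.append([i, j, i + n])      # lower triangle of the quad
        faces.append([j, j + n, i + n])  # upper triangle
    return verts, np.array(faces, dtype=np.int32)

# Bridging two 4-point rings yields 8 vertices and 8 triangles (2 per quad).
ring = np.array([[1, 0, 0], [0, 0, 1], [-1, 0, 0], [0, 0, -1]], float)
bv, bf = bridge_loops(ring, ring + [0, 1, 0])
```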
+ # ──────────────────────────────────────────────────────────────────
+ # DECA head → trimesh
+ # ──────────────────────────────────────────────────────────────────
+ def deca_to_trimesh(verts_np, faces_v, uv_verts, faces_uv, tex_bgr):
+     """
+     Assemble a trimesh.Trimesh from DECA outputs with UV texture.
+     Uses per-vertex UV (averaged over face corners sharing each vertex).
+     """
+     import trimesh
+     from trimesh.visual.texture import TextureVisuals
+     from trimesh.visual.material import PBRMaterial
+
+     # Average face-corner UVs per vertex
+     N = len(verts_np)
+     uv_sum = np.zeros((N, 2), dtype=np.float64)
+     uv_cnt = np.zeros(N, dtype=np.int32)
+     for fi in range(len(faces_v)):
+         for ci in range(3):
+             vi = faces_v[fi, ci]
+             uvi = faces_uv[fi, ci]
+             uv_sum[vi] += uv_verts[uvi]
+             uv_cnt[vi] += 1
+     uv_cnt = np.maximum(uv_cnt, 1)
+     uv_per_vert = (uv_sum / uv_cnt[:, None]).astype(np.float32)
+
+     mesh = trimesh.Trimesh(vertices=verts_np, faces=faces_v, process=False)
+
+     tex_rgb = cv2.cvtColor(tex_bgr, cv2.COLOR_BGR2RGB)
+     tex_pil = Image.fromarray(tex_rgb)
+
+     try:
+         mat = PBRMaterial(baseColorTexture=tex_pil)
+         mesh.visual = TextureVisuals(uv=uv_per_vert, material=mat)
+         print(f'[deca_to_trimesh] UV attached: {uv_per_vert.shape}, tex={tex_rgb.shape}')
+     except Exception as e:
+         print(f'[deca_to_trimesh] UV attach failed ({e}) — using vertex colours')
+         mesh.visual.vertex_colors = np.tile([200, 175, 155, 255], (len(verts_np), 1))
+
+     return mesh
+
+
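The corner-averaging double loop above can be expressed with `np.add.at`, which scatter-adds repeated indices correctly; a vectorised sketch on toy data (two faces, four vertices — not FLAME's real tables):

```python
import numpy as np

# Scatter-add each face corner's UV into its vertex slot, then divide by
# the per-vertex corner count — same result as the explicit loop.
faces_v = np.array([[0, 1, 2], [0, 2, 3]])   # two faces over 4 vertices
faces_uv = np.array([[0, 1, 2], [0, 2, 3]])  # per-corner UV indices
uv_verts = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)

uv_sum = np.zeros((4, 2))
uv_cnt = np.zeros(4)
np.add.at(uv_sum, faces_v.ravel(), uv_verts[faces_uv.ravel()])
np.add.at(uv_cnt, faces_v.ravel(), 1)
uv_per_vert = uv_sum / np.maximum(uv_cnt, 1)[:, None]
```

In this toy layout each vertex always sees the same UV index, so the averaged result equals `uv_verts` exactly; on a UV-split mesh the average blends the per-corner values.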
+ # ──────────────────────────────────────────────────────────────────
+ # Main head-replacement function
+ # ──────────────────────────────────────────────────────────────────
+ def replace_head(body_glb: str, face_img_path: str, out_glb: str,
+                  device: str = 'cuda', bridge_n: int = 64):
+     """
+     Main entry point.
+     body_glb      : path to TripoSG textured GLB
+     face_img_path : path to reference face image
+     out_glb       : output path for combined GLB
+     bridge_n      : number of vertices in the stitching ring
+     """
+     import trimesh
+
+     # ── 1. Load body GLB ──────────────────────────────────────────
+     print('[replace_head] Loading body GLB...')
+     scene = trimesh.load(body_glb)
+     if isinstance(scene, trimesh.Scene):
+         body_mesh = trimesh.util.concatenate(
+             [g for g in scene.geometry.values() if isinstance(g, trimesh.Trimesh)]
+         )
+     else:
+         body_mesh = scene
+
+     print(f'  Body: {len(body_mesh.vertices)} verts, {len(body_mesh.faces)} faces')
+
+     # ── 1b. Weld body mesh (UV-split → geometric) ─────────────────
+     print('[replace_head] Welding mesh vertices...')
+     body_welded = _weld_mesh(body_mesh)
+     print(f'  Welded: {len(body_welded.vertices)} verts (was {len(body_mesh.vertices)})')
+
+     # ── 2. Find neck cut height ───────────────────────────────────
+     neck_y = _find_neck_height(body_welded)
+
+     # ── 3. Cut body at neck ───────────────────────────────────────
+     print('[replace_head] Cutting body at neck...')
+     # Work on the welded mesh for topology; keep the original mesh for geometry export
+     body_lower_welded = _cut_mesh_below(body_welded, neck_y)
+     body_lower = _cut_mesh_below(body_mesh, neck_y)  # keeps original UV/texture
+     print(f'  Body lower: {len(body_lower.vertices)} verts')
+
+     # Extract the neck ring geometrically (robust for non-manifold UV-split meshes)
+     body_neck_loop = _extract_neck_ring_geometric(body_welded, neck_y, n_pts=bridge_n)
+     print(f'  Body neck ring: {len(body_neck_loop)} pts (geometric)')
+
+     # ── 4. Run DECA ───────────────────────────────────────────────
+     print('[replace_head] Running DECA...')
+     verts_np, cam_np, faces_v, uv_verts, faces_uv, tex_bgr = run_deca(face_img_path, device=device)
+
+     # ── 5. Align DECA head to body coordinate system ─────────────
+     # TripoSG body is roughly in [-1,1] world space (y-up).
+     # DECA/FLAME space: head centered around origin, scale ≈ 1.5-2.5 units for a full head.
+     # We need to:
+     #   a) scale the FLAME head to match body scale,
+     #   b) position the FLAME head so its neck base aligns with the body neck ring.
+
+     # Get the bottom of the FLAME head (neck area).
+     # FLAME template: the bottom vertices are the neck boundary ring.
+     flame_mesh_tmp = trimesh.Trimesh(vertices=verts_np, faces=faces_v, process=False)
+     try:
+         flame_neck_loop = _extract_boundary_loop(flame_mesh_tmp)
+         print(f'  FLAME neck ring (topology): {len(flame_neck_loop)} verts')
+     except Exception as e:
+         print(f'  FLAME boundary loop failed ({e}), using geometric extraction')
+         # Geometric fallback: bottom 8% of head vertices
+         flame_neck_y = verts_np[:, 1].min() + (verts_np[:, 1].max() - verts_np[:, 1].min()) * 0.08
+         flame_neck_loop = _extract_neck_ring_geometric(flame_mesh_tmp, flame_neck_y, n_pts=bridge_n)
+         print(f'  FLAME neck ring (geometric): {len(flame_neck_loop)} pts')
+
+     # ── 5b. Compute head position using the NECK RING centroid ───────────
+     # Directly align the FLAME neck ring center → body neck ring center in all
+     # 3 axes. This is robust regardless of body pose or tilt.
+     body_neck_center = body_neck_loop.mean(axis=0)
+
+     # Estimate head height from the WELDED mesh crown (more reliable than the UV-split mesh)
+     welded_y_max = float(body_welded.vertices[:, 1].max())
+     body_head_height = welded_y_max - neck_y
+
+     flame_neck_center_unscaled = flame_neck_loop.mean(axis=0)
+     flame_y_min = verts_np[:, 1].min()
+     flame_y_max = verts_np[:, 1].max()
+     flame_head_height = flame_y_max - flame_y_min
+
+     print(f'  Body neck center: {body_neck_center.round(4)}')
+     print(f'  Body head space: {body_head_height:.4f} (neck_y={neck_y:.4f}, crown_y={welded_y_max:.4f})')
+     print(f'  FLAME head height (unscaled): {flame_head_height:.4f}')
+     print(f'  FLAME neck center (unscaled): {flame_neck_center_unscaled.round(4)}')
+
+     # Scale the FLAME head to match the body head height
+     if flame_head_height > 1e-5:
+         head_scale = body_head_height / flame_head_height
+     else:
+         head_scale = 1.0
+     print(f'  Head scale: {head_scale:.4f}')
+
+     # Translate: FLAME neck ring center → body neck ring center in XZ,
+     # FLAME mesh bottom (flame_y_min) → neck_y in Y.
+     # This ensures the head fills the full space from neck_y to the body crown.
+     translate = np.array([
+         body_neck_center[0] - flame_neck_center_unscaled[0] * head_scale,
+         neck_y - flame_y_min * head_scale,
+         body_neck_center[2] - flame_neck_center_unscaled[2] * head_scale,
+     ])
+     print(f'  Translate: {translate.round(4)}')
+     verts_aligned = verts_np * head_scale + translate
+     print(f'  FLAME aligned y={verts_aligned[:,1].min():.4f}→{verts_aligned[:,1].max():.4f}'
+           f' x={verts_aligned[:,0].min():.4f}→{verts_aligned[:,0].max():.4f}'
+           f' z={verts_aligned[:,2].min():.4f}→{verts_aligned[:,2].max():.4f}')
+
+     # Extract the FLAME neck loop after alignment (at the cut plane y=neck_y)
+     flame_verts_aligned = verts_aligned
+     flame_mesh_aligned = trimesh.Trimesh(
+         vertices=flame_verts_aligned, faces=faces_v, process=False)
+     try:
+         flame_neck_loop_aligned = _extract_boundary_loop(flame_mesh_aligned)
+         print(f'  FLAME neck ring (topology): {len(flame_neck_loop_aligned)} verts')
+     except Exception:
+         flame_neck_y_aligned = flame_verts_aligned[:, 1].min() + (
+             flame_verts_aligned[:, 1].max() - flame_verts_aligned[:, 1].min()) * 0.05
+         flame_neck_loop_aligned = _extract_neck_ring_geometric(
+             flame_mesh_aligned, flame_neck_y_aligned, n_pts=bridge_n)
+         print(f'  FLAME neck ring (geometric): {len(flame_neck_loop_aligned)} pts')
+
+     flame_neck_r = np.linalg.norm(flame_neck_loop_aligned - flame_neck_loop_aligned.mean(0), axis=1).mean()
+     body_neck_r = np.linalg.norm(body_neck_loop - body_neck_loop.mean(0), axis=1).mean()
+     print(f'  Body neck radius: {body_neck_r:.4f}  FLAME neck radius (scaled): {flame_neck_r:.4f}')
+
+     # ── 6. Resample both neck loops to bridge_n points ────────────
+     body_loop_r = _resample_loop(body_neck_loop, bridge_n)
+     flame_loop_r = _resample_loop(flame_neck_loop_aligned, bridge_n)
+
+     # Ensure the loops are oriented consistently (both CW or both CCW):
+     # compute the signed area to check orientation.
+     def _loop_orientation(loop):
+         c = loop.mean(0)
+         t = loop - c
+         cross = np.cross(t[:-1], t[1:])
+         return float(np.sum(cross[:, 1]))  # y-component
+
+     o_body = _loop_orientation(body_loop_r)
+     o_flame = _loop_orientation(flame_loop_r)
+     if (o_body > 0) != (o_flame > 0):
+         flame_loop_r = flame_loop_r[::-1]
+
+     # ── 7. Align loop starting points (minimise bridge twist) ─────
+     # Match starting vertices: find the flame-loop point closest to the body-loop start
+     dists = np.linalg.norm(flame_loop_r - body_loop_r[0], axis=1)
+     best_offset = int(np.argmin(dists))
+     flame_loop_r = np.roll(flame_loop_r, -best_offset, axis=0)
+
+     # ── 8. Build bridge strip ─────────────────────────────────────
+     bridge_verts, bridge_faces = _bridge_loops(body_loop_r, flame_loop_r)
+     bridge_mesh = trimesh.Trimesh(vertices=bridge_verts, faces=bridge_faces, process=False)
+
+     # ── 9. Combine: body_lower + bridge + FLAME head ──────────────
+     # Build the FLAME head mesh with texture
+     head_mesh = deca_to_trimesh(flame_verts_aligned, faces_v, uv_verts, faces_uv, tex_bgr)
+
+     # Combine all parts into a single-mesh fallback; per-mesh materials are
+     # kept by exporting as a scene below.
+     combined = trimesh.util.concatenate([body_lower, bridge_mesh, head_mesh])
+     combined = trimesh.Trimesh(
+         vertices=combined.vertices,
+         faces=combined.faces,
+         process=False
+     )
+
+     # ── 10. Export ────────────────────────────────────────────────
+     print(f'[replace_head] Exporting combined mesh: {len(combined.vertices)} verts...')
+     os.makedirs(os.path.dirname(out_glb) or '.', exist_ok=True)
+
+     # Export as a GLB scene with separate submeshes (preserves textures)
+     try:
+         scene_out = trimesh.Scene()
+         scene_out.add_geometry(body_lower, geom_name='body')
+         scene_out.add_geometry(bridge_mesh, geom_name='bridge')
+         scene_out.add_geometry(head_mesh, geom_name='head')
+         scene_out.export(out_glb)
+         print(f'[replace_head] Saved scene GLB: {out_glb} ({os.path.getsize(out_glb)//1024} KB)')
+     except Exception as e:
+         print(f'[replace_head] Scene export failed ({e}), trying single mesh...')
+         combined.export(out_glb)
+         print(f'[replace_head] Saved GLB: {out_glb} ({os.path.getsize(out_glb)//1024} KB)')
+
+     return out_glb
+
+
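The orientation check used before bridging — the sign of the summed y-component of consecutive edge cross products — can be verified on a toy ring. A self-contained sketch (a square in the XZ plane, traversed both ways):

```python
import numpy as np

def loop_orientation_y(loop):
    """Signed y-component sum of consecutive edge cross products: the
    CW/CCW test used to make the two neck rings wind the same way."""
    t = loop - loop.mean(axis=0)
    return float(np.sum(np.cross(t[:-1], t[1:])[:, 1]))

# A square in the XZ plane: reversing the traversal flips the sign.
sq = np.array([[1, 0, 1], [-1, 0, 1], [-1, 0, -1], [1, 0, -1]], float)
o_fwd = loop_orientation_y(sq)
o_rev = loop_orientation_y(sq[::-1])
```

When the two signs differ, one loop is simply reversed — exactly the `flame_loop_r[::-1]` step above.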
+ # ──────────────────────────────────────────────────────────────────
+ # CLI
+ # ──────────────────────────────────────────────────────────────────
+ if __name__ == '__main__':
+     ap = argparse.ArgumentParser()
+     ap.add_argument('--body', required=True, help='TripoSG body GLB path')
+     ap.add_argument('--face', required=True, help='Reference face image path')
+     ap.add_argument('--out', required=True, help='Output GLB path')
+     ap.add_argument('--bridge', type=int, default=64, help='Bridge ring vertex count')
+     ap.add_argument('--cpu', action='store_true', help='Use CPU instead of CUDA')
+     args = ap.parse_args()
+
+     import torch
+     device = 'cpu' if args.cpu else ('cuda' if torch.cuda.is_available() else 'cpu')
+     replace_head(args.body, args.face, args.out, device=device, bridge_n=args.bridge)
pipeline/pshuman_client.py ADDED
@@ -0,0 +1,283 @@
+ """
+ pshuman_client.py
+ =================
+ Call PSHuman to generate a high-detail 3D face mesh from a portrait image.
+
+ Two modes:
+ - Direct (default when service_url is localhost): runs PSHuman inference.py
+   as a subprocess without going through Gradio HTTP. Avoids the gradio_client
+   API-info bug that affects the pshuman Gradio env.
+ - Remote: uses gradio_client to call a running pshuman_app.py service.
+
+ Usage (standalone)
+ ------------------
+ python -m pipeline.pshuman_client \\
+     --image /path/to/portrait.png \\
+     --output /tmp/pshuman_face.obj \\
+     [--url http://remote-host:7862]   # omit for direct/local mode
+
+ Requires: gradio-client (remote mode only)
+ """
+ from __future__ import annotations
+
+ import argparse
+ import glob
+ import os
+ import shutil
+ import subprocess
+ import time
+ from pathlib import Path
+
+ # Default: assume running on the same instance (local)
+ _DEFAULT_URL = os.environ.get("PSHUMAN_URL", "http://localhost:7862")
+
+ # ── Paths (on the Vast instance) ──────────────────────────────────────────────
+ PSHUMAN_DIR = "/root/PSHuman"
+ CONDA_PYTHON = "/root/miniconda/envs/pshuman/bin/python"
+ CONFIG = f"{PSHUMAN_DIR}/configs/inference-768-6view.yaml"
+ HF_MODEL_DIR = f"{PSHUMAN_DIR}/checkpoints/PSHuman_Unclip_768_6views"
+ HF_MODEL_HUB = "pengHTYX/PSHuman_Unclip_768_6views"
+
+
+ def _run_pshuman_direct(image_path: str, work_dir: str) -> str:
+     """
+     Run PSHuman inference.py directly as a subprocess.
+     Returns the path to the colored OBJ mesh.
+     """
+     img_dir = os.path.join(work_dir, "input")
+     out_dir = os.path.join(work_dir, "out")
+     os.makedirs(img_dir, exist_ok=True)
+     os.makedirs(out_dir, exist_ok=True)
+
+     scene = "face"
+     dst = os.path.join(img_dir, f"{scene}.png")
+     shutil.copy(image_path, dst)
+
+     hf_model = HF_MODEL_DIR if Path(HF_MODEL_DIR).exists() else HF_MODEL_HUB
+
+     cmd = [
+         CONDA_PYTHON, f"{PSHUMAN_DIR}/inference.py",
+         "--config", CONFIG,
+         f"pretrained_model_name_or_path={hf_model}",
+         f"validation_dataset.root_dir={img_dir}",
+         f"save_dir={out_dir}",
+         "validation_dataset.crop_size=740",
+         "with_smpl=false",
+         "num_views=7",
+         "save_mode=rgb",
+         "seed=42",
+     ]
+
+     print(f"[pshuman] Running direct inference: {' '.join(cmd[:4])} ...")
+     t0 = time.time()
+
+     # Set CUDA_HOME + extra include dirs so the nvdiffrast/torch JIT can compile.
+     # On Vast.ai, the triposg conda env ships nvcc at bin/nvcc and CUDA headers
+     # scattered across site-packages/nvidia/{pkg}/include/ directories.
+     env = os.environ.copy()
+     if "CUDA_HOME" not in env:
+         _triposg = "/root/miniconda/envs/triposg"
+         _targets = os.path.join(_triposg, "targets", "x86_64-linux")
+         _nvcc_bin = os.path.join(_triposg, "bin")
+         _cuda_home = _targets  # has include/cuda_runtime_api.h
+
+         _nvvm_bin = os.path.join(_triposg, "nvvm", "bin")  # contains cicc
+         _nvcc_real = os.path.join(_targets, "bin")         # contains the real nvcc
+
+         if (os.path.exists(os.path.join(_cuda_home, "include", "cuda_runtime_api.h"))
+                 and (os.path.exists(os.path.join(_nvcc_bin, "nvcc"))
+                      or os.path.exists(os.path.join(_nvcc_real, "nvcc")))):
+             env["CUDA_HOME"] = _cuda_home
+             # Build PATH: nvvm/bin (cicc) + targets/.../bin (real nvcc) + conda bin (nvcc wrapper)
+             path_parts = []
+             if os.path.isdir(_nvvm_bin):
+                 path_parts.append(_nvvm_bin)
+             if os.path.isdir(_nvcc_real):
+                 path_parts.append(_nvcc_real)
+             path_parts.append(_nvcc_bin)
+             env["PATH"] = ":".join(path_parts) + ":" + env.get("PATH", "")
+
+             # Collect all nvidia sub-package include dirs (cusparse, cublas, etc.)
+             _nvidia_site = os.path.join(_triposg, "lib", "python3.10",
+                                         "site-packages", "nvidia")
+             _extra_incs = []
+             if os.path.isdir(_nvidia_site):
+                 for _inc in glob.glob(os.path.join(_nvidia_site, "*/include")):
+                     if os.path.isdir(_inc):
+                         _extra_incs.append(_inc)
+             if _extra_incs:
+                 _sep = ":"
+                 _existing = env.get("CPATH", "")
+                 env["CPATH"] = _sep.join(_extra_incs) + (_sep + _existing if _existing else "")
+             print(f"[pshuman] CUDA_HOME={_cuda_home}, {len(_extra_incs)} nvidia include dirs added")
+
+     proc = subprocess.run(
+         cmd, cwd=PSHUMAN_DIR,
+         capture_output=False,
+         text=True,
+         timeout=600,
+         env=env,
+     )
+     elapsed = time.time() - t0
+     print(f"[pshuman] Inference done in {elapsed:.1f}s (exit={proc.returncode})")
+
+     if proc.returncode != 0:
+         raise RuntimeError(f"PSHuman inference failed (exit {proc.returncode})")
+
+     # Locate the output OBJ — PSHuman may save relative to its CWD (/root/PSHuman/out/)
+     # rather than to the specified save_dir, so check both locations.
+     cwd_out_dir = os.path.join(PSHUMAN_DIR, "out", scene)
+     patterns = [
+         f"{out_dir}/{scene}/result_clr_scale4_{scene}.obj",
+         f"{out_dir}/{scene}/result_clr_scale*_{scene}.obj",
+         f"{out_dir}/**/*.obj",
+         f"{cwd_out_dir}/result_clr_scale*_{scene}.obj",
+         f"{cwd_out_dir}/*.obj",
+         f"{PSHUMAN_DIR}/out/**/*.obj",
+     ]
+     obj_path = None
+     for pat in patterns:
+         hits = sorted(glob.glob(pat, recursive=True))
+         if hits:
+             colored = [h for h in hits if "clr" in h]
+             obj_path = (colored or hits)[-1]
+             break
+
+     if not obj_path:
+         all_files = list(Path(out_dir).rglob("*"))
+         objs = [str(f) for f in all_files if f.suffix in (".obj", ".ply", ".glb")]
+         if objs:
+             obj_path = objs[-1]
+         if not obj_path and Path(cwd_out_dir).exists():
+             for f in Path(cwd_out_dir).rglob("*.obj"):
+                 obj_path = str(f)
+                 break
+         if not obj_path:
+             raise FileNotFoundError(
+                 f"No mesh output found in {out_dir}. "
+                 f"Files: {[str(f) for f in all_files[:20]]}"
+             )
+
+     print(f"[pshuman] Output mesh: {obj_path}")
+     return obj_path
+
+
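The output-discovery step above tries glob patterns in priority order and prefers colored (`clr`) meshes over raw ones. A minimal reproduction with stand-in files in a temp directory (filenames here mimic PSHuman's, for illustration only):

```python
import glob
import os
import tempfile

# Create two candidate outputs; the colored one should win.
work = tempfile.mkdtemp()
open(os.path.join(work, "result_raw_face.obj"), "w").close()
open(os.path.join(work, "result_clr_scale4_face.obj"), "w").close()

hits = sorted(glob.glob(os.path.join(work, "*.obj")))
colored = [h for h in hits if "clr" in h]
obj_path = (colored or hits)[-1]
```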
+ def generate_pshuman_mesh(
+     image_path: str,
+     output_path: str,
+     service_url: str = _DEFAULT_URL,
+     timeout: float = 600.0,
+ ) -> str:
+     """
+     Generate a PSHuman face mesh and save it to *output_path*.
+
+     When service_url points to localhost, PSHuman inference.py is run directly
+     (no Gradio HTTP, avoids the gradio_client API-info bug).
+     For remote URLs, gradio_client is used.
+
+     Parameters
+     ----------
+     image_path  : local PNG/JPG path of the portrait
+     output_path : where to save the downloaded OBJ
+     service_url : base URL of pshuman_app.py, or "direct" to skip HTTP
+     timeout     : seconds to wait for inference (used in remote mode)
+
+     Returns
+     -------
+     output_path (convenience)
+     """
+     import tempfile
+
+     output_path = str(output_path)
+     os.makedirs(Path(output_path).parent, exist_ok=True)
+
+     is_local = (
+         "localhost" in service_url
+         or "127.0.0.1" in service_url
+         or service_url.strip().lower() == "direct"
+         or not service_url.strip()
+     )
+
+     if is_local:
+         # ── Direct mode: run subprocess ───────────────────────────────────────
+         print(f"[pshuman] Direct mode (no HTTP) — running inference on {image_path}")
+         work_dir = tempfile.mkdtemp(prefix="pshuman_direct_")
+         obj_tmp = _run_pshuman_direct(image_path, work_dir)
+     else:
+         # ── Remote mode: call Gradio service ──────────────────────────────────
+         try:
+             from gradio_client import Client
+         except ImportError:
+             raise ImportError("pip install gradio-client")
+
+         print(f"[pshuman] Connecting to {service_url}")
+         client = Client(service_url)
+
+         print(f"[pshuman] Submitting: {image_path}")
+         result = client.predict(
+             image=image_path,
+             api_name="/gradio_generate_face",
+         )
+
+         if isinstance(result, (list, tuple)):
+             obj_tmp = result[0]
+             status = result[1] if len(result) > 1 else "ok"
+         elif isinstance(result, dict):
+             obj_tmp = result.get("obj_path") or result.get("value")
+             status = result.get("status", "ok")
+         else:
+             obj_tmp = result
+             status = "ok"
+
+         if not obj_tmp or "Error" in str(status):
+             raise RuntimeError(f"PSHuman service error: {status}")
+
+         if isinstance(obj_tmp, dict):
+             obj_tmp = obj_tmp.get("path") or obj_tmp.get("name") or str(obj_tmp)
+
+         work_dir = str(Path(str(obj_tmp)).parent)
+
+     # ── Copy OBJ + companions to output location ───────────────────────────
+     shutil.copy(str(obj_tmp), output_path)
+     print(f"[pshuman] Saved OBJ -> {output_path}")
+
+     src_dir = Path(str(obj_tmp)).parent
+     out_dir = Path(output_path).parent
+     for ext in ("*.mtl", "*.png", "*.jpg"):
248
+ for f in src_dir.glob(ext):
249
+ dest = out_dir / f.name
250
+ if not dest.exists():
251
+ shutil.copy(str(f), str(dest))
252
+
253
+ return output_path
254
+
255
+
256
+ # ---------------------------------------------------------------------------
257
+ # CLI
258
+ # ---------------------------------------------------------------------------
259
+
260
+ def main():
261
+ parser = argparse.ArgumentParser(
262
+ description="Generate PSHuman face mesh from portrait image"
263
+ )
264
+ parser.add_argument("--image", required=True, help="Portrait image path")
265
+ parser.add_argument("--output", required=True, help="Output OBJ path")
266
+ parser.add_argument(
267
+ "--url", default=_DEFAULT_URL,
268
+ help="PSHuman service URL, or 'direct' to run inference locally "
269
+ "(default: http://localhost:7862 → auto-selects direct mode)",
270
+ )
271
+ parser.add_argument("--timeout", type=float, default=600.0)
272
+ args = parser.parse_args()
273
+
274
+ generate_pshuman_mesh(
275
+         image_path=args.image,
276
+         output_path=args.output,
277
+         service_url=args.url,
278
+         timeout=args.timeout,
279
+ )
280
+
281
+
282
+ if __name__ == "__main__":
283
+ main()
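The fallback search in `_find_output_mesh` above boils down to: try glob patterns from most to least specific, and within a hit list prefer colored (`clr`) meshes. A standalone sketch of that selection logic (`find_mesh` is a hypothetical helper name, for illustration only, exercised against a throwaway directory):

```python
import glob
import os
import tempfile

def find_mesh(patterns):
    """First pattern with hits wins; within the hits, prefer files whose
    name contains 'clr' (PSHuman's colored meshes), taking the last sorted."""
    for pat in patterns:
        hits = sorted(glob.glob(pat, recursive=True))
        if hits:
            colored = [h for h in hits if "clr" in os.path.basename(h)]
            return (colored or hits)[-1]
    return None

# Demo against a throwaway directory
root = tempfile.mkdtemp(prefix="mesh_demo_")
for name in ("result_raw_scene.obj", "result_clr_scale4_scene.obj"):
    open(os.path.join(root, name), "w").close()

best = find_mesh([f"{root}/missing_*.obj", f"{root}/*.obj"])
print(os.path.basename(best))  # → result_clr_scale4_scene.obj
```

Returning the *last* sorted colored hit matches the module's behavior of picking the highest `scale` suffix when several resolutions are present.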
pipeline/render_glb.py ADDED
@@ -0,0 +1,25 @@
1
+ """render_glb.py: render a GLB to a 4-view grid (front/right/back/left) using MV-Adapter's nvdiffrast utilities."""
+ import sys, cv2
2
+ sys.path.insert(0, '/root/MV-Adapter')
3
+ import numpy as np, torch
4
+ from mvadapter.utils.mesh_utils import NVDiffRastContextWrapper, load_mesh, get_orthogonal_camera, render
5
+
6
+ glb = sys.argv[1]
7
+ out = sys.argv[2]
8
+ device = 'cuda'
9
+ ctx = NVDiffRastContextWrapper(device=device, context_type='cuda')
10
+ mesh = load_mesh(glb, rescale=True, device=device)
11
+
12
+ views = [('front', -90), ('right', -180), ('back', -270), ('left', 0)]  # MV-Adapter azimuth convention: front = -90
13
+ imgs = []
14
+ for name, az in views:
15
+ cam = get_orthogonal_camera(elevation_deg=[0], distance=[1.8],
16
+ left=-0.55, right=0.55, bottom=-0.55, top=0.55,
17
+ azimuth_deg=[az], device=device)
18
+ r = render(ctx, mesh, cam, height=512, width=384,
19
+ render_attr=True, render_depth=False, render_normal=False, attr_background=0.15)
20
+ img = (r.attr[0].cpu().numpy()*255).clip(0,255).astype('uint8')
21
+ imgs.append(cv2.cvtColor(img, cv2.COLOR_RGB2BGR))
22
+
23
+ grid = np.concatenate(imgs, axis=1)
24
+ cv2.imwrite(out, grid)
25
+ print(f'Saved {grid.shape[1]}x{grid.shape[0]} grid to {out}')
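The grid assembly at the end of the script is plain horizontal concatenation of same-height renders. A GPU-free sketch with dummy images standing in for the nvdiffrast output:

```python
import numpy as np

H, W = 512, 384
views = ["front", "right", "back", "left"]

# Dummy renders: one flat gray level per view in place of GPU output
imgs = [np.full((H, W, 3), 40 * i, dtype=np.uint8) for i in range(len(views))]

grid = np.concatenate(imgs, axis=1)  # side by side; heights must match
print(grid.shape)  # → (512, 1536, 3)
```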
pipeline/tpose.py ADDED
@@ -0,0 +1,332 @@
1
+ """
2
+ tpose.py — T-pose a humanoid GLB using YOLO pose estimation.
3
+
4
+ Pipeline:
5
+ 1. Render the mesh from front view (azimuth=-90)
6
+ 2. Run YOLOv8-pose to get 17 COCO keypoints in render-space
7
+ 3. Unproject keypoints through the orthographic camera to 3D
8
+ 4. Build Blender armature with bones at detected 3D joint positions (current pose)
9
+ 5. Auto-weight skin the mesh to this armature
10
+ 6. Rotate arm/leg bones to T-pose, apply deformation, export
11
+
12
+ Usage:
13
+ blender --background --python tpose.py -- <input.glb> <output.glb>
14
+ """
15
+
16
+ import bpy, sys, math, mathutils, os, tempfile
17
+ import numpy as np
18
+
19
+ # ── Args ─────────────────────────────────────────────────────────────────────
20
+ argv = sys.argv
21
+ argv = argv[argv.index("--") + 1:] if "--" in argv else []
22
+ if len(argv) < 2:
23
+ print("Usage: blender --background --python tpose.py -- input.glb output.glb")
24
+ sys.exit(1)
25
+ input_glb = argv[0]
26
+ output_glb = argv[1]
27
+
28
+ # ── Step 1: Render front view using nvdiffrast (outside Blender) ───────────────
29
+ # We do this via a subprocess call before Blender scene setup,
30
+ # using the triposg Python env which has MV-Adapter + nvdiffrast.
31
+ import subprocess, json
32
+
33
+ TRIPOSG_PYTHON = '/root/miniconda/envs/triposg/bin/python'
34
+ RENDER_SCRIPT = '/tmp/_tpose_render.py'
35
+ RENDER_OUT = '/tmp/_tpose_front.png'
36
+ KP_OUT = '/tmp/_tpose_kp.json'
37
+
38
+ render_code = r"""
39
+ import sys, json
40
+ sys.path.insert(0, '/root/MV-Adapter')
41
+ import numpy as np, cv2, torch
42
+ from mvadapter.utils.mesh_utils import (
43
+ NVDiffRastContextWrapper, load_mesh, get_orthogonal_camera, render,
44
+ )
45
+
46
+ body_glb = sys.argv[1]
47
+ out_png = sys.argv[2]
48
+
49
+ device = 'cuda'
50
+ ctx = NVDiffRastContextWrapper(device=device, context_type='cuda')
51
+ mesh_mv = load_mesh(body_glb, rescale=True, device=device)
52
+ camera = get_orthogonal_camera(
53
+ elevation_deg=[0], distance=[1.8],
54
+ left=-0.55, right=0.55, bottom=-0.55, top=0.55,
55
+ azimuth_deg=[-90], device=device,
56
+ )
57
+ out = render(ctx, mesh_mv, camera, height=1024, width=768,
58
+ render_attr=True, render_depth=False, render_normal=False,
59
+ attr_background=0.5)
60
+ img_np = (out.attr[0].cpu().numpy() * 255).clip(0,255).astype('uint8')
61
+ cv2.imwrite(out_png, cv2.cvtColor(img_np, cv2.COLOR_RGB2BGR))
62
+ print(f"Rendered to {out_png}")
63
+ """
64
+
65
+ with open(RENDER_SCRIPT, 'w') as f:
66
+ f.write(render_code)
67
+
68
+ print("[tpose] Rendering front view ...")
69
+ r = subprocess.run([TRIPOSG_PYTHON, RENDER_SCRIPT, input_glb, RENDER_OUT],
70
+ capture_output=True, text=True)
71
+ print(r.stdout.strip()); print(r.stderr[-500:] if r.stderr else '')
72
+
73
+ # ── Step 2: YOLO pose estimation ──────────────────────────────────────────────
74
+ YOLO_SCRIPT = '/tmp/_tpose_yolo.py'
75
+ yolo_code = r"""
76
+ import sys, json
77
+ import cv2
78
+ from ultralytics import YOLO
79
+ import numpy as np
80
+
81
+ img_path = sys.argv[1]
82
+ kp_path = sys.argv[2]
83
+
84
+ model = YOLO('yolov8n-pose.pt')
85
+ img = cv2.imread(img_path)
86
+ H, W = img.shape[:2]
87
+
88
+ results = model(img, verbose=False)
89
+ if not results or results[0].keypoints is None:
90
+ print("ERROR: no person detected"); sys.exit(1)
91
+
92
+ # Pick detection with highest confidence
93
+ kps_all = results[0].keypoints.data.cpu().numpy() # (N, 17, 3)
94
+ confs = kps_all[:, :, 2].mean(axis=1)
95
+ best = kps_all[confs.argmax()] # (17, 3): x, y, conf
96
+
97
+ # COCO 17 keypoints:
98
+ # 0=nose 1=left_eye 2=right_eye 3=left_ear 4=right_ear
99
+ # 5=left_shoulder 6=right_shoulder 7=left_elbow 8=right_elbow
100
+ # 9=left_wrist 10=right_wrist 11=left_hip 12=right_hip
101
+ # 13=left_knee 14=right_knee 15=left_ankle 16=right_ankle
102
+
103
+ names = ['nose','left_eye','right_eye','left_ear','right_ear',
104
+ 'left_shoulder','right_shoulder','left_elbow','right_elbow',
105
+ 'left_wrist','right_wrist','left_hip','right_hip',
106
+ 'left_knee','right_knee','left_ankle','right_ankle']
107
+
108
+ kp_dict = {}
109
+ for i, name in enumerate(names):
110
+ x, y, c = best[i]
111
+ kp_dict[name] = {'x': float(x)/W, 'y': float(y)/H, 'conf': float(c)}
112
+ print(f" {name}: ({x:.1f},{y:.1f}) conf={c:.2f}")
113
+
114
+ kp_dict['img_hw'] = [int(H), int(W)]
115
+ with open(kp_path, 'w') as f:
116
+ json.dump(kp_dict, f)
117
+ print(f"Keypoints saved to {kp_path}")
118
+ """
119
+
120
+ with open(YOLO_SCRIPT, 'w') as f:
121
+ f.write(yolo_code)
122
+
123
+ print("[tpose] Running YOLO pose estimation ...")
124
+ r2 = subprocess.run([TRIPOSG_PYTHON, YOLO_SCRIPT, RENDER_OUT, KP_OUT],
125
+ capture_output=True, text=True)
126
+ print(r2.stdout.strip()); print(r2.stderr[-300:] if r2.stderr else '')
127
+
128
+ if not os.path.exists(KP_OUT):
129
+ print("ERROR: YOLO failed — falling back to heuristic")
130
+ kp_data = None
131
+ else:
132
+ with open(KP_OUT) as f:
133
+ kp_data = json.load(f)
134
+
135
+ # ── Step 3: Unproject render-space keypoints to 3D ────────────────────────────
136
+ # Orthographic camera: left=-0.55, right=0.55, bottom=-0.55, top=0.55
137
+ # Render: 768×1024. NDC x = 2*(px/W) - 1, NDC y = 1 - 2*(py/H)
138
+ # World X = ndc_x * 0.55, World Y (mesh up) = ndc_y * 0.55
139
+ # We need 3D positions in the ORIGINAL mesh coordinate space.
140
+ # After Blender GLB import, original mesh Y → Blender Z, original Z → Blender -Y
141
+
142
+ def kp_to_3d(name, z_default=0.0):
143
+ """Convert YOLO keypoint (image fraction) → Blender 3D coords."""
144
+ if kp_data is None or name not in kp_data:
145
+ return None
146
+ k = kp_data[name]
147
+ if k['conf'] < 0.3:
148
+ return None
149
+ # Image coords (fractions) → NDC
150
+ ndc_x = 2 * k['x'] - 1.0 # left→right = mesh X
151
+ ndc_y = -(2 * k['y'] - 1.0) # top→bottom = mesh Y (up)
152
+ # Orthographic: frustum ±0.55
153
+ mesh_x = ndc_x * 0.55
154
+ mesh_y = ndc_y * 0.55 # this is mesh-space Y (vertical)
155
+ # After GLB import: mesh Y → Blender Z, mesh Z → Blender -Y
156
+ bl_x = mesh_x
157
+ bl_z = mesh_y # height
158
+ bl_y = z_default # depth (not observable from front view)
159
+ return (bl_x, bl_y, bl_z)
160
+
161
+ # Key joint positions in Blender space
162
+ J = {}
163
+ for name in ['nose','left_shoulder','right_shoulder','left_elbow','right_elbow',
164
+ 'left_wrist','right_wrist','left_hip','right_hip',
165
+ 'left_knee','right_knee','left_ankle','right_ankle']:
166
+ p = kp_to_3d(name)
167
+ if p: J[name] = p
168
+
169
+ print(f"[tpose] Detected joints: {list(J.keys())}")
170
+
171
+ # ── Step 4: Set up Blender scene ──────────────────────────────────────────────
172
+ bpy.ops.wm.read_factory_settings(use_empty=True)
173
+ bpy.ops.import_scene.gltf(filepath=input_glb)
174
+ bpy.context.view_layer.update()
175
+
176
+ mesh_obj = next((o for o in bpy.data.objects if o.type == 'MESH'), None)
177
+ if not mesh_obj:
178
+ print("ERROR: no mesh"); sys.exit(1)
179
+
180
+ verts_w = np.array([mesh_obj.matrix_world @ v.co for v in mesh_obj.data.vertices])
181
+ z_min, z_max = verts_w[:,2].min(), verts_w[:,2].max()
182
+ x_c = (verts_w[:,0].min() + verts_w[:,0].max()) / 2
183
+ y_c = (verts_w[:,1].min() + verts_w[:,1].max()) / 2
184
+ H_mesh = z_max - z_min
185
+
186
+ def zh(frac): return z_min + frac * H_mesh
187
+ def jv(name, fallback_frac=None, fallback_x=0.0):
188
+ """Get joint position from YOLO or use fallback."""
189
+ if name in J:
190
+ x, y, z = J[name]
191
+ return (x, y_c, z) # use mesh y_c for depth
192
+ if fallback_frac is not None:
193
+ return (x_c + fallback_x, y_c, zh(fallback_frac))
194
+ return None
195
+
196
+ # ── Step 5: Build armature in CURRENT pose ────────────────────────────────────
197
+ bpy.ops.object.armature_add(location=(x_c, y_c, zh(0.5)))
198
+ arm_obj = bpy.context.object
199
+ arm_obj.name = 'PoseRig'
200
+ arm = arm_obj.data
201
+
202
+ bpy.ops.object.mode_set(mode='EDIT')
203
+ eb = arm.edit_bones
204
+
205
+ def V(xyz): return mathutils.Vector(xyz)
206
+
207
+ def add_bone(name, head, tail, parent=None, connect=False):
208
+ b = eb.new(name)
209
+ b.head = V(head)
210
+ b.tail = V(tail)
211
+ if parent and parent in eb:
212
+ b.parent = eb[parent]
213
+ b.use_connect = connect
214
+ return b
215
+
216
+ # Helper: midpoint
217
+ def mid(a, b): return tuple((a[i]+b[i])/2 for i in range(3))
218
+ def offset(p, dx=0, dy=0, dz=0): return (p[0]+dx, p[1]+dy, p[2]+dz)
219
+
220
+ # ── Spine / hips ─────────────────────────────────────────────────────────────
221
+ hip_L = jv('left_hip', 0.48, -0.07)
222
+ hip_R = jv('right_hip', 0.48, 0.07)
223
+ sh_L = jv('left_shoulder', 0.77, -0.20)
224
+ sh_R = jv('right_shoulder', 0.77, 0.20)
225
+ nose = jv('nose', 0.92)
226
+
227
+ hips_c = mid(hip_L, hip_R) if (hip_L and hip_R) else (x_c, y_c, zh(0.48))
228
+ sh_c = mid(sh_L, sh_R) if (sh_L and sh_R) else (x_c, y_c, zh(0.77))
229
+
230
+ add_bone('Hips', hips_c, offset(hips_c, dz=H_mesh*0.08))
231
+ add_bone('Spine', hips_c, offset(hips_c, dz=(sh_c[2]-hips_c[2])*0.5), 'Hips')
232
+ add_bone('Chest', offset(hips_c, dz=(sh_c[2]-hips_c[2])*0.5), sh_c, 'Spine', True)
233
+ if nose:
234
+ neck_z = sh_c[2] + (nose[2]-sh_c[2])*0.35
235
+ head_z = sh_c[2] + (nose[2]-sh_c[2])*0.65
236
+ add_bone('Neck', (x_c, y_c, neck_z), (x_c, y_c, head_z), 'Chest')
237
+ add_bone('Head', (x_c, y_c, head_z), (x_c, y_c, nose[2]+H_mesh*0.05), 'Neck', True)
238
+ else:
239
+ add_bone('Neck', sh_c, offset(sh_c, dz=H_mesh*0.06), 'Chest')
240
+ add_bone('Head', offset(sh_c, dz=H_mesh*0.06), offset(sh_c, dz=H_mesh*0.14), 'Neck', True)
241
+
242
+ # ── Arms (placed at DETECTED current pose positions) ─────────────────────────
243
+ el_L = jv('left_elbow', 0.60, -0.30)
244
+ el_R = jv('right_elbow', 0.60, 0.30)
245
+ wr_L = jv('left_wrist', 0.45, -0.25)
246
+ wr_R = jv('right_wrist', 0.45, 0.25)
247
+
248
+ for side, sh, el, wr in (('L', sh_L, el_L, wr_L), ('R', sh_R, el_R, wr_R)):
249
+ if not sh: continue
250
+ el_pos = el if el else offset(sh, dz=-H_mesh*0.15)
251
+ wr_pos = wr if wr else offset(el_pos, dz=-H_mesh*0.15)
252
+ hand = offset(wr_pos, dz=(wr_pos[2]-el_pos[2])*0.4)
253
+ add_bone(f'UpperArm.{side}', sh, el_pos, 'Chest')
254
+ add_bone(f'ForeArm.{side}', el_pos, wr_pos, f'UpperArm.{side}', True)
255
+ add_bone(f'Hand.{side}', wr_pos, hand, f'ForeArm.{side}', True)
256
+
257
+ # ── Legs ─────────────────────────────────────────────────────────────────────
258
+ kn_L = jv('left_knee', 0.25, -0.07)
259
+ kn_R = jv('right_knee', 0.25, 0.07)
260
+ an_L = jv('left_ankle', 0.04, -0.06)
261
+ an_R = jv('right_ankle', 0.04, 0.06)
262
+
263
+ for side, hp, kn, an in (('L', hip_L, kn_L, an_L), ('R', hip_R, kn_R, an_R)):
264
+ if not hp: continue
265
+ kn_pos = kn if kn else offset(hp, dz=-H_mesh*0.23)
266
+ an_pos = an if an else offset(kn_pos, dz=-H_mesh*0.22)
267
+ toe = offset(an_pos, dy=-H_mesh*0.06, dz=-H_mesh*0.02)
268
+ add_bone(f'UpperLeg.{side}', hp, kn_pos, 'Hips')
269
+ add_bone(f'LowerLeg.{side}', kn_pos, an_pos, f'UpperLeg.{side}', True)
270
+ add_bone(f'Foot.{side}', an_pos, toe, f'LowerLeg.{side}', True)
271
+
272
+ bpy.ops.object.mode_set(mode='OBJECT')
273
+
274
+ # ── Step 6: Skin mesh to armature ────────────────────────────────────────────
275
+ bpy.context.view_layer.objects.active = arm_obj
276
+ mesh_obj.select_set(True)
277
+ arm_obj.select_set(True)
278
+ bpy.ops.object.parent_set(type='ARMATURE_AUTO')
279
+ print("[tpose] Auto-weights applied")
280
+
281
+ # ── Step 7: Pose arms to T-pose ───────────────────────────────────────────────
282
+ # Compute per-arm rotation: from (current elbow - shoulder) direction → horizontal ±X
283
+ bpy.context.view_layer.objects.active = arm_obj
284
+ bpy.ops.object.mode_set(mode='POSE')
285
+
286
+ pb = arm_obj.pose.bones
287
+
288
+ def set_tpose_arm(side, sh_pos, el_pos):
289
+ if not sh_pos or not el_pos:
290
+ return
291
+ if f'UpperArm.{side}' not in pb:
292
+ return
293
+ # Current upper-arm direction in armature local space
294
+ sx = -1 if side == 'L' else 1
295
+ # T-pose direction: ±X horizontal
296
+ tpose_dir = mathutils.Vector((sx, 0, 0))
297
+ # Current bone direction (head→tail) in world space
298
+ bone = arm_obj.data.bones[f'UpperArm.{side}']
299
+ cur_dir = (bone.tail_local - bone.head_local).normalized()
300
+     # Rotation to the T-pose direction (computed in world space and applied as the local pose quaternion; approximate for rolled rest bones)
301
+ rot_quat = cur_dir.rotation_difference(tpose_dir)
302
+ pb[f'UpperArm.{side}'].rotation_mode = 'QUATERNION'
303
+ pb[f'UpperArm.{side}'].rotation_quaternion = rot_quat
304
+
305
+ # Straighten forearm along the same axis
306
+ if f'ForeArm.{side}' in pb:
307
+ pb[f'ForeArm.{side}'].rotation_mode = 'QUATERNION'
308
+ pb[f'ForeArm.{side}'].rotation_quaternion = mathutils.Quaternion((1,0,0,0))
309
+
310
+ set_tpose_arm('L', sh_L, el_L)
311
+ set_tpose_arm('R', sh_R, el_R)
312
+
313
+ bpy.context.view_layer.update()
314
+ bpy.ops.object.mode_set(mode='OBJECT')
315
+
316
+ # ── Step 8: Apply armature modifier ──────────────────────────────────────────
317
+ bpy.context.view_layer.objects.active = mesh_obj
318
+ mesh_obj.select_set(True)
319
+ for mod in mesh_obj.modifiers:
320
+ if mod.type == 'ARMATURE':
321
+ bpy.ops.object.modifier_apply(modifier=mod.name)
322
+ print(f"[tpose] Applied modifier: {mod.name}")
323
+ break
324
+
325
+ bpy.data.objects.remove(arm_obj, do_unlink=True)
326
+
327
+ # ── Step 9: Export ────────────────────────────────────────────────────────────
328
+ bpy.ops.export_scene.gltf(
329
+ filepath=output_glb, export_format='GLB',
330
+ export_texcoords=True, export_normals=True,
331
+ export_materials='EXPORT', use_selection=False)
332
+ print(f"[tpose] Done → {output_glb}")
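The unprojection performed by `kp_to_3d` can be exercised on its own: image-fraction keypoints map through the ±0.55 orthographic frustum to mesh coordinates, and mesh Y (up) becomes Blender Z after GLB import. A minimal standalone version (`keypoint_to_blender` is an illustrative name, not part of the script):

```python
# Mirrors the kp_to_3d logic above: YOLO keypoints arrive as image fractions
# (x, y in [0, 1]); the front render used an orthographic frustum of +/-0.55.
ORTHO_HALF = 0.55  # matches left/right/bottom/top = +/-0.55 in the render

def keypoint_to_blender(xf, yf, depth=0.0):
    ndc_x = 2.0 * xf - 1.0       # image left -> right is mesh +X
    ndc_y = 1.0 - 2.0 * yf       # image top is mesh up
    return (ndc_x * ORTHO_HALF,  # Blender X
            depth,               # Blender Y: depth, unobservable from front
            ndc_y * ORTHO_HALF)  # Blender Z: height

print(keypoint_to_blender(0.5, 0.5))  # image center → (0.0, 0.0, 0.0)
```

A keypoint at the top-center of the frame lands at Blender Z = 0.55, the top of the frustum, which is why the script later clamps joints against the mesh's actual z range.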
requirements.txt CHANGED
@@ -1,4 +1,4 @@
1
- # HuggingFace ZeroGPU Space — Gradio SDK [cache-bust: 2]
2
  spaces
3
  numpy>=2
4
 
@@ -79,3 +79,15 @@ typeguard
79
  sentencepiece
80
  spandrel
81
  imageio
1
+ # HuggingFace ZeroGPU Space — Gradio SDK [cache-bust: 3]
2
  spaces
3
  numpy>=2
4
 
 
79
  sentencepiece
80
  spandrel
81
  imageio
82
+ gradio_client
83
+
84
+ # FireRed / GGUF quantization
85
+ bitsandbytes
86
+
87
+ # Motion search + retargeting
88
+ filterpy
89
+ pytorch-lightning
90
+ lightning-utilities
91
+ webdataset
92
+ hydra-core
93
+ matplotlib
utils/pytorch3d_minimal.py ADDED
@@ -0,0 +1,242 @@
1
+ """
2
+ pytorch3d_minimal.py
3
+ ====================
4
+ Drop-in replacement for the pytorch3d subset used by PSHuman's project_mesh.py
5
+ and mesh_utils.py. Uses nvdiffrast for GPU rasterization.
6
+
7
+ Implements:
8
+ - Meshes / TexturesVertex
9
+ - look_at_view_transform
10
+ - FoVOrthographicCameras / OrthographicCameras (orthographic projection only)
11
+ - RasterizationSettings / MeshRasterizer (via nvdiffrast)
12
+ - render_pix2faces_py3d (compatibility shim)
13
+ """
14
+ from __future__ import annotations
15
+ import math
16
+ import torch
17
+ import torch.nn.functional as F
18
+ import numpy as np
19
+
20
+
21
+ # ---------------------------------------------------------------------------
22
+ # Texture / Mesh containers
23
+ # ---------------------------------------------------------------------------
24
+
25
+ class TexturesVertex:
26
+ def __init__(self, verts_features):
27
+ # verts_features: list of [N, C] tensors (one per mesh in batch)
28
+ self._feats = verts_features
29
+
30
+ def verts_features_packed(self):
31
+ return self._feats[0]
32
+
33
+ def clone(self):
34
+ return TexturesVertex([f.clone() for f in self._feats])
35
+
36
+ def detach(self):
37
+ return TexturesVertex([f.detach() for f in self._feats])
38
+
39
+ def to(self, device):
40
+ self._feats = [f.to(device) for f in self._feats]
41
+ return self
42
+
43
+
44
+ class Meshes:
45
+ def __init__(self, verts, faces, textures=None):
46
+ self._verts = verts # list of [N,3] float tensors
47
+ self._faces = faces # list of [F,3] long tensors
48
+ self.textures = textures
49
+
50
+ # ---- accessors --------------------------------------------------------
51
+ def verts_padded(self): return torch.stack(self._verts)
52
+ def faces_padded(self): return torch.stack(self._faces)
53
+ def verts_packed(self): return self._verts[0]
54
+ def faces_packed(self): return self._faces[0]
55
+ def verts_list(self): return self._verts
56
+ def faces_list(self): return self._faces
57
+
58
+ def verts_normals_packed(self):
59
+ v, f = self._verts[0], self._faces[0]
60
+ v0, v1, v2 = v[f[:, 0]], v[f[:, 1]], v[f[:, 2]]
61
+ fn = torch.cross(v1 - v0, v2 - v0, dim=1)
62
+ fn = F.normalize(fn, dim=1)
63
+ vn = torch.zeros_like(v)
64
+ for k in range(3):
65
+ vn.scatter_add_(0, f[:, k:k+1].expand(-1, 3), fn)
66
+ return F.normalize(vn, dim=1)
67
+
68
+ # ---- device / copy ----------------------------------------------------
69
+ def to(self, device):
70
+ self._verts = [v.to(device) for v in self._verts]
71
+ self._faces = [f.to(device) for f in self._faces]
72
+ if self.textures is not None:
73
+ self.textures.to(device)
74
+ return self
75
+
76
+ def clone(self):
77
+ m = Meshes([v.clone() for v in self._verts],
78
+ [f.clone() for f in self._faces])
79
+ if self.textures is not None:
80
+ m.textures = self.textures.clone()
81
+ return m
82
+
83
+ def detach(self):
84
+ m = Meshes([v.detach() for v in self._verts],
85
+ [f.detach() for f in self._faces])
86
+ if self.textures is not None:
87
+ m.textures = self.textures.detach()
88
+ return m
89
+
90
+
91
+ # ---------------------------------------------------------------------------
92
+ # Camera math (mirrors pytorch3d look_at_view_transform + Orthographic)
93
+ # ---------------------------------------------------------------------------
94
+
95
+ def _look_at_rotation(camera_pos: torch.Tensor,
96
+ at: torch.Tensor,
97
+ up: torch.Tensor) -> torch.Tensor:
98
+ """Return (3,3) rotation matrix: world → camera."""
99
+ z = F.normalize(camera_pos - at, dim=-1) # cam looks along -Z
100
+ x = F.normalize(torch.cross(up, z, dim=-1), dim=-1)
101
+ y = torch.cross(z, x, dim=-1)
102
+ R = torch.stack([x, y, z], dim=-1) # columns = cam axes
103
+ return R # shape (3,3)
104
+
105
+
106
+ def look_at_view_transform(dist=1.0, elev=0.0, azim=0.0,
107
+ degrees=True, device="cpu"):
108
+ """Matches pytorch3d convention exactly."""
109
+ if degrees:
110
+ elev = math.radians(float(elev))
111
+ azim = math.radians(float(azim))
112
+
113
+ # camera position in world
114
+ cx = dist * math.cos(elev) * math.sin(azim)
115
+ cy = dist * math.sin(elev)
116
+ cz = dist * math.cos(elev) * math.cos(azim)
117
+ eye = torch.tensor([[cx, cy, cz]], dtype=torch.float32, device=device)
118
+ at = torch.zeros(1, 3, device=device)
119
+ up = torch.tensor([[0, 1, 0]], dtype=torch.float32, device=device)
120
+
121
+ # pytorch3d stores R transposed (row = cam axis in world space)
122
+ R = _look_at_rotation(eye[0], at[0], up[0]).T.unsqueeze(0) # (1,3,3)
123
+
124
+ # T = camera position expressed in camera space
125
+ T = torch.bmm(-R, eye.unsqueeze(-1)).squeeze(-1) # (1,3)
126
+ return R, T
127
+
128
+
129
+ class _OrthoCamera:
130
+ """Minimal orthographic camera, matches FoVOrthographicCameras API."""
131
+ def __init__(self, R, T, focal_length=1.0, device="cpu"):
132
+ self.R = R.to(device) # (B,3,3)
133
+ self.T = T.to(device) # (B,3)
134
+ self.focal = float(focal_length)
135
+ self.device = device
136
+
137
+ def to(self, device):
138
+ self.R = self.R.to(device)
139
+ self.T = self.T.to(device)
140
+ self.device = device
141
+ return self
142
+
143
+ def get_znear(self):
144
+ return torch.tensor(0.01, device=self.device)
145
+
146
+ def is_perspective(self):
147
+ return False
148
+
149
+ def transform_points_ndc(self, points):
150
+ """
151
+ points: (B, N, 3) world coords
152
+ returns: (B, N, 3) NDC coords (X,Y in [-1,1], Z = depth)
153
+ """
154
+ # world → camera
155
+ pts_cam = torch.bmm(points, self.R) + self.T.unsqueeze(1) # (B,N,3)
156
+ # orthographic NDC: scale by focal, flip Y to match image convention
157
+ ndc_x = pts_cam[..., 0] * self.focal
158
+ ndc_y = -pts_cam[..., 1] * self.focal # pytorch3d flips Y
159
+ ndc_z = pts_cam[..., 2]
160
+ return torch.stack([ndc_x, ndc_y, ndc_z], dim=-1)
161
+
162
+ def _world_to_clip(self, verts: torch.Tensor) -> torch.Tensor:
163
+ """verts: (N,3) → clip (N,4) for nvdiffrast."""
164
+ pts_cam = (verts @ self.R[0].T) + self.T[0] # (N,3)
165
+ cx = pts_cam[:, 0] * self.focal
166
+ cy = -pts_cam[:, 1] * self.focal # flip Y
167
+ cz = pts_cam[:, 2]
168
+ w = torch.ones_like(cz)
169
+ return torch.stack([cx, cy, cz, w], dim=1) # (N,4)
170
+
171
+
172
+ # Aliases used in project_mesh.py
173
+ def FoVOrthographicCameras(device="cpu", R=None, T=None,
174
+ min_x=-1, max_x=1, min_y=-1, max_y=1,
175
+ focal_length=None, **kwargs):
176
+ fl = focal_length if focal_length is not None else 1.0 / (max_x + 1e-9)
177
+ return _OrthoCamera(R, T, focal_length=fl, device=device)
178
+
179
+
180
+ def FoVPerspectiveCameras(device="cpu", R=None, T=None, fov=60, degrees=True, **kwargs):
181
+ # Fallback: treat as orthographic at fov-derived scale (good enough for PSHuman)
182
+ fl = 1.0 / math.tan(math.radians(fov / 2)) if degrees else 1.0 / math.tan(fov / 2)
183
+ return _OrthoCamera(R, T, focal_length=fl, device=device)
184
+
185
+
186
+ OrthographicCameras = FoVOrthographicCameras
187
+
188
+
189
+ # ---------------------------------------------------------------------------
190
+ # Rasterizer (nvdiffrast-based)
191
+ # ---------------------------------------------------------------------------
192
+
193
+ class RasterizationSettings:
194
+ def __init__(self, image_size=512, blur_radius=0.0, faces_per_pixel=1):
195
+ if isinstance(image_size, (list, tuple)):
196
+ self.H, self.W = image_size[0], image_size[1]
197
+ else:
198
+ self.H = self.W = int(image_size)
199
+
200
+
201
+ class _Fragments:
202
+ def __init__(self, pix_to_face):
203
+ self.pix_to_face = pix_to_face.unsqueeze(-1) # (1,H,W,1)
204
+
205
+
206
+ class MeshRasterizer:
207
+ def __init__(self, cameras=None, raster_settings=None):
208
+ self.cameras = cameras
209
+ self.settings = raster_settings
210
+ self._glctx = None
211
+
212
+ def _get_ctx(self, device):
213
+ if self._glctx is None:
214
+ import nvdiffrast.torch as dr
215
+ self._glctx = dr.RasterizeCudaContext(device=device)
216
+ return self._glctx
217
+
218
+ def __call__(self, meshes: Meshes, cameras=None):
219
+ cam = cameras or self.cameras
220
+ H, W = self.settings.H, self.settings.W
221
+ device = meshes.verts_packed().device
222
+ import nvdiffrast.torch as dr
223
+ glctx = self._get_ctx(str(device))
224
+
225
+ verts = meshes.verts_packed().to(device)
226
+ faces = meshes.faces_packed().to(torch.int32).to(device)
227
+ clip = cam._world_to_clip(verts).unsqueeze(0) # (1,N,4)
228
+ rast, _ = dr.rasterize(glctx, clip, faces, resolution=(H, W))
229
+ pix_to_face = rast[0, :, :, -1].to(torch.int32) - 1 # -1 = background
230
+ return _Fragments(pix_to_face.unsqueeze(0))
231
+
232
+
233
+ # ---------------------------------------------------------------------------
234
+ # render_pix2faces_py3d shim (used in get_visible_faces)
235
+ # ---------------------------------------------------------------------------
236
+
237
+ def render_pix2faces_py3d(meshes, cameras, H=512, W=512, **kwargs):
238
+ """Returns {'pix_to_face': (1,H,W)} integer tensor of face indices (-1=bg)."""
239
+ settings = RasterizationSettings(image_size=(H, W))
240
+ rasterizer = MeshRasterizer(cameras=cameras, raster_settings=settings)
241
+ frags = rasterizer(meshes)
242
+ return {"pix_to_face": frags.pix_to_face[..., 0]} # (1,H,W)
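The orthographic world→NDC path in `_OrthoCamera.transform_points_ndc` (rotate into camera space with the row-vector convention, scale by focal length, flip Y) can be sanity-checked with plain numpy. This is an illustrative reimplementation, not the module's API:

```python
import numpy as np

def ortho_ndc(points, R, T, focal=1.0):
    """points: (N, 3) world coords; R, T in the row-vector convention used
    above (p_cam = p @ R + T). Returns (N, 3) NDC with Y flipped."""
    p_cam = points @ R + T
    return np.stack(
        [p_cam[:, 0] * focal, -p_cam[:, 1] * focal, p_cam[:, 2]], axis=1)

# Identity view at distance 2 along +Z: R = I, T = (0, 0, 2)
R = np.eye(3)
T = np.array([0.0, 0.0, 2.0])
pts = np.array([[0.0, 0.0, 0.0],    # origin: NDC center at depth 2
                [0.5, 0.25, 0.0]])  # off-center point: Y gets flipped
ndc = ortho_ndc(pts, R, T)
print(ndc[1])
```

The Y flip converts the Y-up camera frame to the Y-down image convention that nvdiffrast's rasterizer expects, which is why `_world_to_clip` repeats the same sign change.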