Anonymous Authors committed
Commit 9b9fe26 · 1 Parent(s): 446f008

Rename displayed model name to ViTeX-Edit-14B in the model card


The README heading, the file-tree comments, and the Composite-variant
section heading were all switched. The same change was applied to the
docstrings of inference_example.py and make_corp_baseline.py and to the
'Loading ... trained weights' log line. The repository URL, the bundled
weights filename (vitex_14b.safetensors), and the local clone target
directory are intentionally unchanged.

Files changed (3)
  1. README.md +5 -5
  2. inference_example.py +2 -2
  3. make_corp_baseline.py +3 -3
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 - diffusion
 ---
 
-# ViTeX-14B (Model & Inference code)
+# ViTeX-Edit-14B (Model & Inference code)
 
 🌐 [Project page](https://vitex-bench.github.io/)  · 
 📊 [Dataset](https://huggingface.co/datasets/ViTeX-Bench/ViTeX-Dataset)  · 
@@ -34,8 +34,8 @@ Open reference model for **video scene text editing**. Augments Wan2.1-VACE-14B
 
 ```
 .
-├── inference_example.py run ViTeX-14B on one (video, mask, glyph) tuple
-├── make_corp_baseline.py build the ViTeX-14B (Composite) variant
+├── inference_example.py run ViTeX-Edit-14B on one (video, mask, glyph) tuple
+├── make_corp_baseline.py build the ViTeX-Edit-14B (Composite) variant
 ├── vitex_14b.safetensors (8 GB, trained adapter weights)
 ├── diffsynth/ bundled inference library
 └── base_model/ (70 GB, frozen DiT + T5-XXL + Wan VAE)
@@ -68,9 +68,9 @@ python inference_example.py \
 --output out.mp4
 ```
 
-## Locality-preserving variant: ViTeX-14B (Composite)
+## Locality-preserving variant: ViTeX-Edit-14B (Composite)
 
-`make_corp_baseline.py` is a deterministic, training-free post-processing wrapper. Two per-frame operations: (1) Reinhard mean–variance LAB color matching against the source's local lighting; (2) signed-distance feathered alpha compositing onto the source. Inside the mask the result is the predicted glyphs (color-matched); outside the feather it is byte-identical to the source. Locality metrics rise to near-Identity while SeqAcc / CharAcc move within ~0.01 of raw ViTeX-14B.
+`make_corp_baseline.py` is a deterministic, training-free post-processing wrapper. Two per-frame operations: (1) Reinhard mean–variance LAB color matching against the source's local lighting; (2) signed-distance feathered alpha compositing onto the source. Inside the mask the result is the predicted glyphs (color-matched); outside the feather it is byte-identical to the source. Locality metrics rise to near-Identity while SeqAcc / CharAcc move within ~0.01 of raw ViTeX-Edit-14B.
 
 ```bash
 python make_corp_baseline.py \
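The second Composite operation named above, signed-distance feathered alpha compositing, can be sketched in plain NumPy. This is a minimal illustration, not the repo's code: the function name, the brute-force distance computation, and the feather width are assumptions (a real pipeline would use something like `cv2.distanceTransform`).

```python
import numpy as np

def feathered_composite(src, pred, mask, feather_px=4):
    """Feathered alpha compositing of `pred` onto `src`.

    alpha = 1 inside the boolean `mask`, falls linearly to 0 over
    `feather_px` pixels outside it, so every pixel beyond the feather
    stays byte-identical to the source frame.
    """
    h, w = mask.shape
    ys, xs = np.nonzero(mask)            # coordinates of mask pixels
    gy, gx = np.mgrid[0:h, 0:w]
    # Brute-force distance from every pixel to the nearest mask pixel
    # (fine for a sketch; use a distance transform at real resolutions).
    d = np.sqrt((gy[..., None] - ys) ** 2 + (gx[..., None] - xs) ** 2).min(-1)
    alpha = np.clip(1.0 - d / feather_px, 0.0, 1.0)[..., None]
    out = alpha * pred.astype(np.float64) + (1 - alpha) * src.astype(np.float64)
    # Where alpha is exactly 0, copy the source bytes verbatim.
    return np.where(alpha == 0.0, src, np.rint(out)).astype(src.dtype)
```

Inside the mask the result equals the prediction exactly; outside the feather it equals the source exactly, which is what the locality claim in the README rests on.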
inference_example.py CHANGED
@@ -1,5 +1,5 @@
 """
-ViTeX-14B inference example (self-contained).
+ViTeX-Edit-14B inference example (self-contained).
 
 Assumes you cloned this HuggingFace repo and are running this script from the
 repo root. The bundled `diffsynth/` library, `vitex_14b.safetensors` weights,
@@ -119,7 +119,7 @@ def build_pipeline(device="cuda:0"):
     redirect_common_files=False,
 )
 
-print(f"Loading ViTeX-14B trained weights from {ADAPTER_CKPT}")
+print(f"Loading ViTeX-Edit-14B trained weights from {ADAPTER_CKPT}")
 state = load_state_dict(ADAPTER_CKPT)
 res = pipe.vace.load_state_dict(state, strict=False)
 print(f" loaded {len(state)} keys (missing {len(res.missing_keys)}, unexpected {len(res.unexpected_keys)})")
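The weight-loading lines in the hunk above use PyTorch's standard non-strict `load_state_dict` pattern. A minimal sketch under stated assumptions: the helper name is illustrative, and a generic `nn.Module` stands in for `pipe.vace`.

```python
import torch
import torch.nn as nn

def load_adapter_nonstrict(module: nn.Module, state: dict):
    """Load adapter weights with strict=False.

    Keys in `state` with no matching parameter are ignored, and
    parameters the checkpoint does not cover keep their current values;
    both sets come back in the result so the caller can log coverage.
    """
    res = module.load_state_dict(state, strict=False)
    print(f"loaded {len(state)} keys "
          f"(missing {len(res.missing_keys)}, unexpected {len(res.unexpected_keys)})")
    return res
```

Loading an adapter-only checkpoint this way is what makes the 8 GB `vitex_14b.safetensors` usable on top of the frozen base model: base-model parameters simply show up as "missing" and are left untouched.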
make_corp_baseline.py CHANGED
@@ -1,7 +1,7 @@
-"""Build the ViTeX-14B (Composite) baseline.
+"""Build the ViTeX-Edit-14B (Composite) baseline.
 
 For each test clip:
-1. Read source video, ViTeX-14B prediction, and the dilated text mask.
+1. Read source video, ViTeX-Edit-14B prediction, and the dilated text mask.
 2. Color-correct the prediction inside the mask to match the source by
    Reinhard-style mean+std matching in LAB space, using a 20-px band just
    outside the mask as the reference (so the local lighting is captured).
@@ -148,7 +148,7 @@ def main():
 ap.add_argument("--records", required=True)
 ap.add_argument("--data_root", required=True)
 ap.add_argument("--pred_dir", required=True,
-    help="Directory of ViTeX-14B raw predictions (e.g., ViTeX-14B_orig)")
+    help="Directory of ViTeX-Edit-14B raw predictions (e.g., ViTeX-Edit-14B_orig)")
 ap.add_argument("--out_dir", required=True,
     help="Where the corp baseline mp4s are written")
 ap.add_argument("--target_frames", type=int, default=120)
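Step 2 of the docstring above, Reinhard-style mean+std matching in LAB space, reduces to a per-channel shift and scale. A minimal NumPy sketch, assuming the frames are already converted to LAB (e.g. via `cv2.cvtColor`) and `band` is the reference ring just outside the mask; the function name is illustrative, not the script's.

```python
import numpy as np

def reinhard_match(pred_lab, src_lab, mask, band):
    """Match mean/std of `pred_lab` inside `mask` to `src_lab` over `band`.

    `band` is a boolean map of the pixels just outside the mask (e.g. a
    20-px dilation minus the mask), so the statistics capture the
    source's local lighting rather than the whole frame.
    """
    out = pred_lab.astype(np.float64).copy()
    for c in range(3):                      # L, a, b channels independently
        p = out[..., c][mask]
        ref = src_lab[..., c][band].astype(np.float64)
        # Normalize the prediction's channel, then re-scale/shift to the
        # reference statistics (epsilon guards a flat prediction region).
        out[..., c][mask] = (p - p.mean()) / (p.std() + 1e-6) * ref.std() + ref.mean()
    return out
```

After this per-channel transfer, the masked region's mean and standard deviation equal those of the reference band, which is the whole of the "color-matched" step before compositing.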