Anonymous Authors committed on
Commit d268d2c · 1 parent: 343cdde

Point Model URL to renamed ViTeX-Edit-14B repo

HEADER_MD link, About-tab companion list, and submissions.jsonl
code_url fields now all use huggingface.co/ViTeX-Bench/ViTeX-Edit-14B.
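
A rename like this can be sanity-checked by scanning the JSONL records for any `code_url` that still equals the old repo URL. The following is only a minimal sketch: the helper name `stale_methods` and the sample rows are hypothetical and are not part of this commit; only the two URLs and the `method` and `code_url` field names come from the diff.

```python
import json

# URLs taken from the diff: the old repo name and the renamed one.
OLD_URL = "https://huggingface.co/ViTeX-Bench/ViTeX-14B"
NEW_URL = "https://huggingface.co/ViTeX-Bench/ViTeX-Edit-14B"


def stale_methods(jsonl_text, old_url=OLD_URL):
    """Return the "method" names of records whose code_url is still old_url.

    Hypothetical helper, not part of the repository.
    """
    stale = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if record.get("code_url") == old_url:
            stale.append(record.get("method"))
    return stale


# Two made-up records: one already updated, one still pointing at the old repo.
sample = "\n".join([
    json.dumps({"method": "ViTeX-Edit-14B", "code_url": NEW_URL}),
    json.dumps({"method": "Example (hypothetical stale row)", "code_url": OLD_URL}),
])
print(stale_methods(sample))
```

Passing the contents of `submissions.jsonl` as `jsonl_text` would be expected to return an empty list once every record has been updated.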

Files changed (3)
  1. README.md +2 -2
  2. app.py +1 -1
  3. submissions.jsonl +2 -2
README.md CHANGED
@@ -16,7 +16,7 @@ short_description: Public leaderboard for video scene text editing.
  🌐 [Project page](https://vitex-bench.github.io/)  · 
  📊 [Dataset](https://huggingface.co/datasets/ViTeX-Bench/ViTeX-Dataset)  · 
  🧪 [Benchmark code](https://huggingface.co/ViTeX-Bench/ViTeX-Bench)  · 
- 🤖 [Model & Inference code](https://huggingface.co/ViTeX-Bench/ViTeX-14B)  · 
+ 🤖 [Model & Inference code](https://huggingface.co/ViTeX-Bench/ViTeX-Edit-14B)  · 
  🏆 Leaderboard
 
  Public ranking for **video scene text editing** under the 13-metric, three-axis protocol of [ViTeX-Bench](https://huggingface.co/ViTeX-Bench/ViTeX-Bench).
@@ -34,4 +34,4 @@ The full thirteen-metric vector is the unit of report. The table is sorted by **
  - 🌐 **Project page:** https://vitex-bench.github.io/
  - 📊 **Dataset:** https://huggingface.co/datasets/ViTeX-Bench/ViTeX-Dataset
  - 🧪 **Benchmark code:** https://huggingface.co/ViTeX-Bench/ViTeX-Bench
- - 🤖 **Model & Inference code** (ViTeX-Edit-14B): https://huggingface.co/ViTeX-Bench/ViTeX-14B
+ - 🤖 **Model & Inference code** (ViTeX-Edit-14B): https://huggingface.co/ViTeX-Bench/ViTeX-Edit-14B
app.py CHANGED
@@ -720,7 +720,7 @@ HEADER_MD = (
  "🌐 [Project page](https://vitex-bench.github.io/)  ·  "
  "📊 [Dataset](https://huggingface.co/datasets/ViTeX-Bench/ViTeX-Dataset)  ·  "
  "🧪 [Benchmark code](https://huggingface.co/ViTeX-Bench/ViTeX-Bench)  ·  "
- "🤖 [Model & Inference code](https://huggingface.co/ViTeX-Bench/ViTeX-14B)  ·  "
+ "🤖 [Model & Inference code](https://huggingface.co/ViTeX-Bench/ViTeX-Edit-14B)  ·  "
  "🏆 **Leaderboard**\n\n"
  "Public ranking for **video scene text editing** under the 13-metric ViTeX-Bench "
  "protocol. Methods are ranked by **TextScore** = ∛(SeqAcc · CharAcc · TTS), "
submissions.jsonl CHANGED
@@ -1,6 +1,6 @@
  {"method": "TextCtrl", "family": "A — per-frame image editor", "organization": "Zeng et al., 2024", "paper_url": "https://arxiv.org/abs/2410.10133", "code_url": "https://github.com/weichaozeng/TextCtrl", "submitter": "admin", "submitted_at": "2026-05-04 00:00:00 UTC", "approved_at": "2026-05-04 00:00:00 UTC", "status": "approved", "TextScore": 0.5623872104295862, "SeqAcc": 0.47475474732914913, "CharAcc": 0.733502509720617, "TTS": 0.5107817672969821, "Flicker_full": 3.8040257787170293, "Flicker_crop": 4.287049075959627, "Warp_full": 1.5883564410079873, "Warp_crop": 2.087607682354121, "MUSIQ_full": 70.32216657276115, "MUSIQ_crop": 42.77286880553454, "PSNR_loc": 41.143448625451185, "SSIM_loc": 0.9944056776770688, "LPIPS_loc": 0.007969770003940647, "DreamSim_loc": 0.0042883308893049855, "n_clips": 157}
- {"method": "ViTeX-Edit-14B (Composite)", "family": "Reference", "organization": "Anonymous (NeurIPS 2026 D&B submission)", "paper_url": "", "code_url": "https://huggingface.co/ViTeX-Bench/ViTeX-14B", "submitter": "admin", "submitted_at": "2026-05-04 00:00:00 UTC", "approved_at": "2026-05-04 00:00:00 UTC", "status": "approved", "TextScore": 0.5409733932921607, "SeqAcc": 0.34489601685376864, "CharAcc": 0.6892114595238374, "TTS": 0.6660196589960885, "Flicker_full": 3.730421558994263, "Flicker_crop": 3.826727295895001, "Warp_full": 1.5060892347506618, "Warp_crop": 1.5591366257464094, "MUSIQ_full": 70.27118598215141, "MUSIQ_crop": 44.944762801223376, "PSNR_loc": 42.950774276028774, "SSIM_loc": 0.9925085173386224, "LPIPS_loc": 0.005916571278483934, "DreamSim_loc": 0.0023257043362929306, "n_clips": 157}
- {"method": "ViTeX-Edit-14B", "family": "Reference", "organization": "Anonymous (NeurIPS 2026 D&B submission)", "paper_url": "", "code_url": "https://huggingface.co/ViTeX-Bench/ViTeX-14B", "submitter": "admin", "submitted_at": "2026-05-04 00:00:00 UTC", "approved_at": "2026-05-04 00:00:00 UTC", "status": "approved", "TextScore": 0.5337670099905598, "SeqAcc": 0.34121246792876553, "CharAcc": 0.6879770723988642, "TTS": 0.6478229475144797, "Flicker_full": 3.2739301912212606, "Flicker_crop": 3.424670762474799, "Warp_full": 1.5515188207705402, "Warp_crop": 1.5304199630586097, "MUSIQ_full": 69.63500067777694, "MUSIQ_crop": 43.52961571422055, "PSNR_loc": 29.077432591849323, "SSIM_loc": 0.9512201399006257, "LPIPS_loc": 0.06030903690911814, "DreamSim_loc": 0.023522706465862867, "n_clips": 157}
+ {"method": "ViTeX-Edit-14B (Composite)", "family": "Reference", "organization": "Anonymous (NeurIPS 2026 D&B submission)", "paper_url": "", "code_url": "https://huggingface.co/ViTeX-Bench/ViTeX-Edit-14B", "submitter": "admin", "submitted_at": "2026-05-04 00:00:00 UTC", "approved_at": "2026-05-04 00:00:00 UTC", "status": "approved", "TextScore": 0.5409733932921607, "SeqAcc": 0.34489601685376864, "CharAcc": 0.6892114595238374, "TTS": 0.6660196589960885, "Flicker_full": 3.730421558994263, "Flicker_crop": 3.826727295895001, "Warp_full": 1.5060892347506618, "Warp_crop": 1.5591366257464094, "MUSIQ_full": 70.27118598215141, "MUSIQ_crop": 44.944762801223376, "PSNR_loc": 42.950774276028774, "SSIM_loc": 0.9925085173386224, "LPIPS_loc": 0.005916571278483934, "DreamSim_loc": 0.0023257043362929306, "n_clips": 157}
+ {"method": "ViTeX-Edit-14B", "family": "Reference", "organization": "Anonymous (NeurIPS 2026 D&B submission)", "paper_url": "", "code_url": "https://huggingface.co/ViTeX-Bench/ViTeX-Edit-14B", "submitter": "admin", "submitted_at": "2026-05-04 00:00:00 UTC", "approved_at": "2026-05-04 00:00:00 UTC", "status": "approved", "TextScore": 0.5337670099905598, "SeqAcc": 0.34121246792876553, "CharAcc": 0.6879770723988642, "TTS": 0.6478229475144797, "Flicker_full": 3.2739301912212606, "Flicker_crop": 3.424670762474799, "Warp_full": 1.5515188207705402, "Warp_crop": 1.5304199630586097, "MUSIQ_full": 69.63500067777694, "MUSIQ_crop": 43.52961571422055, "PSNR_loc": 29.077432591849323, "SSIM_loc": 0.9512201399006257, "LPIPS_loc": 0.06030903690911814, "DreamSim_loc": 0.023522706465862867, "n_clips": 157}
  {"method": "VideoPainter", "family": "C — mask-conditioned video inpainting", "organization": "Bian et al., 2025", "paper_url": "https://arxiv.org/abs/2503.05639", "code_url": "https://github.com/TencentARC/VideoPainter", "submitter": "admin", "submitted_at": "2026-05-04 00:00:00 UTC", "approved_at": "2026-05-04 00:00:00 UTC", "status": "approved", "TextScore": 0.51506756458757, "SeqAcc": 0.364495972867382, "CharAcc": 0.6187952902754302, "TTS": 0.6058329243574021, "Flicker_full": 2.383399970485585, "Flicker_crop": 2.619418716169186, "Warp_full": 2.9276182078188366, "Warp_crop": 3.3452600061138558, "MUSIQ_full": 67.16001260609637, "MUSIQ_crop": 40.58771384010968, "PSNR_loc": 28.555957743164843, "SSIM_loc": 0.9151628155450829, "LPIPS_loc": 0.10402342236567201, "DreamSim_loc": 0.023908750937496278, "n_clips": 157}
  {"method": "FLUX-Text", "family": "A — per-frame image editor", "organization": "Chen et al., 2025", "paper_url": "https://arxiv.org/abs/2505.03329", "code_url": "https://github.com/AMAP-ML/FluxText", "submitter": "admin", "submitted_at": "2026-05-04 00:00:00 UTC", "approved_at": "2026-05-04 00:00:00 UTC", "status": "approved", "TextScore": 0.5022999045945803, "SeqAcc": 0.5283744349135131, "CharAcc": 0.7367738685630717, "TTS": 0.32554668434702094, "Flicker_full": 5.114334507312302, "Flicker_crop": 14.81406893996351, "Warp_full": 3.0267581734528144, "Warp_crop": 13.009849474748862, "MUSIQ_full": 70.25921666161523, "MUSIQ_crop": 43.85439727157533, "PSNR_loc": 31.488873457756767, "SSIM_loc": 0.974685608615182, "LPIPS_loc": 0.028573400793733536, "DreamSim_loc": 0.012038936603600812, "n_clips": 157}
  {"method": "RS-STE", "family": "A — per-frame image editor", "organization": "Zhao et al., 2025", "paper_url": "https://arxiv.org/abs/2503.17774", "code_url": "https://github.com/honglei-zhao/RS-STE", "submitter": "admin", "submitted_at": "2026-05-04 00:00:00 UTC", "approved_at": "2026-05-04 00:00:00 UTC", "status": "approved", "TextScore": 0.4907994847015915, "SeqAcc": 0.3539735290248826, "CharAcc": 0.6258597181299173, "TTS": 0.5336598236730271, "Flicker_full": 3.728183996539611, "Flicker_crop": 3.66286053942277, "Warp_full": 1.6050723235894067, "Warp_crop": 1.8147908492065754, "MUSIQ_full": 69.57172569297175, "MUSIQ_crop": 34.26484699258097, "PSNR_loc": 37.00242438437832, "SSIM_loc": 0.9830883838061237, "LPIPS_loc": 0.02354780357549038, "DreamSim_loc": 0.007322213357421243, "n_clips": 157}
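
The HEADER_MD string in app.py states that methods are ranked by TextScore = ∛(SeqAcc · CharAcc · TTS), which is the geometric mean of the three text metrics. The sketch below recomputes that value from a row's stored fields and compares it with the stored `TextScore`. The function names and the tolerance are assumptions made for illustration, and the diff does not show how the leaderboard itself computes the score (for example, whether it aggregates per clip first); the numbers are copied from the TextCtrl row above.

```python
import math


def text_score(seq_acc, char_acc, tts):
    """Cube root of the product, i.e. the geometric mean of the three metrics."""
    return (seq_acc * char_acc * tts) ** (1.0 / 3.0)


def check_row(row, tol=1e-6):
    """True if the stored TextScore matches the recomputed one within tol.

    tol is an assumed tolerance, not a value taken from the repository.
    """
    recomputed = text_score(row["SeqAcc"], row["CharAcc"], row["TTS"])
    return math.isclose(recomputed, row["TextScore"], rel_tol=0.0, abs_tol=tol)


# Fields copied from the TextCtrl record in submissions.jsonl.
textctrl = {
    "TextScore": 0.5623872104295862,
    "SeqAcc": 0.47475474732914913,
    "CharAcc": 0.733502509720617,
    "TTS": 0.5107817672969821,
}
print(text_score(textctrl["SeqAcc"], textctrl["CharAcc"], textctrl["TTS"]))
print(check_row(textctrl))
```

Because the score is a product, a low value on any one metric pulls the ranking down sharply: FLUX-Text has the highest SeqAcc and CharAcc in the table but ranks below VideoPainter because its TTS is 0.33.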