Image-to-Video
Diffusers
wruisi committed
Commit fdbd5d3 · 1 Parent(s): 2b16414

Update README

Files changed (1)
  1. README.md +5 -20
README.md CHANGED

@@ -55,10 +55,10 @@ The model was presented in the paper [A Very Big Video Reasoning Suite](https://
  | [VBVR-Wan2.2](https://huggingface.co/Video-Reason/VBVR-Wan2.2) | Wan2.2-I2V-A14B | Diffusers format |
  | [VBVR-Wan2.1-diffsynth](https://huggingface.co/Video-Reason/VBVR-Wan2.1-diffsynth) | Wan2.1-I2V-14B-720P | DiffSynth LoRA format |
  | [VBVR-Wan2.2-diffsynth](https://huggingface.co/Video-Reason/VBVR-Wan2.2-diffsynth) | Wan2.2-I2V-A14B | DiffSynth LoRA format |
- | [VBVR-LTX2.3-diffsynth](https://huggingface.co/Video-Reason/VBVR-LTX2.3-diffsynth) | LTX-Video-2.3 | DiffSynth LoRA format |
+ | [VBVR-LTX2.3-diffsynth](https://huggingface.co/Video-Reason/VBVR-LTX2.3-diffsynth) | LTX-2.3 | DiffSynth LoRA format |

  ## Release Information
- VBVR-Wan2.1 is trained from Wan2.1-I2V-14B-720P without architectural modifications, as the goal of VBVR is to *investigate data scaling behavior* and provide *strong baseline models* for the video reasoning research community. Leveraging the VBVR-Dataset, which constitutes one of the largest video reasoning datasets to date, the VBVR model family achieved the highest scores on VBVR-Bench.
+ VBVR-LTX2.3 is trained from LTX-2.3 without architectural modifications, as the goal of VBVR is to *investigate data scaling behavior* and provide *strong baseline models* for the video reasoning research community. Leveraging the VBVR-Dataset, which constitutes one of the largest video reasoning datasets to date, the VBVR model family achieved the highest scores on VBVR-Bench.

  In this release, we present
  [**VBVR-Wan2.1**](https://huggingface.co/Video-Reason/VBVR-Wan2.1) (Diffusers format),

@@ -177,24 +177,9 @@ In this release, we present

  ## QuickStart

- ### Installation
-
- We recommend using [uv](https://docs.astral.sh/uv/) to manage the environment.
-
- > uv installation guide: <https://docs.astral.sh/uv/getting-started/installation/#installing-uv>
-
- ```bash
- pip install torch>=2.4.0 torchvision>=0.19.0 transformers Pillow huggingface_hub[cli]
- uv pip install git+https://github.com/huggingface/diffusers
- ```
-
- ### Example Code
-
- ```bash
- huggingface-cli download Video-Reason/VBVR-Wan2.1 --local-dir ./VBVR-Wan2.1
- python example.py \
-     --model_path ./VBVR-Wan2.1
- ```
+ ### Inference
+ For running inference, please refer to the [**official guide**](https://github.com/Video-Reason/VBVR-Wan2.2?tab=readme-ov-file#ltx-23-inference) in the VBVR-Wan2.2 GitHub repository.
+ That repository contains the latest instructions, configurations, and examples for performing inference with the VBVR family of models.

  ## Citation