Update README.md
README.md
CHANGED
@@ -39,7 +39,7 @@ Or use ComfyUI wrapper in our [github repo](https://github.com/IamCreateAI/Ruyi-
 
 ## Model Architecture
 
-Ruyi-Mini-7B is an advanced image-to-video model with about 7.1 billion parameters. The model architecture is modified
+Ruyi-Mini-7B is an advanced image-to-video model with about 7.1 billion parameters. The model architecture is modified from [EasyAnimate V4 model](https://github.com/aigc-apps/EasyAnimate), whose transformer module is inherited from [HunyuanDiT](https://github.com/Tencent/HunyuanDiT). It comprises three key components:
 1. Causal VAE Module: Handles video compression and decompression. It reduces spatial resolution to 1/8 and temporal resolution to 1/4, with each latent pixel represented as 16 BF16 channels after compression.
 2. Diffusion Transformer Module: Generates compressed video data using 3D full attention, with:
 - 2D Normalized-RoPE for spatial dimensions;
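The compression figures quoted for the VAE module (1/8 spatial, 1/4 temporal, 16 BF16 channels per latent pixel) can be turned into a rough latent-size estimate. The following is only a sketch of that arithmetic, not code from the Ruyi repository: `latent_shape` and `LatentShape` are hypothetical names, and it assumes the frame count, height, and width divide evenly by the stated factors, since the README does not say how other sizes or the first frame are handled.

```python
from dataclasses import dataclass

# Factors quoted in the README text; everything else here is illustrative.
SPATIAL_FACTOR = 8      # spatial resolution reduced to 1/8
TEMPORAL_FACTOR = 4     # temporal resolution reduced to 1/4
LATENT_CHANNELS = 16    # channels per latent pixel
BYTES_PER_BF16 = 2      # a BF16 value occupies 16 bits


@dataclass(frozen=True)
class LatentShape:
    """Hypothetical container for the compressed video dimensions."""
    frames: int
    height: int
    width: int
    channels: int

    def size_bytes(self) -> int:
        # Total storage for the latent tensor in BF16.
        return (self.frames * self.height * self.width
                * self.channels * BYTES_PER_BF16)


def latent_shape(frames: int, height: int, width: int) -> LatentShape:
    """Apply the stated compression factors to a video's dimensions.

    Assumption: inputs are exact multiples of the factors; rounding and
    padding behaviour is not specified in the source, so it is rejected.
    """
    if frames % TEMPORAL_FACTOR or height % SPATIAL_FACTOR or width % SPATIAL_FACTOR:
        raise ValueError("dimensions must be multiples of the compression factors")
    return LatentShape(
        frames=frames // TEMPORAL_FACTOR,
        height=height // SPATIAL_FACTOR,
        width=width // SPATIAL_FACTOR,
        channels=LATENT_CHANNELS,
    )


if __name__ == "__main__":
    shape = latent_shape(frames=24, height=512, width=512)
    print(shape, shape.size_bytes())
```

Under these assumptions, a 24-frame 512x512 clip maps to a 6x64x64 latent grid with 16 channels.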