Upload folder using huggingface_hub
- README.md +63 -0
- action_processor/config.json +8 -0
- config.json +11 -0
- crossattn_adapter/config.json +5 -0
- crossattn_adapter/model.safetensors +3 -0
- lam/config.json +16 -0
- lam/model.safetensors +3 -0
- scheduler/scheduler_config.json +18 -0
- text_encoder/.gitattributes +35 -0
- text_encoder/README.md +377 -0
- text_encoder/chat_template.json +3 -0
- text_encoder/config.json +61 -0
- text_encoder/generation_config.json +12 -0
- text_encoder/model-00001-of-00004.safetensors +3 -0
- text_encoder/model-00002-of-00004.safetensors +3 -0
- text_encoder/model-00003-of-00004.safetensors +3 -0
- text_encoder/model-00004-of-00004.safetensors +3 -0
- text_encoder/model.safetensors.index.json +736 -0
- text_encoder/preprocessor_config.json +19 -0
- text_encoder/tokenizer.json +0 -0
- text_encoder/tokenizer_config.json +207 -0
- transformer/model-00001-of-00016.safetensors +3 -0
- transformer/model-00002-of-00016.safetensors +3 -0
- transformer/model-00003-of-00016.safetensors +3 -0
- transformer/model-00004-of-00016.safetensors +3 -0
- transformer/model-00005-of-00016.safetensors +3 -0
- transformer/model-00006-of-00016.safetensors +3 -0
- transformer/model-00007-of-00016.safetensors +3 -0
- transformer/model-00008-of-00016.safetensors +3 -0
- transformer/model-00009-of-00016.safetensors +3 -0
- transformer/model-00010-of-00016.safetensors +3 -0
- transformer/model-00011-of-00016.safetensors +3 -0
- transformer/model-00012-of-00016.safetensors +3 -0
- transformer/model-00013-of-00016.safetensors +3 -0
- transformer/model-00014-of-00016.safetensors +3 -0
- transformer/model-00015-of-00016.safetensors +3 -0
- transformer/model-00016-of-00016.safetensors +3 -0
- transformer/model.safetensors.index.json +742 -0
- vae/config.json +56 -0
- vae/diffusion_pytorch_model.safetensors +3 -0
README.md
ADDED
@@ -0,0 +1,63 @@
---
license: other
license_name: nvidia-open-model-license
license_link: https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/
tags:
- robotics
- video-generation
- diffusion
- action-conditioned
- dreamdojo
- cosmos-predict2.5
library_name: diffusers
pipeline_tag: video-to-video
---

# DreamDojo-G1-14B-Diffusers

Large model fine-tuned on G1. Part of the [DreamDojo](https://github.com/NVIDIA/DreamDojo) model family.

| | |
|---|---|
| **Size** | 14B |
| **Stage** | Post-training |
| **Architecture** | DiT (Diffusion Transformer) with AdaLN-LoRA |
| **Base** | Cosmos Predict 2.5 |

## Checkpoint Structure

```
DreamDojo-G1-14B-Diffusers/
├── transformer/          # DiT backbone (sharded safetensors)
├── crossattn_adapter/    # Text-to-DiT projection (100352 → 1024)
├── vae/                  # AutoencoderKLWan (standard diffusers)
├── lam/                  # Latent Action Model (710M params)
├── text_encoder/         # Cosmos-Reason1-7B
├── scheduler/            # FlowMatchEulerDiscreteScheduler
├── action_processor/     # DreamDojo-specific config
└── config.json
```

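The layout above can be sanity-checked on a local download before any weights are loaded. The sketch below is an illustration, not part of the DreamDojo codebase. It assumes the top-level `config.json` maps each component name to its subfolder and treats `_class_name` and `model_name` as metadata; the helper name `missing_components` is hypothetical.

```python
import json
from pathlib import Path

# Keys in config.json that describe the pipeline rather than a component folder.
METADATA_KEYS = {"_class_name", "model_name"}


def missing_components(checkpoint_dir):
    """Return the component subfolders named in config.json that are absent on disk."""
    root = Path(checkpoint_dir)
    config = json.loads((root / "config.json").read_text())
    missing = []
    for key, subfolder in config.items():
        if key in METADATA_KEYS:
            continue
        if not (root / subfolder).is_dir():
            missing.append(subfolder)
    return sorted(missing)
```

Calling `missing_components("/path/to/DreamDojo-G1-14B-Diffusers")` returns an empty list when every component folder is present, which makes an incomplete download easy to spot.
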
## Architecture

| | 14B |
|--|------|
| Model channels | 5120 |
| Transformer blocks | 36 |
| Attention heads | 40 |
| Patch size (spatial / temporal) | 2 / 1 |
| Action dim | 384 (unified) |

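Two quantities follow from the table by simple arithmetic. The sketch below assumes the model channels are split evenly across attention heads (the usual multi-head convention) and that the latent video is cut into non-overlapping patches. The latent grid shape in the example is hypothetical and is not taken from this repository.

```python
MODEL_CHANNELS = 5120
ATTENTION_HEADS = 40
SPATIAL_PATCH = 2
TEMPORAL_PATCH = 1


def head_dim(channels=MODEL_CHANNELS, heads=ATTENTION_HEADS):
    """Width of one attention head, assuming an even split of the channels."""
    if channels % heads:
        raise ValueError("channels must be divisible by heads")
    return channels // heads


def token_count(latent_frames, latent_height, latent_width,
                spatial_patch=SPATIAL_PATCH, temporal_patch=TEMPORAL_PATCH):
    """Number of transformer tokens for a latent grid cut into non-overlapping patches."""
    if latent_frames % temporal_patch or latent_height % spatial_patch or latent_width % spatial_patch:
        raise ValueError("latent grid must be divisible by the patch size")
    return ((latent_frames // temporal_patch)
            * (latent_height // spatial_patch)
            * (latent_width // spatial_patch))


print(head_dim())              # 128
print(token_count(4, 30, 40))  # 1200 tokens for a hypothetical 4 x 30 x 40 latent grid
```
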
## Citation

```bibtex
@article{dreamdojo2025,
  title={DreamDojo: Advancing Real-World Robot Policies Through Generated Interactive Environments},
  author={NVIDIA},
  year={2025}
}
```

## License

Please refer to the [NVIDIA DreamDojo](https://github.com/NVIDIA/DreamDojo) repository for license terms.
action_processor/config.json
ADDED
@@ -0,0 +1,8 @@
{
  "_class_name": "DreamDojoActionProcessorConfig",
  "_diffusers_version": "0.36.0",
  "cfg_text_dropout": 0.2,
  "cfg_video_dropout": 0.2,
  "seed": 42,
  "train_time_distribution": "logitnormal"
}
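The fields `cfg_text_dropout`, `cfg_video_dropout`, and `train_time_distribution` read like training-time settings. The sketch below shows one common interpretation, stated as an assumption rather than DreamDojo's actual implementation: each conditioning signal is dropped independently with the given probability (the usual recipe for classifier-free guidance), and `"logitnormal"` means the training time is drawn as t = sigmoid(z) with z normally distributed. The function names and the zero-mean, unit-variance defaults are hypothetical.

```python
import math
import random

CFG_TEXT_DROPOUT = 0.2
CFG_VIDEO_DROPOUT = 0.2
SEED = 42


def sample_logit_normal(rng, mean=0.0, std=1.0):
    """Draw t in (0, 1) by passing a Gaussian sample through a sigmoid."""
    z = rng.gauss(mean, std)
    return 1.0 / (1.0 + math.exp(-z))


def sample_training_conditions(rng):
    """Decide, for one training example, which conditions to drop and which time to use."""
    return {
        "drop_text": rng.random() < CFG_TEXT_DROPOUT,
        "drop_video": rng.random() < CFG_VIDEO_DROPOUT,
        "t": sample_logit_normal(rng),
    }


rng = random.Random(SEED)
batch = [sample_training_conditions(rng) for _ in range(4)]
```

Seeding a dedicated `random.Random` instance keeps the draws reproducible without touching global random state.
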
config.json
ADDED
@@ -0,0 +1,11 @@
{
  "_class_name": "DreamDojoPipeline",
  "model_name": "DreamDojo-G1-14B-Diffusers",
  "transformer": "transformer",
  "crossattn_adapter": "crossattn_adapter",
  "vae": "vae",
  "lam": "lam",
  "text_encoder": "text_encoder",
  "scheduler": "scheduler",
  "action_processor": "action_processor"
}
crossattn_adapter/config.json
ADDED
@@ -0,0 +1,5 @@
{
  "model_type": "dreamdojo_crossattn_adapter",
  "in_channels": 100352,
  "out_channels": 1024
}
crossattn_adapter/model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8f0a86014aea78b4e03165b22429cb156871f658020285858f40ea9fe658771d
size 205523136
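The adapter's config and its file size can be checked against each other. The sketch below assumes, without confirmation from the source, that the adapter is a single linear layer with a bias, stored in bfloat16 at 2 bytes per value. Under that assumption the tensors account for all but 192 bytes of the 205,523,136-byte file, a remainder small enough to be the safetensors header. Separately, 100352 factors as 28 × 3584, which would match concatenating the hidden states of all 28 decoder layers of a Qwen2.5-VL-7B-style encoder (hidden size 3584); that reading is an inference from the numbers, not something this repository states.

```python
IN_CHANNELS = 100352        # from crossattn_adapter/config.json
OUT_CHANNELS = 1024         # from crossattn_adapter/config.json
BYTES_PER_BF16 = 2          # assumption: weights stored as bfloat16
LFS_SIZE_BYTES = 205523136  # from the model.safetensors LFS pointer


def linear_layer_bytes(in_features, out_features, bias=True, bytes_per_value=BYTES_PER_BF16):
    """Storage needed for a dense layer's weight matrix and optional bias vector."""
    params = in_features * out_features + (out_features if bias else 0)
    return params * bytes_per_value


tensor_bytes = linear_layer_bytes(IN_CHANNELS, OUT_CHANNELS)
remainder = LFS_SIZE_BYTES - tensor_bytes
print(tensor_bytes, remainder)  # 205522944 192
```
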
lam/config.json
ADDED
@@ -0,0 +1,16 @@
{
  "architectures": [
    "DreamDojoLAM"
  ],
  "dec_blocks": 24,
  "dropout": 0.0,
  "dtype": "bfloat16",
  "enc_blocks": 24,
  "in_dim": 3,
  "latent_dim": 32,
  "model_dim": 1024,
  "model_type": "dreamdojo_lam",
  "num_heads": 16,
  "patch_size": 16,
  "transformers_version": "4.57.3"
}
lam/model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6b9a861d5ec2ded0283ea6f59fdf2fa8545b7e205ceb539f9188243dd8b16bd6
size 1419658488
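The pointer size offers a rough cross-check on the "710M params" figure quoted for the Latent Action Model in the main README. Assuming every weight is stored as bfloat16 (the `dtype` in `lam/config.json`) and ignoring the small safetensors header, the parameter count is approximately the file size divided by two.

```python
LAM_FILE_BYTES = 1419658488  # from the lam/model.safetensors LFS pointer
BYTES_PER_BF16 = 2           # assumption: all tensors stored as bfloat16


def approx_param_count(file_bytes, bytes_per_value=BYTES_PER_BF16):
    """Estimate the parameter count from file size, ignoring header overhead."""
    return file_bytes // bytes_per_value


params = approx_param_count(LAM_FILE_BYTES)
print(f"{params / 1e6:.1f}M")  # 709.8M
```
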
scheduler/scheduler_config.json
ADDED
@@ -0,0 +1,18 @@
{
  "_class_name": "FlowMatchEulerDiscreteScheduler",
  "_diffusers_version": "0.36.0",
  "base_image_seq_len": 256,
  "base_shift": 0.5,
  "invert_sigmas": false,
  "max_image_seq_len": 4096,
  "max_shift": 1.15,
  "num_train_timesteps": 1000,
  "shift": 5.0,
  "shift_terminal": null,
  "stochastic_sampling": false,
  "time_shift_type": "exponential",
  "use_beta_sigmas": false,
  "use_dynamic_shifting": false,
  "use_exponential_sigmas": false,
  "use_karras_sigmas": false
}
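With `use_dynamic_shifting` set to `false`, `FlowMatchEulerDiscreteScheduler` applies a fixed time shift to its sigma schedule: sigma' = shift * sigma / (1 + (shift - 1) * sigma). The standalone sketch below reproduces that formula for `shift = 5.0` without importing diffusers. The evenly spaced five-step schedule is an arbitrary illustration, not a recommended inference setting, and the helper names are hypothetical.

```python
SHIFT = 5.0  # "shift" in scheduler_config.json


def shift_sigma(sigma, shift=SHIFT):
    """Apply the static flow-matching time shift to one sigma in [0, 1]."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(num_steps, shift=SHIFT):
    """Shift an evenly spaced sigma schedule running from 1 down to 1/num_steps."""
    sigmas = [1.0 - i / num_steps for i in range(num_steps)]
    return [shift_sigma(s, shift) for s in sigmas]


schedule = shifted_schedule(5)
```

A shift greater than 1 pushes every intermediate sigma toward 1, so more of the sampling steps are spent at high noise levels; for example, sigma = 0.5 maps to 5/6.
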
text_encoder/.gitattributes
ADDED
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
text_encoder/README.md
ADDED
@@ -0,0 +1,377 @@
---
license: other
license_name: nvidia-open-model-license
license_link: >-
  https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license
datasets:
- nvidia/Cosmos-Reason1-SFT-Dataset
- nvidia/Cosmos-Reason1-RL-Dataset
- nvidia/Cosmos-Reason1-Benchmark
library_name: transformers
language:
- en
base_model:
- Qwen/Qwen2.5-VL-7B-Instruct
tags:
- nvidia
- cosmos
pipeline_tag: image-text-to-text
---

# **Cosmos-Reason1: Physical AI Common Sense and Embodied Reasoning Models**

[**Cosmos**](https://huggingface.co/collections/nvidia/cosmos-reason1-67c9e926206426008f1da1b7) | [**Code**](https://github.com/nvidia-cosmos/cosmos-reason1) | [**Paper**](https://arxiv.org/abs/2503.15558) | [**Paper Website**](https://research.nvidia.com/labs/dir/cosmos-reason1)

# Model Overview

## Description:

NVIDIA Cosmos Reason is an open, customizable, 7B-parameter reasoning vision language model (VLM) for physical AI and robotics. It enables robots and vision AI agents to reason like humans, using prior knowledge, physics understanding, and common sense to understand and act in the real world. The model understands space, time, and fundamental physics, and can serve as a planning model that reasons about which steps an embodied agent might take next.

Cosmos Reason excels at navigating the long tail of diverse physical-world scenarios with spatial-temporal understanding. It is post-trained on physical common sense and embodied reasoning data using supervised fine-tuning and reinforcement learning. It uses chain-of-thought reasoning to understand world dynamics without human annotations.

Given a video or image and a text prompt, the model first converts the visual input into tokens using a vision encoder and a projector. These visual tokens are combined with the text prompt and fed into the core model, which uses a mix of LLM modules and techniques. This enables the model to think step by step and provide detailed, logical responses.

Cosmos Reason can be used for robotics and physical AI applications including:
- Data curation and annotation: Enable developers to automate high-quality curation and annotation of massive, diverse training datasets.
- Robot planning and reasoning: Act as the brain for deliberate, methodical decision-making in a robot vision language action (VLA) model. Robots such as humanoids and autonomous vehicles can interpret their environments and, given complex commands, break them down into tasks and execute them using common sense, even in unfamiliar environments.
- Video analytics AI agents: Extract valuable insights and perform root-cause analysis on massive volumes of video data. These agents can be used to analyze and understand recorded or live video streams across city and industrial operations.

The model is ready for commercial use.

**Model Developer**: NVIDIA

## Model Versions

Cosmos-Reason1 includes the following model:

- [Cosmos-Reason1-7B](https://huggingface.co/nvidia/Cosmos-Reason1-7B): Given a text prompt and an input video, think and generate the answer with respect to the input text prompt and video.

### License:

This model is released under the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license). Additional Information: [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md).

For a custom license, please contact [cosmos-license@nvidia.com](mailto:cosmos-license@nvidia.com).

Under the NVIDIA Open Model License, NVIDIA confirms:

* Models are commercially usable.
* You are free to create and distribute Derivative Models.
* NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models.

**Important Note**: If You bypass, disable, reduce the efficacy of, or circumvent any technical limitation, safety guardrail or associated safety guardrail hyperparameter, encryption, security, digital rights management, or authentication mechanism (collectively “Guardrail”) contained in the Model without a substantially similar Guardrail appropriate for your use case, your rights under this Agreement [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license) will automatically terminate.

### Deployment Geography:

Global

### Use Case:

Physical AI: Space, time, fundamental physics understanding and embodied reasoning, encompassing robotics and autonomous vehicles (AV).

### Release Date:

* GitHub: [05/17/2025](https://github.com/nvidia-cosmos/cosmos-reason1)
* Hugging Face:
  * [08/01/2025](https://huggingface.co/nvidia/Cosmos-Reason1-7B/commit/0caf724f837efea5e25bf6d5818dcdeec0a36604). Shipped improvements including captions with temporal timestamps and Set-of-Mark prompting.
  * [06/10/2025](https://huggingface.co/nvidia/Cosmos-Reason1-7B/commit/2464fff43c5c0bfb1916ac8c009feda4aed81be9). Enhanced critic capability for physical plausibility.
  * [05/17/2025](https://huggingface.co/nvidia/Cosmos-Reason1-7B/commit/098a5bb62a1f4fc05e5c4ac89aae8005e301aa18). Initial release.

## Model Architecture:

Architecture Type: A multimodal LLM consisting of a Vision Transformer (ViT) as the vision encoder and a dense Transformer as the LLM.
Network Architecture: Qwen2.5-VL-7B-Instruct.

Cosmos-Reason1-7B is post-trained from [Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) and follows the same model architecture.

**Number of model parameters:**

Cosmos-Reason1-7B:<br>
* Vision Transformer (ViT): 675.76M (675,759,104)
* Language Model (LLM): 7.07B (7,070,619,136)
* Other components (output projection layer): 545.00M (544,997,376)

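Summing the three listed components gives the full checkpoint size. The short calculation below uses only the counts quoted above; the dictionary keys are illustrative labels, not names from the model code.

```python
# Parameter counts as listed in the model card.
COMPONENT_PARAMS = {
    "vision_transformer": 675_759_104,
    "language_model": 7_070_619_136,
    "output_projection": 544_997_376,
}

total = sum(COMPONENT_PARAMS.values())
print(f"{total:,} ({total / 1e9:.2f}B)")  # 8,291,375,616 (8.29B)
```

The total comes to roughly 8.29B parameters, of which the language model contributes about 7.07B.
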
## Input

**Input Type(s)**: Text+Video/Image

**Input Format(s)**:
* Text: String
* Video: mp4
* Image: jpg

**Input Parameters**:
* Text: One-dimensional (1D)
* Video: Three-dimensional (3D)
* Image: Two-dimensional (2D)

**Other Properties Related to Input**:
* Use `FPS=4` for input video to match the training setup.
* Append `Answer the question in the following format: <think>\nyour reasoning\n</think>\n\n<answer>\nyour answer\n</answer>.` to the system prompt to encourage a long chain-of-thought reasoning response.

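When the system prompt requests the `<think>` / `<answer>` format, downstream code usually needs to separate the two parts. The helper below is a minimal sketch that is not part of the Cosmos-Reason1 tooling. It assumes the model follows the requested tags and returns `None` for any part that is missing, for example when the response was truncated. The name `split_reasoning` is hypothetical.

```python
import re

_THINK_RE = re.compile(r"<think>\s*(.*?)\s*</think>", re.DOTALL)
_ANSWER_RE = re.compile(r"<answer>\s*(.*?)\s*</answer>", re.DOTALL)


def split_reasoning(response):
    """Return (reasoning, answer) extracted from a tagged response; None where a tag pair is absent."""
    think = _THINK_RE.search(response)
    answer = _ANSWER_RE.search(response)
    return (
        think.group(1) if think else None,
        answer.group(1) if answer else None,
    )
```

Returning `None` instead of raising lets the caller decide how to handle a truncated response, such as retrying with a larger output token budget.
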
## Output

**Output Type(s)**: Text

**Output Format**: String

**Output Parameters**: Text: One-dimensional (1D)

**Other Properties Related to Output**:
* We recommend allowing 4096 or more output tokens to avoid truncating long chain-of-thought responses.

* Our AI model recognizes timestamps added at the bottom of each frame for accurate temporal localization.

* Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions. <br>

## Software Integration

**Runtime Engine(s):**

* [vLLM](https://github.com/vllm-project/vllm)

**Supported Hardware Microarchitecture Compatibility:**

* NVIDIA Blackwell
* NVIDIA Hopper

**Note**: We have only tested inference with BF16 precision.

**Operating System(s):**

* Linux (We have not tested on other operating systems.)

# Usage

See [Cosmos-Reason1](https://github.com/nvidia-cosmos/cosmos-reason1) for details.
* Post Training: [Cosmos-Reason1](https://github.com/nvidia-cosmos/cosmos-reason1) provides examples of supervised fine-tuning and reinforcement learning on embodied reasoning datasets.

## Training and Evaluation Sections:
### 05/17/2025
Please see our [technical paper](https://arxiv.org/pdf/2503.15558) for detailed evaluations on physical common sense and embodied reasoning. Some of the evaluation datasets are released under [Cosmos-Reason1-Benchmark](https://huggingface.co/datasets/nvidia/Cosmos-Reason1-Benchmark). The embodied reasoning datasets and benchmarks focus on the following areas: robotics (RoboVQA, BridgeDataV2, AgiBot, RoboFail), egocentric human demonstration (HoloAssist), and autonomous vehicle (AV) driving video data. The AV dataset is collected and annotated by NVIDIA.

All datasets go through the data annotation process described in the technical paper to prepare training and evaluation data and annotations.

### 08/01/2025
We enhance the model's capabilities with augmented training data. PLM-Video-Human and Nexar are used to enable dense temporal captioning. Describe Anything is added to improve Set-of-Mark (SoM) prompting. We enrich the data for intelligent transportation systems (ITS) and warehouse applications. Lastly, the Visual Critics dataset contains AI-generated videos from Cosmos-Predict2 and Wan2.1, with human annotations describing the physical correctness of each video.

## Training Datasets:

**Data Collection Method**:
* RoboVQA: Hybrid: Automatic/Sensors
* BridgeDataV2: Automatic/Sensors
* AgiBot: Automatic/Sensors
* RoboFail: Automatic/Sensors
* HoloAssist: Human
* AV: Automatic/Sensors
* PLM-Video-Human: Human
* Nexar: Automatic/Sensors
* Describe Anything: Human
* ITS / Warehouse: Human, Automatic
* Visual Critics: Automatic

**Labeling Method**:
* RoboVQA: Hybrid: Human, Automated
* BridgeDataV2: Hybrid: Human, Automated
* AgiBot: Hybrid: Human, Automated
* RoboFail: Hybrid: Human, Automated
* HoloAssist: Hybrid: Human, Automated
* AV: Hybrid: Human, Automated
* PLM-Video-Human: Human, Automated
* Nexar: Human
* Describe Anything: Human, Automated
* ITS / Warehouse: Human, Automated
* Visual Critics: Human, Automated

## Evaluation Datasets:

**Data Collection Method**:
* RoboVQA: Hybrid: Automatic/Sensors
* BridgeDataV2: Automatic/Sensors
* AgiBot: Automatic/Sensors
* RoboFail: Automatic/Sensors
* HoloAssist: Human
* AV: Automatic/Sensors

**Labeling Method**:
* RoboVQA: Hybrid: Human, Automated
* BridgeDataV2: Hybrid: Human, Automated
* AgiBot: Hybrid: Human, Automated
* RoboFail: Hybrid: Human, Automated
* HoloAssist: Hybrid: Human, Automated
* AV: Hybrid: Human, Automated

**Metrics**:
We report the model accuracy on the embodied reasoning benchmark introduced in [Cosmos-Reason1](https://arxiv.org/abs/2503.15558). The results differ from those presented in Table 9 of the paper because of additional training aimed at supporting a broader range of Physical AI tasks beyond the benchmark.

| | [RoboVQA](https://robovqa.github.io/) | AV | [BridgeDataV2](https://rail-berkeley.github.io/bridgedata/) | [Agibot](https://github.com/OpenDriveLab/AgiBot-World) | [HoloAssist](https://holoassist.github.io/) | [RoboFail](https://robot-reflect.github.io/) | Average |
|---|---|---|---|---|---|---|---|
| **Accuracy** | 87.3 | 70.8 | 63.7 | 48.9 | 62.7 | 57.2 | 65.1 |

## Dataset Format
Modality: Video (mp4) and Text

## Dataset Quantification
### 05/17/2025
We release the embodied reasoning data and benchmarks. Each data sample is a pair of video and text. The text annotations include understanding and reasoning annotations described in the Cosmos-Reason1 paper. Each video may have multiple text annotations. The quantity of the video and text pairs is described in the table below.
**The AV data is currently unavailable and will be uploaded soon!**

| | [RoboVQA](https://robovqa.github.io/) | AV | [BridgeDataV2](https://rail-berkeley.github.io/bridgedata/) | [Agibot](https://github.com/OpenDriveLab/AgiBot-World) | [HoloAssist](https://holoassist.github.io/) | [RoboFail](https://robot-reflect.github.io/) | Total Storage Size |
|---|---|---|---|---|---|---|---|
| **SFT Data** | 1.14m | 24.7k | 258k | 38.9k | 273k | N/A | **300.6GB** |
| **RL Data** | 252 | 200 | 240 | 200 | 200 | N/A | **2.6GB** |
| **Benchmark Data** | 110 | 100 | 100 | 100 | 100 | 100 | **1.5GB** |

We release text annotations for all embodied reasoning datasets and videos for RoboVQA and AV datasets. For other datasets, users may download the source videos from the original data source and find corresponding video sources via the video names. The held-out RoboFail benchmark is released for measuring the generalization capability.

### 08/01/2025
| | [PLM-Video-Human](https://huggingface.co/datasets/facebook/PLM-Video-Human) | Nexar | [Describe Anything](https://huggingface.co/datasets/nvidia/describe-anything-dataset) | ITS / Warehouse | Visual Critics | Total Storage Size |
|---|---|---|---|---|---|---|
| **SFT Data** | 39k | 240k | 178k | 24k | 24k | **2.6TB** |

## Inference:
**Test Hardware:** H100, A100, GB200 <br>
> [!NOTE]
> We suggest using `fps=4` for the input video and `max_tokens=4096` to avoid truncated responses.
```python
from transformers import AutoProcessor
from vllm import LLM, SamplingParams
from qwen_vl_utils import process_vision_info

# MODEL_PATH may also point to a local folder of safetensors weights.
MODEL_PATH = "nvidia/Cosmos-Reason1-7B"

llm = LLM(
    model=MODEL_PATH,
    limit_mm_per_prompt={"image": 10, "video": 10},
)

sampling_params = SamplingParams(
    temperature=0.6,
    top_p=0.95,
    repetition_penalty=1.05,
    max_tokens=4096,
)

video_messages = [
    {"role": "system", "content": "You are a helpful assistant. Answer the question in the following format: <think>\nyour reasoning\n</think>\n\n<answer>\nyour answer\n</answer>."},
    {"role": "user", "content": [
        {"type": "text", "text": "Is it safe to turn right?"},
        {
            "type": "video",
            "video": "file:///path/to/your/video.mp4",
            "fps": 4,
        },
    ]},
]

# Here we use video messages as a demonstration
messages = video_messages

processor = AutoProcessor.from_pretrained(MODEL_PATH)
prompt = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
image_inputs, video_inputs, video_kwargs = process_vision_info(messages, return_video_kwargs=True)

mm_data = {}
if image_inputs is not None:
    mm_data["image"] = image_inputs
if video_inputs is not None:
    mm_data["video"] = video_inputs

llm_inputs = {
    "prompt": prompt,
    "multi_modal_data": mm_data,

    # FPS will be returned in video_kwargs
    "mm_processor_kwargs": video_kwargs,
}

outputs = llm.generate([llm_inputs], sampling_params=sampling_params)
generated_text = outputs[0].outputs[0].text

print(generated_text)
```

| 313 |
+
## Ethical Considerations
|
| 314 |
+
|
| 315 |
+
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
|
| 316 |
+
|
| 317 |
+
Users are responsible for model inputs and outputs. Users are responsible for ensuring safe integration of this model, including implementing guardrails as well as other safety mechanisms, prior to deployment.
|
| 318 |
+
|
| 319 |
+
For more detailed information on ethical considerations for this model, please see the subcards of Explainability, Bias, Safety & Security, and Privacy below.
|
| 320 |
+
|
| 321 |
+
Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
|
| 322 |
+
|
| 323 |
+
### Plus Plus (++) Promise
|
| 324 |
+
|
| 325 |
+
We value you, the datasets, the diversity they represent, and what we have been entrusted with. This model and its associated data have been:
|
| 326 |
+
|
| 327 |
+
* Verified to comply with current applicable disclosure laws, regulations, and industry standards.
|
| 328 |
+
* Verified to comply with applicable privacy labeling requirements.
|
| 329 |
+
* Annotated to describe the collector/source (NVIDIA or a third-party).
|
| 330 |
+
* Characterized for technical limitations.
|
| 331 |
+
* Reviewed to ensure proper disclosure is accessible to, maintained for, and in compliance with NVIDIA data subjects and their requests.
|
| 332 |
+
* Reviewed before release.
|
| 333 |
+
* Tagged for known restrictions and potential safety implications.
|
| 334 |
+
|
| 335 |
+
### Bias
|
| 336 |
+
|
| 337 |
+
| Field | Response |
|
| 338 |
+
| :--------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------- |
|
| 339 |
+
| Participation considerations from adversely impacted groups [protected classes](https://www.senate.ca.gov/content/protected-classes) in model design and testing: | None |
|
| 340 |
+
| Measures taken to mitigate unwanted bias: | The training video sources cover multiple physical embodiments and environments, including humans, cars, single-arm robots, and bimanual robots in indoor and outdoor settings. By training on numerous and varied physical interactions and curated datasets, we strive to provide a model that is not biased towards particular embodiments or environments. |
|
| 341 |
+
|
| 342 |
+
### Explainability
|
| 343 |
+
|
| 344 |
+
| Field | Response |
|
| 345 |
+
| :-------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------- |
|
| 346 |
+
| Intended Application & Domain: | Physical AI Reasoning |
|
| 347 |
+
| Model Type: | Transformer |
|
| 348 |
+
| Intended Users: | Physical AI developers |
|
| 349 |
+
| Output: | Text |
|
| 350 |
+
| Describe how the model works: | Given a video/image and a text prompt, the model first converts the video/image into tokens using a vision encoder and a special translator called a projector. These video tokens are combined with the text prompt and fed into the core model, which uses a mix of LLM modules and techniques. This enables the model to think step-by-step and provide detailed, logical responses. |
|
| 351 |
+
| Technical Limitations: | The model may not follow the video or text input accurately in challenging cases, where the input video shows complex scene composition and temporal dynamics. Examples of challenging scenes include: fast camera movements, overlapping human-object interactions, low lighting with high motion blur, and multiple people performing different actions simultaneously. |
|
| 352 |
+
| Verified to have met prescribed NVIDIA quality standards: | Yes |
|
| 353 |
+
| Performance Metrics: | Quantitative and Qualitative Evaluation. Cosmos-Reason1 proposes the embodied reasoning benchmark and physical common sense benchmark to evaluate accuracy with visual question answering. |
|
| 354 |
+
| Potential Known Risks: | The model can generate all forms of text, including text that may be considered toxic, offensive, or indecent. |
|
| 355 |
+
| Licensing: | [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license). Additional Information: [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md). |
|
| 356 |
+
|
| 357 |
+
### Privacy
|
| 358 |
+
|
| 359 |
+
| Field | Response |
|
| 360 |
+
| :------------------------------------------------------------------ | :------------- |
|
| 361 |
+
| Generatable or reverse engineerable personal information? | None Known |
|
| 362 |
+
| Protected class data used to create this model? | None Known |
|
| 363 |
+
| Was consent obtained for any personal data used? | None Known |
|
| 364 |
+
| How often is dataset reviewed? | Before Release |
|
| 365 |
+
| Is there provenance for all datasets used in training? | Yes |
|
| 366 |
+
| Does data labeling (annotation, metadata) comply with privacy laws? | Yes |
|
| 367 |
+
| Applicable Privacy Policy | [NVIDIA Privacy Policy](https://www.nvidia.com/en-us/about-nvidia/privacy-policy) |
|
| 368 |
+
|
| 369 |
+
|
| 370 |
+
### Safety
|
| 371 |
+
|
| 372 |
+
| Field | Response |
|
| 373 |
+
| :---------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
| 374 |
+
| Model Application(s): | Physical AI common sense understanding and embodied reasoning |
|
| 375 |
+
| Describe the life critical impact (if present). | None Known |
|
| 376 |
+
| Use Case Restrictions: | [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license). Additional Information: [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md). |
|
| 377 |
+
| Model and dataset restrictions: | The principle of least privilege (PoLP) is applied, limiting access for dataset generation and model development. Restrictions enforce dataset access during training, and dataset license constraints are adhered to. Model checkpoints are made available on Hugging Face and may become available in cloud providers' model catalogs. |
|
text_encoder/chat_template.json
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
{
|
| 2 |
+
"chat_template": "{% set image_count = namespace(value=0) %}{% set video_count = namespace(value=0) %}{% for message in messages %}{% if loop.first and message['role'] != 'system' %}<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n{% endif %}<|im_start|>{{ message['role'] }}\n{% if message['content'] is string %}{{ message['content'] }}<|im_end|>\n{% else %}{% for content in message['content'] %}{% if content['type'] == 'image' or 'image' in content or 'image_url' in content %}{% set image_count.value = image_count.value + 1 %}{% if add_vision_id %}Picture {{ image_count.value }}: {% endif %}<|vision_start|><|image_pad|><|vision_end|>{% elif content['type'] == 'video' or 'video' in content %}{% set video_count.value = video_count.value + 1 %}{% if add_vision_id %}Video {{ video_count.value }}: {% endif %}<|vision_start|><|video_pad|><|vision_end|>{% elif 'text' in content %}{{ content['text'] }}{% endif %}{% endfor %}<|im_end|>\n{% endif %}{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}"
|
| 3 |
+
}
|
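The chat template above is a Jinja string, so its control flow is hard to read on one line. The following is a minimal sketch that mirrors the same logic in plain Python, for illustration only. The function name `render_chat` is hypothetical and is not part of the repository or of `transformers`; real prompts should still be produced with `processor.apply_chat_template`.

```python
def render_chat(messages, add_generation_prompt=True, add_vision_id=False):
    """Illustrative pure-Python mirror of the Jinja chat template."""
    out = []
    image_count = 0
    video_count = 0
    for index, message in enumerate(messages):
        # A default system turn is injected when the first message is not a system message.
        if index == 0 and message["role"] != "system":
            out.append("<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n")
        out.append(f"<|im_start|>{message['role']}\n")
        content = message["content"]
        if isinstance(content, str):
            out.append(content + "<|im_end|>\n")
            continue
        for part in content:
            if part.get("type") == "image" or "image" in part or "image_url" in part:
                image_count += 1
                if add_vision_id:
                    out.append(f"Picture {image_count}: ")
                out.append("<|vision_start|><|image_pad|><|vision_end|>")
            elif part.get("type") == "video" or "video" in part:
                video_count += 1
                if add_vision_id:
                    out.append(f"Video {video_count}: ")
                out.append("<|vision_start|><|video_pad|><|vision_end|>")
            elif "text" in part:
                out.append(part["text"])
        out.append("<|im_end|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        out.append("<|im_start|>assistant\n")
    return "".join(out)


# Hypothetical example message; the video path is a placeholder.
example = [
    {
        "role": "user",
        "content": [
            {"type": "video", "video": "file:///tmp/clip.mp4"},
            {"type": "text", "text": "What happens?"},
        ],
    }
]
print(render_chat(example))
```

Note that each image or video contributes a single `<|image_pad|>` or `<|video_pad|>` placeholder at this stage; expanding it to the actual number of visual tokens happens later in the processor, not in the template.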
text_encoder/config.json
ADDED
|
@@ -0,0 +1,61 @@
|
| 1 |
+
{
|
| 2 |
+
"architectures": [
|
| 3 |
+
"Qwen2_5_VLForConditionalGeneration"
|
| 4 |
+
],
|
| 5 |
+
"attention_dropout": 0.0,
|
| 6 |
+
"bos_token_id": 151643,
|
| 7 |
+
"eos_token_id": 151645,
|
| 8 |
+
"vision_start_token_id": 151652,
|
| 9 |
+
"vision_end_token_id": 151653,
|
| 10 |
+
"vision_token_id": 151654,
|
| 11 |
+
"image_token_id": 151655,
|
| 12 |
+
"video_token_id": 151656,
|
| 13 |
+
"hidden_act": "silu",
|
| 14 |
+
"hidden_size": 3584,
|
| 15 |
+
"initializer_range": 0.02,
|
| 16 |
+
"intermediate_size": 18944,
|
| 17 |
+
"max_position_embeddings": 128000,
|
| 18 |
+
"max_window_layers": 28,
|
| 19 |
+
"model_type": "qwen2_5_vl",
|
| 20 |
+
"num_attention_heads": 28,
|
| 21 |
+
"num_hidden_layers": 28,
|
| 22 |
+
"num_key_value_heads": 4,
|
| 23 |
+
"rms_norm_eps": 1e-06,
|
| 24 |
+
"rope_theta": 1000000.0,
|
| 25 |
+
"sliding_window": 32768,
|
| 26 |
+
"tie_word_embeddings": false,
|
| 27 |
+
"torch_dtype": "bfloat16",
|
| 28 |
+
"transformers_version": "4.41.2",
|
| 29 |
+
"use_cache": true,
|
| 30 |
+
"use_sliding_window": false,
|
| 31 |
+
"vision_config": {
|
| 32 |
+
"depth": 32,
|
| 33 |
+
"hidden_act": "silu",
|
| 34 |
+
"hidden_size": 1280,
|
| 35 |
+
"intermediate_size": 3420,
|
| 36 |
+
"num_heads": 16,
|
| 37 |
+
"in_chans": 3,
|
| 38 |
+
"out_hidden_size": 3584,
|
| 39 |
+
"patch_size": 14,
|
| 40 |
+
"spatial_merge_size": 2,
|
| 41 |
+
"spatial_patch_size": 14,
|
| 42 |
+
"window_size": 112,
|
| 43 |
+
"fullatt_block_indexes": [
|
| 44 |
+
7,
|
| 45 |
+
15,
|
| 46 |
+
23,
|
| 47 |
+
31
|
| 48 |
+
],
|
| 49 |
+
"tokens_per_second": 2,
|
| 50 |
+
"temporal_patch_size": 2
|
| 51 |
+
},
|
| 52 |
+
"rope_scaling": {
|
| 53 |
+
"type": "mrope",
|
| 54 |
+
"mrope_section": [
|
| 55 |
+
16,
|
| 56 |
+
24,
|
| 57 |
+
24
|
| 58 |
+
]
|
| 59 |
+
},
|
| 60 |
+
"vocab_size": 152064
|
| 61 |
+
}
|
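A few useful quantities follow directly from the values in `text_encoder/config.json`. The sketch below recomputes them; the variable names are illustrative, and two interpretations are assumptions on my part rather than anything stated in this commit: that `mrope_section` partitions half of the attention head dimension, and that one merged visual token covers a `patch_size * spatial_merge_size` pixel square.

```python
# Values copied from text_encoder/config.json.
text_cfg = {
    "hidden_size": 3584,
    "num_attention_heads": 28,
    "num_key_value_heads": 4,
    "mrope_section": [16, 24, 24],
}
vision_cfg = {"patch_size": 14, "spatial_merge_size": 2}

# Per-head dimension of the language model.
head_dim = text_cfg["hidden_size"] // text_cfg["num_attention_heads"]

# Grouped-query attention: how many query heads share one key/value head.
queries_per_kv_head = text_cfg["num_attention_heads"] // text_cfg["num_key_value_heads"]

# Assumption: the three mrope sections together span half of head_dim.
mrope_total = sum(text_cfg["mrope_section"])

# Assumption: side length in pixels covered by one merged visual token.
pixels_per_merged_token = vision_cfg["patch_size"] * vision_cfg["spatial_merge_size"]


def merged_tokens_per_frame(height, width):
    """Visual tokens for one frame, assuming both sides are multiples of 28."""
    side = pixels_per_merged_token
    if height % side or width % side:
        raise ValueError(f"height and width must be multiples of {side}")
    return (height // side) * (width // side)


print(head_dim, queries_per_kv_head, mrope_total, pixels_per_merged_token)
print(merged_tokens_per_frame(448, 448))
```

Under these assumptions the section sizes and the head dimension are mutually consistent, which is a quick sanity check when editing the config by hand.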
text_encoder/generation_config.json
ADDED
|
@@ -0,0 +1,12 @@
|
| 1 |
+
{
|
| 2 |
+
"bos_token_id": 151643,
|
| 3 |
+
"pad_token_id": 151643,
|
| 4 |
+
"do_sample": true,
|
| 5 |
+
"eos_token_id": [
|
| 6 |
+
151645,
|
| 7 |
+
151643
|
| 8 |
+
],
|
| 9 |
+
"repetition_penalty": 1.05,
|
| 10 |
+
"temperature": 0.000001,
|
| 11 |
+
"transformers_version": "4.37.0"
|
| 12 |
+
}
|
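`generation_config.json` combines `"do_sample": true` with `"temperature": 0.000001`. The sketch below, using made-up logits, suggests why that pairing behaves almost like greedy decoding: dividing logits by a tiny temperature pushes nearly all probability mass onto the top token. It is a plain softmax illustration, not the `transformers` sampling code, and it ignores `repetition_penalty`.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Numerically stable softmax over logits scaled by 1 / temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.9, 0.5]

near_greedy = softmax_with_temperature(logits, temperature=0.000001)
ordinary = softmax_with_temperature(logits, temperature=1.0)

print(near_greedy)
print(ordinary)
```

With temperature 1.0 the two leading tokens stay competitive, whereas at 0.000001 the runner-up's probability underflows to zero in floating point.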
text_encoder/model-00001-of-00004.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c28404126221997ae8eb70a23b919c96174d42e35ae1d537e0c95093d50b359a
|
| 3 |
+
size 4968243304
|
text_encoder/model-00002-of-00004.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:f281081864c10992d3e03874c79d526c84407e049d713747f19eb9c79cd16db3
|
| 3 |
+
size 4991495816
|
text_encoder/model-00003-of-00004.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:3bb3a62a8d0e83c6283388ddea99395b221f908f9181b8edd0f7f91d02260ebe
|
| 3 |
+
size 4932751040
|
text_encoder/model-00004-of-00004.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:91bacbe1ad798e16daa05023b4e4bec70b53c8cd7d757db86c5bc76c4e0bbf15
|
| 3 |
+
size 1691924384
|
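The four `.safetensors` entries above are Git LFS pointer files, not the weights themselves: each records a spec version, a SHA-256 object ID, and a byte size. As a minimal sketch, the helper below (hypothetical name `parse_lfs_pointer`) parses one pointer and compares the summed shard sizes with the `total_size` value recorded in `text_encoder/model.safetensors.index.json`. Attributing the difference to per-shard file headers is my assumption, not something the commit states.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "oid_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


# Pointer for model-00004-of-00004.safetensors, copied from the diff.
POINTER_4 = """version https://git-lfs.github.com/spec/v1
oid sha256:91bacbe1ad798e16daa05023b4e4bec70b53c8cd7d757db86c5bc76c4e0bbf15
size 1691924384
"""

# Sizes of the four shards as listed in their pointer files.
SHARD_SIZES = [4968243304, 4991495816, 4932751040, 1691924384]

# "total_size" from the index file's metadata block.
INDEX_TOTAL_SIZE = 16584333312

pointer = parse_lfs_pointer(POINTER_4)
on_disk_bytes = sum(SHARD_SIZES)
overhead_bytes = on_disk_bytes - INDEX_TOTAL_SIZE

print(pointer["oid_algo"], pointer["size"])
print(on_disk_bytes, overhead_bytes)
```

A small positive difference like this one is a cheap check that no shard is missing or truncated after download.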
text_encoder/model.safetensors.index.json
ADDED
|
@@ -0,0 +1,736 @@
|
| 1 |
+
{
|
| 2 |
+
"metadata": {
|
| 3 |
+
"total_size": 16584333312
|
| 4 |
+
},
|
| 5 |
+
"weight_map": {
|
| 6 |
+
"visual.patch_embed.proj.weight": "model-00001-of-00004.safetensors",
|
| 7 |
+
"visual.blocks.0.norm1.weight": "model-00001-of-00004.safetensors",
|
| 8 |
+
"visual.blocks.0.norm2.weight": "model-00001-of-00004.safetensors",
|
| 9 |
+
"visual.blocks.0.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 10 |
+
"visual.blocks.0.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 11 |
+
"visual.blocks.0.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 12 |
+
"visual.blocks.0.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 13 |
+
"visual.blocks.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 14 |
+
"visual.blocks.0.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 15 |
+
"visual.blocks.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 16 |
+
"visual.blocks.0.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 17 |
+
"visual.blocks.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 18 |
+
"visual.blocks.0.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 19 |
+
"visual.blocks.1.norm1.weight": "model-00001-of-00004.safetensors",
|
| 20 |
+
"visual.blocks.1.norm2.weight": "model-00001-of-00004.safetensors",
|
| 21 |
+
"visual.blocks.1.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 22 |
+
"visual.blocks.1.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 23 |
+
"visual.blocks.1.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 24 |
+
"visual.blocks.1.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 25 |
+
"visual.blocks.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 26 |
+
"visual.blocks.1.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 27 |
+
"visual.blocks.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 28 |
+
"visual.blocks.1.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 29 |
+
"visual.blocks.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 30 |
+
"visual.blocks.1.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 31 |
+
"visual.blocks.2.norm1.weight": "model-00001-of-00004.safetensors",
|
| 32 |
+
"visual.blocks.2.norm2.weight": "model-00001-of-00004.safetensors",
|
| 33 |
+
"visual.blocks.2.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 34 |
+
"visual.blocks.2.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 35 |
+
"visual.blocks.2.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 36 |
+
"visual.blocks.2.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 37 |
+
"visual.blocks.2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 38 |
+
"visual.blocks.2.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 39 |
+
"visual.blocks.2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 40 |
+
"visual.blocks.2.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 41 |
+
"visual.blocks.2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 42 |
+
"visual.blocks.2.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 43 |
+
"visual.blocks.3.norm1.weight": "model-00001-of-00004.safetensors",
|
| 44 |
+
"visual.blocks.3.norm2.weight": "model-00001-of-00004.safetensors",
|
| 45 |
+
"visual.blocks.3.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 46 |
+
"visual.blocks.3.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 47 |
+
"visual.blocks.3.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 48 |
+
"visual.blocks.3.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 49 |
+
"visual.blocks.3.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 50 |
+
"visual.blocks.3.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 51 |
+
"visual.blocks.3.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 52 |
+
"visual.blocks.3.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 53 |
+
"visual.blocks.3.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 54 |
+
"visual.blocks.3.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 55 |
+
"visual.blocks.4.norm1.weight": "model-00001-of-00004.safetensors",
|
| 56 |
+
"visual.blocks.4.norm2.weight": "model-00001-of-00004.safetensors",
|
| 57 |
+
"visual.blocks.4.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 58 |
+
"visual.blocks.4.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 59 |
+
"visual.blocks.4.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 60 |
+
"visual.blocks.4.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 61 |
+
"visual.blocks.4.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 62 |
+
"visual.blocks.4.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 63 |
+
"visual.blocks.4.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 64 |
+
"visual.blocks.4.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 65 |
+
"visual.blocks.4.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 66 |
+
"visual.blocks.4.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 67 |
+
"visual.blocks.5.norm1.weight": "model-00001-of-00004.safetensors",
|
| 68 |
+
"visual.blocks.5.norm2.weight": "model-00001-of-00004.safetensors",
|
| 69 |
+
"visual.blocks.5.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 70 |
+
"visual.blocks.5.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 71 |
+
"visual.blocks.5.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 72 |
+
"visual.blocks.5.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 73 |
+
"visual.blocks.5.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 74 |
+
"visual.blocks.5.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 75 |
+
"visual.blocks.5.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 76 |
+
"visual.blocks.5.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 77 |
+
"visual.blocks.5.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 78 |
+
"visual.blocks.5.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 79 |
+
"visual.blocks.6.norm1.weight": "model-00001-of-00004.safetensors",
|
| 80 |
+
"visual.blocks.6.norm2.weight": "model-00001-of-00004.safetensors",
|
| 81 |
+
"visual.blocks.6.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 82 |
+
"visual.blocks.6.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 83 |
+
"visual.blocks.6.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 84 |
+
"visual.blocks.6.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 85 |
+
"visual.blocks.6.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 86 |
+
"visual.blocks.6.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 87 |
+
"visual.blocks.6.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 88 |
+
"visual.blocks.6.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 89 |
+
"visual.blocks.6.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 90 |
+
"visual.blocks.6.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 91 |
+
"visual.blocks.7.norm1.weight": "model-00001-of-00004.safetensors",
|
| 92 |
+
"visual.blocks.7.norm2.weight": "model-00001-of-00004.safetensors",
|
| 93 |
+
"visual.blocks.7.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 94 |
+
"visual.blocks.7.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 95 |
+
"visual.blocks.7.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 96 |
+
"visual.blocks.7.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 97 |
+
"visual.blocks.7.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 98 |
+
"visual.blocks.7.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 99 |
+
"visual.blocks.7.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 100 |
+
"visual.blocks.7.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 101 |
+
"visual.blocks.7.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 102 |
+
"visual.blocks.7.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 103 |
+
"visual.blocks.8.norm1.weight": "model-00001-of-00004.safetensors",
|
| 104 |
+
"visual.blocks.8.norm2.weight": "model-00001-of-00004.safetensors",
|
| 105 |
+
"visual.blocks.8.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 106 |
+
"visual.blocks.8.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 107 |
+
"visual.blocks.8.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 108 |
+
"visual.blocks.8.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 109 |
+
"visual.blocks.8.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 110 |
+
"visual.blocks.8.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 111 |
+
"visual.blocks.8.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 112 |
+
"visual.blocks.8.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 113 |
+
"visual.blocks.8.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 114 |
+
"visual.blocks.8.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 115 |
+
"visual.blocks.9.norm1.weight": "model-00001-of-00004.safetensors",
|
| 116 |
+
"visual.blocks.9.norm2.weight": "model-00001-of-00004.safetensors",
|
| 117 |
+
"visual.blocks.9.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 118 |
+
"visual.blocks.9.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 119 |
+
"visual.blocks.9.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 120 |
+
"visual.blocks.9.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 121 |
+
"visual.blocks.9.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 122 |
+
"visual.blocks.9.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 123 |
+
"visual.blocks.9.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 124 |
+
"visual.blocks.9.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 125 |
+
"visual.blocks.9.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 126 |
+
"visual.blocks.9.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 127 |
+
"visual.blocks.10.norm1.weight": "model-00001-of-00004.safetensors",
|
| 128 |
+
"visual.blocks.10.norm2.weight": "model-00001-of-00004.safetensors",
|
| 129 |
+
"visual.blocks.10.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 130 |
+
"visual.blocks.10.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 131 |
+
"visual.blocks.10.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 132 |
+
"visual.blocks.10.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 133 |
+
"visual.blocks.10.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 134 |
+
"visual.blocks.10.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 135 |
+
"visual.blocks.10.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 136 |
+
"visual.blocks.10.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 137 |
+
"visual.blocks.10.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 138 |
+
"visual.blocks.10.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 139 |
+
"visual.blocks.11.norm1.weight": "model-00001-of-00004.safetensors",
|
| 140 |
+
"visual.blocks.11.norm2.weight": "model-00001-of-00004.safetensors",
|
| 141 |
+
"visual.blocks.11.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 142 |
+
"visual.blocks.11.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 143 |
+
"visual.blocks.11.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 144 |
+
"visual.blocks.11.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 145 |
+
"visual.blocks.11.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 146 |
+
"visual.blocks.11.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 147 |
+
"visual.blocks.11.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 148 |
+
"visual.blocks.11.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 149 |
+
"visual.blocks.11.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 150 |
+
"visual.blocks.11.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 151 |
+
"visual.blocks.12.norm1.weight": "model-00001-of-00004.safetensors",
|
| 152 |
+
"visual.blocks.12.norm2.weight": "model-00001-of-00004.safetensors",
|
| 153 |
+
"visual.blocks.12.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 154 |
+
"visual.blocks.12.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 155 |
+
"visual.blocks.12.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 156 |
+
"visual.blocks.12.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 157 |
+
"visual.blocks.12.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 158 |
+
"visual.blocks.12.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 159 |
+
"visual.blocks.12.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 160 |
+
"visual.blocks.12.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 161 |
+
"visual.blocks.12.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 162 |
+
"visual.blocks.12.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 163 |
+
"visual.blocks.13.norm1.weight": "model-00001-of-00004.safetensors",
|
| 164 |
+
"visual.blocks.13.norm2.weight": "model-00001-of-00004.safetensors",
|
| 165 |
+
"visual.blocks.13.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 166 |
+
"visual.blocks.13.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 167 |
+
"visual.blocks.13.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 168 |
+
"visual.blocks.13.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 169 |
+
"visual.blocks.13.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 170 |
+
"visual.blocks.13.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 171 |
+
"visual.blocks.13.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 172 |
+
"visual.blocks.13.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 173 |
+
"visual.blocks.13.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 174 |
+
"visual.blocks.13.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 175 |
+
"visual.blocks.14.norm1.weight": "model-00001-of-00004.safetensors",
|
| 176 |
+
"visual.blocks.14.norm2.weight": "model-00001-of-00004.safetensors",
|
| 177 |
+
"visual.blocks.14.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 178 |
+
"visual.blocks.14.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 179 |
+
"visual.blocks.14.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 180 |
+
"visual.blocks.14.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 181 |
+
"visual.blocks.14.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 182 |
+
"visual.blocks.14.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 183 |
+
"visual.blocks.14.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 184 |
+
"visual.blocks.14.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 185 |
+
"visual.blocks.14.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 186 |
+
"visual.blocks.14.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 187 |
+
"visual.blocks.15.norm1.weight": "model-00001-of-00004.safetensors",
|
| 188 |
+
"visual.blocks.15.norm2.weight": "model-00001-of-00004.safetensors",
|
| 189 |
+
"visual.blocks.15.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 190 |
+
"visual.blocks.15.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 191 |
+
"visual.blocks.15.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 192 |
+
"visual.blocks.15.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 193 |
+
"visual.blocks.15.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 194 |
+
"visual.blocks.15.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 195 |
+
"visual.blocks.15.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 196 |
+
"visual.blocks.15.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 197 |
+
"visual.blocks.15.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 198 |
+
"visual.blocks.15.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 199 |
+
"visual.blocks.16.norm1.weight": "model-00001-of-00004.safetensors",
|
| 200 |
+
"visual.blocks.16.norm2.weight": "model-00001-of-00004.safetensors",
|
| 201 |
+
"visual.blocks.16.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 202 |
+
"visual.blocks.16.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 203 |
+
"visual.blocks.16.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 204 |
+
"visual.blocks.16.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 205 |
+
"visual.blocks.16.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 206 |
+
"visual.blocks.16.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 207 |
+
"visual.blocks.16.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 208 |
+
"visual.blocks.16.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 209 |
+
"visual.blocks.16.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 210 |
+
"visual.blocks.16.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 211 |
+
"visual.blocks.17.norm1.weight": "model-00001-of-00004.safetensors",
|
| 212 |
+
"visual.blocks.17.norm2.weight": "model-00001-of-00004.safetensors",
|
| 213 |
+
"visual.blocks.17.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 214 |
+
"visual.blocks.17.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 215 |
+
"visual.blocks.17.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 216 |
+
"visual.blocks.17.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 217 |
+
"visual.blocks.17.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 218 |
+
"visual.blocks.17.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 219 |
+
"visual.blocks.17.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 220 |
+
"visual.blocks.17.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 221 |
+
"visual.blocks.17.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 222 |
+
"visual.blocks.17.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 223 |
+
"visual.blocks.18.norm1.weight": "model-00001-of-00004.safetensors",
|
| 224 |
+
"visual.blocks.18.norm2.weight": "model-00001-of-00004.safetensors",
|
| 225 |
+
"visual.blocks.18.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 226 |
+
"visual.blocks.18.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 227 |
+
"visual.blocks.18.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 228 |
+
"visual.blocks.18.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 229 |
+
"visual.blocks.18.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 230 |
+
"visual.blocks.18.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 231 |
+
"visual.blocks.18.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 232 |
+
"visual.blocks.18.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 233 |
+
"visual.blocks.18.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 234 |
+
"visual.blocks.18.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 235 |
+
"visual.blocks.19.norm1.weight": "model-00001-of-00004.safetensors",
|
| 236 |
+
"visual.blocks.19.norm2.weight": "model-00001-of-00004.safetensors",
|
| 237 |
+
"visual.blocks.19.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 238 |
+
"visual.blocks.19.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 239 |
+
"visual.blocks.19.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 240 |
+
"visual.blocks.19.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 241 |
+
"visual.blocks.19.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 242 |
+
"visual.blocks.19.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 243 |
+
"visual.blocks.19.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 244 |
+
"visual.blocks.19.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 245 |
+
"visual.blocks.19.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 246 |
+
"visual.blocks.19.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 247 |
+
"visual.blocks.20.norm1.weight": "model-00001-of-00004.safetensors",
|
| 248 |
+
"visual.blocks.20.norm2.weight": "model-00001-of-00004.safetensors",
|
| 249 |
+
"visual.blocks.20.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 250 |
+
"visual.blocks.20.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 251 |
+
"visual.blocks.20.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 252 |
+
"visual.blocks.20.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 253 |
+
"visual.blocks.20.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 254 |
+
"visual.blocks.20.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 255 |
+
"visual.blocks.20.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 256 |
+
"visual.blocks.20.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 257 |
+
"visual.blocks.20.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 258 |
+
"visual.blocks.20.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 259 |
+
"visual.blocks.21.norm1.weight": "model-00001-of-00004.safetensors",
|
| 260 |
+
"visual.blocks.21.norm2.weight": "model-00001-of-00004.safetensors",
|
| 261 |
+
"visual.blocks.21.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 262 |
+
"visual.blocks.21.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 263 |
+
"visual.blocks.21.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 264 |
+
"visual.blocks.21.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 265 |
+
"visual.blocks.21.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 266 |
+
"visual.blocks.21.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 267 |
+
"visual.blocks.21.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 268 |
+
"visual.blocks.21.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 269 |
+
"visual.blocks.21.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 270 |
+
"visual.blocks.21.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 271 |
+
"visual.blocks.22.norm1.weight": "model-00001-of-00004.safetensors",
|
| 272 |
+
"visual.blocks.22.norm2.weight": "model-00001-of-00004.safetensors",
|
| 273 |
+
"visual.blocks.22.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 274 |
+
"visual.blocks.22.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 275 |
+
"visual.blocks.22.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 276 |
+
"visual.blocks.22.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 277 |
+
"visual.blocks.22.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 278 |
+
"visual.blocks.22.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 279 |
+
"visual.blocks.22.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 280 |
+
"visual.blocks.22.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 281 |
+
"visual.blocks.22.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 282 |
+
"visual.blocks.22.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 283 |
+
"visual.blocks.23.norm1.weight": "model-00001-of-00004.safetensors",
|
| 284 |
+
"visual.blocks.23.norm2.weight": "model-00001-of-00004.safetensors",
|
| 285 |
+
"visual.blocks.23.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 286 |
+
"visual.blocks.23.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 287 |
+
"visual.blocks.23.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 288 |
+
"visual.blocks.23.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 289 |
+
"visual.blocks.23.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 290 |
+
"visual.blocks.23.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 291 |
+
"visual.blocks.23.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 292 |
+
"visual.blocks.23.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 293 |
+
"visual.blocks.23.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 294 |
+
"visual.blocks.23.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 295 |
+
"visual.blocks.24.norm1.weight": "model-00001-of-00004.safetensors",
|
| 296 |
+
"visual.blocks.24.norm2.weight": "model-00001-of-00004.safetensors",
|
| 297 |
+
"visual.blocks.24.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 298 |
+
"visual.blocks.24.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 299 |
+
"visual.blocks.24.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 300 |
+
"visual.blocks.24.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 301 |
+
"visual.blocks.24.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 302 |
+
"visual.blocks.24.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 303 |
+
"visual.blocks.24.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 304 |
+
"visual.blocks.24.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 305 |
+
"visual.blocks.24.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 306 |
+
"visual.blocks.24.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 307 |
+
"visual.blocks.25.norm1.weight": "model-00001-of-00004.safetensors",
|
| 308 |
+
"visual.blocks.25.norm2.weight": "model-00001-of-00004.safetensors",
|
| 309 |
+
"visual.blocks.25.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 310 |
+
"visual.blocks.25.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 311 |
+
"visual.blocks.25.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 312 |
+
"visual.blocks.25.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 313 |
+
"visual.blocks.25.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 314 |
+
"visual.blocks.25.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 315 |
+
"visual.blocks.25.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 316 |
+
"visual.blocks.25.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 317 |
+
"visual.blocks.25.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 318 |
+
"visual.blocks.25.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 319 |
+
"visual.blocks.26.norm1.weight": "model-00001-of-00004.safetensors",
|
| 320 |
+
"visual.blocks.26.norm2.weight": "model-00001-of-00004.safetensors",
|
| 321 |
+
"visual.blocks.26.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 322 |
+
"visual.blocks.26.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 323 |
+
"visual.blocks.26.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 324 |
+
"visual.blocks.26.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 325 |
+
"visual.blocks.26.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 326 |
+
"visual.blocks.26.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 327 |
+
"visual.blocks.26.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 328 |
+
"visual.blocks.26.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 329 |
+
"visual.blocks.26.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 330 |
+
"visual.blocks.26.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 331 |
+
"visual.blocks.27.norm1.weight": "model-00001-of-00004.safetensors",
|
| 332 |
+
"visual.blocks.27.norm2.weight": "model-00001-of-00004.safetensors",
|
| 333 |
+
"visual.blocks.27.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 334 |
+
"visual.blocks.27.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 335 |
+
"visual.blocks.27.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 336 |
+
"visual.blocks.27.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 337 |
+
"visual.blocks.27.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 338 |
+
"visual.blocks.27.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 339 |
+
"visual.blocks.27.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 340 |
+
"visual.blocks.27.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 341 |
+
"visual.blocks.27.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 342 |
+
"visual.blocks.27.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 343 |
+
"visual.blocks.28.norm1.weight": "model-00001-of-00004.safetensors",
|
| 344 |
+
"visual.blocks.28.norm2.weight": "model-00001-of-00004.safetensors",
|
| 345 |
+
"visual.blocks.28.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 346 |
+
"visual.blocks.28.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 347 |
+
"visual.blocks.28.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 348 |
+
"visual.blocks.28.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 349 |
+
"visual.blocks.28.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 350 |
+
"visual.blocks.28.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 351 |
+
"visual.blocks.28.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 352 |
+
"visual.blocks.28.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 353 |
+
"visual.blocks.28.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 354 |
+
"visual.blocks.28.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 355 |
+
"visual.blocks.29.norm1.weight": "model-00001-of-00004.safetensors",
|
| 356 |
+
"visual.blocks.29.norm2.weight": "model-00001-of-00004.safetensors",
|
| 357 |
+
"visual.blocks.29.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 358 |
+
"visual.blocks.29.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 359 |
+
"visual.blocks.29.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 360 |
+
"visual.blocks.29.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 361 |
+
"visual.blocks.29.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 362 |
+
"visual.blocks.29.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 363 |
+
"visual.blocks.29.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 364 |
+
"visual.blocks.29.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 365 |
+
"visual.blocks.29.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 366 |
+
"visual.blocks.29.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 367 |
+
"visual.blocks.30.norm1.weight": "model-00001-of-00004.safetensors",
|
| 368 |
+
"visual.blocks.30.norm2.weight": "model-00001-of-00004.safetensors",
|
| 369 |
+
"visual.blocks.30.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 370 |
+
"visual.blocks.30.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 371 |
+
"visual.blocks.30.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 372 |
+
"visual.blocks.30.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 373 |
+
"visual.blocks.30.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 374 |
+
"visual.blocks.30.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 375 |
+
"visual.blocks.30.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 376 |
+
"visual.blocks.30.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 377 |
+
"visual.blocks.30.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 378 |
+
"visual.blocks.30.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 379 |
+
"visual.blocks.31.norm1.weight": "model-00001-of-00004.safetensors",
|
| 380 |
+
"visual.blocks.31.norm2.weight": "model-00001-of-00004.safetensors",
|
| 381 |
+
"visual.blocks.31.attn.qkv.weight": "model-00001-of-00004.safetensors",
|
| 382 |
+
"visual.blocks.31.attn.qkv.bias": "model-00001-of-00004.safetensors",
|
| 383 |
+
"visual.blocks.31.attn.proj.weight": "model-00001-of-00004.safetensors",
|
| 384 |
+
"visual.blocks.31.attn.proj.bias": "model-00001-of-00004.safetensors",
|
| 385 |
+
"visual.blocks.31.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 386 |
+
"visual.blocks.31.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
|
| 387 |
+
"visual.blocks.31.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 388 |
+
"visual.blocks.31.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
|
| 389 |
+
"visual.blocks.31.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 390 |
+
"visual.blocks.31.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
|
| 391 |
+
"visual.merger.ln_q.weight": "model-00001-of-00004.safetensors",
|
| 392 |
+
"visual.merger.mlp.0.weight": "model-00001-of-00004.safetensors",
|
| 393 |
+
"visual.merger.mlp.0.bias": "model-00001-of-00004.safetensors",
|
| 394 |
+
"visual.merger.mlp.2.weight": "model-00001-of-00004.safetensors",
|
| 395 |
+
"visual.merger.mlp.2.bias": "model-00001-of-00004.safetensors",
|
| 396 |
+
"model.embed_tokens.weight": "model-00001-of-00004.safetensors",
|
| 397 |
+
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
| 398 |
+
"model.layers.0.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
|
| 399 |
+
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
| 400 |
+
"model.layers.0.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
|
| 401 |
+
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
| 402 |
+
"model.layers.0.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
|
| 403 |
+
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
| 404 |
+
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 405 |
+
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 406 |
+
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 407 |
+
"model.layers.0.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
| 408 |
+
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
| 409 |
+
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
| 410 |
+
"model.layers.1.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
|
| 411 |
+
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
| 412 |
+
"model.layers.1.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
|
| 413 |
+
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
| 414 |
+
"model.layers.1.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
|
| 415 |
+
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
| 416 |
+
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 417 |
+
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 418 |
+
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 419 |
+
"model.layers.1.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
| 420 |
+
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
| 421 |
+
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
| 422 |
+
"model.layers.2.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
|
| 423 |
+
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
| 424 |
+
"model.layers.2.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
|
| 425 |
+
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
| 426 |
+
"model.layers.2.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
|
| 427 |
+
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
| 428 |
+
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 429 |
+
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 430 |
+
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 431 |
+
"model.layers.2.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
| 432 |
+
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
| 433 |
+
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
| 434 |
+
"model.layers.3.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
|
| 435 |
+
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
| 436 |
+
"model.layers.3.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
|
| 437 |
+
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
| 438 |
+
"model.layers.3.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
|
| 439 |
+
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
| 440 |
+
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 441 |
+
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 442 |
+
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 443 |
+
"model.layers.3.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
| 444 |
+
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
| 445 |
+
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
| 446 |
+
"model.layers.4.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
|
| 447 |
+
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
| 448 |
+
"model.layers.4.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
|
| 449 |
+
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
| 450 |
+
"model.layers.4.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
|
| 451 |
+
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
| 452 |
+
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 453 |
+
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
| 454 |
+
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
| 455 |
+
"model.layers.4.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
| 456 |
+
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
| 457 |
+
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
| 458 |
+
"model.layers.5.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
|
| 459 |
+
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
| 460 |
+
"model.layers.5.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
|
| 461 |
+
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
| 462 |
+
"model.layers.5.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
|
| 463 |
+
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
| 464 |
+
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
| 465 |
+
"model.layers.5.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
| 466 |
+
"model.layers.5.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
| 467 |
+
"model.layers.5.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 468 |
+
"model.layers.5.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 469 |
+
"model.layers.6.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
| 470 |
+
"model.layers.6.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
|
| 471 |
+
"model.layers.6.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
| 472 |
+
"model.layers.6.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
|
| 473 |
+
"model.layers.6.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
| 474 |
+
"model.layers.6.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
|
| 475 |
+
"model.layers.6.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
| 476 |
+
"model.layers.6.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
| 477 |
+
"model.layers.6.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
| 478 |
+
"model.layers.6.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
| 479 |
+
"model.layers.6.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 480 |
+
"model.layers.6.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 481 |
+
"model.layers.7.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
| 482 |
+
"model.layers.7.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
|
| 483 |
+
"model.layers.7.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
| 484 |
+
"model.layers.7.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
|
| 485 |
+
"model.layers.7.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
| 486 |
+
"model.layers.7.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
|
| 487 |
+
"model.layers.7.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
| 488 |
+
"model.layers.7.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
| 489 |
+
"model.layers.7.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
| 490 |
+
"model.layers.7.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
| 491 |
+
"model.layers.7.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 492 |
+
"model.layers.7.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 493 |
+
"model.layers.8.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
| 494 |
+
"model.layers.8.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
|
| 495 |
+
"model.layers.8.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
| 496 |
+
"model.layers.8.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
|
| 497 |
+
"model.layers.8.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
| 498 |
+
"model.layers.8.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
|
| 499 |
+
"model.layers.8.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
| 500 |
+
"model.layers.8.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
| 501 |
+
"model.layers.8.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
| 502 |
+
"model.layers.8.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
| 503 |
+
"model.layers.8.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 504 |
+
"model.layers.8.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 505 |
+
"model.layers.9.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
| 506 |
+
"model.layers.9.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
|
| 507 |
+
"model.layers.9.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
| 508 |
+
"model.layers.9.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
|
| 509 |
+
"model.layers.9.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
| 510 |
+
"model.layers.9.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
|
| 511 |
+
"model.layers.9.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
| 512 |
+
"model.layers.9.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
| 513 |
+
"model.layers.9.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
| 514 |
+
"model.layers.9.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
| 515 |
+
"model.layers.9.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 516 |
+
"model.layers.9.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 517 |
+
"model.layers.10.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
| 518 |
+
"model.layers.10.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
|
| 519 |
+
"model.layers.10.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
| 520 |
+
"model.layers.10.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
|
| 521 |
+
"model.layers.10.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
| 522 |
+
"model.layers.10.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
|
| 523 |
+
"model.layers.10.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
| 524 |
+
"model.layers.10.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
| 525 |
+
"model.layers.10.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
| 526 |
+
"model.layers.10.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
| 527 |
+
"model.layers.10.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 528 |
+
"model.layers.10.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 529 |
+
"model.layers.11.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
| 530 |
+
"model.layers.11.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
|
| 531 |
+
"model.layers.11.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
| 532 |
+
"model.layers.11.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
|
| 533 |
+
"model.layers.11.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
| 534 |
+
"model.layers.11.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
|
| 535 |
+
"model.layers.11.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
| 536 |
+
"model.layers.11.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
| 537 |
+
"model.layers.11.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
| 538 |
+
"model.layers.11.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
| 539 |
+
"model.layers.11.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 540 |
+
"model.layers.11.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 541 |
+
"model.layers.12.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
| 542 |
+
"model.layers.12.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
|
| 543 |
+
"model.layers.12.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
| 544 |
+
"model.layers.12.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
|
| 545 |
+
"model.layers.12.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
| 546 |
+
"model.layers.12.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
|
| 547 |
+
"model.layers.12.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
| 548 |
+
"model.layers.12.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
| 549 |
+
"model.layers.12.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
| 550 |
+
"model.layers.12.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
| 551 |
+
"model.layers.12.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 552 |
+
"model.layers.12.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 553 |
+
"model.layers.13.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
| 554 |
+
"model.layers.13.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
|
| 555 |
+
"model.layers.13.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
| 556 |
+
"model.layers.13.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
|
| 557 |
+
"model.layers.13.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
| 558 |
+
"model.layers.13.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
|
| 559 |
+
"model.layers.13.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
| 560 |
+
"model.layers.13.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
| 561 |
+
"model.layers.13.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
| 562 |
+
"model.layers.13.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
| 563 |
+
"model.layers.13.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 564 |
+
"model.layers.13.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 565 |
+
"model.layers.14.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
| 566 |
+
"model.layers.14.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
|
| 567 |
+
"model.layers.14.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
| 568 |
+
"model.layers.14.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
|
| 569 |
+
"model.layers.14.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
| 570 |
+
"model.layers.14.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
|
| 571 |
+
"model.layers.14.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
| 572 |
+
"model.layers.14.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
| 573 |
+
"model.layers.14.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
| 574 |
+
"model.layers.14.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
| 575 |
+
"model.layers.14.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 576 |
+
"model.layers.14.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 577 |
+
"model.layers.15.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
| 578 |
+
"model.layers.15.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
|
| 579 |
+
"model.layers.15.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
| 580 |
+
"model.layers.15.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
|
| 581 |
+
"model.layers.15.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
| 582 |
+
"model.layers.15.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
|
| 583 |
+
"model.layers.15.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
| 584 |
+
"model.layers.15.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
| 585 |
+
"model.layers.15.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
| 586 |
+
"model.layers.15.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
| 587 |
+
"model.layers.15.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 588 |
+
"model.layers.15.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
| 589 |
+
"model.layers.16.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
| 590 |
+
"model.layers.16.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
|
| 591 |
+
"model.layers.16.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
| 592 |
+
"model.layers.16.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
|
| 593 |
+
"model.layers.16.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
| 594 |
+
"model.layers.16.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
|
| 595 |
+
"model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
| 596 |
+
"model.layers.16.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
| 597 |
+
"model.layers.16.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
| 598 |
+
"model.layers.16.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
| 599 |
+
"model.layers.16.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 600 |
+
"model.layers.16.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 601 |
+
"model.layers.17.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
| 602 |
+
"model.layers.17.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
|
| 603 |
+
"model.layers.17.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
| 604 |
+
"model.layers.17.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
|
| 605 |
+
"model.layers.17.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
| 606 |
+
"model.layers.17.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
|
| 607 |
+
"model.layers.17.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
| 608 |
+
"model.layers.17.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
| 609 |
+
"model.layers.17.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
| 610 |
+
"model.layers.17.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
| 611 |
+
"model.layers.17.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 612 |
+
"model.layers.17.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 613 |
+
"model.layers.18.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
| 614 |
+
"model.layers.18.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
|
| 615 |
+
"model.layers.18.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
| 616 |
+
"model.layers.18.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
|
| 617 |
+
"model.layers.18.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
| 618 |
+
"model.layers.18.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
|
| 619 |
+
"model.layers.18.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
| 620 |
+
"model.layers.18.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
| 621 |
+
"model.layers.18.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
| 622 |
+
"model.layers.18.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
| 623 |
+
"model.layers.18.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 624 |
+
"model.layers.18.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 625 |
+
"model.layers.19.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
| 626 |
+
"model.layers.19.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
|
| 627 |
+
"model.layers.19.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
| 628 |
+
"model.layers.19.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
|
| 629 |
+
"model.layers.19.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
| 630 |
+
"model.layers.19.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
|
| 631 |
+
"model.layers.19.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
| 632 |
+
"model.layers.19.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
| 633 |
+
"model.layers.19.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
| 634 |
+
"model.layers.19.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
| 635 |
+
"model.layers.19.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 636 |
+
"model.layers.19.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 637 |
+
"model.layers.20.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
| 638 |
+
"model.layers.20.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
|
| 639 |
+
"model.layers.20.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
| 640 |
+
"model.layers.20.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
|
| 641 |
+
"model.layers.20.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
| 642 |
+
"model.layers.20.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
|
| 643 |
+
"model.layers.20.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
| 644 |
+
"model.layers.20.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
| 645 |
+
"model.layers.20.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
| 646 |
+
"model.layers.20.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
| 647 |
+
"model.layers.20.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 648 |
+
"model.layers.20.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 649 |
+
"model.layers.21.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
| 650 |
+
"model.layers.21.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
|
| 651 |
+
"model.layers.21.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
| 652 |
+
"model.layers.21.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
|
| 653 |
+
"model.layers.21.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
| 654 |
+
"model.layers.21.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
|
| 655 |
+
"model.layers.21.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
| 656 |
+
"model.layers.21.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
| 657 |
+
"model.layers.21.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
| 658 |
+
"model.layers.21.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
| 659 |
+
"model.layers.21.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 660 |
+
"model.layers.21.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 661 |
+
"model.layers.22.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
| 662 |
+
"model.layers.22.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
|
| 663 |
+
"model.layers.22.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
| 664 |
+
"model.layers.22.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
|
| 665 |
+
"model.layers.22.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
| 666 |
+
"model.layers.22.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
|
| 667 |
+
"model.layers.22.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
| 668 |
+
"model.layers.22.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
| 669 |
+
"model.layers.22.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
| 670 |
+
"model.layers.22.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
| 671 |
+
"model.layers.22.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 672 |
+
"model.layers.22.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 673 |
+
"model.layers.23.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
| 674 |
+
"model.layers.23.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
|
| 675 |
+
"model.layers.23.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
| 676 |
+
"model.layers.23.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
|
| 677 |
+
"model.layers.23.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
| 678 |
+
"model.layers.23.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
|
| 679 |
+
"model.layers.23.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
| 680 |
+
"model.layers.23.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
| 681 |
+
"model.layers.23.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
| 682 |
+
"model.layers.23.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
| 683 |
+
"model.layers.23.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 684 |
+
"model.layers.23.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 685 |
+
"model.layers.24.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
| 686 |
+
"model.layers.24.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
|
| 687 |
+
"model.layers.24.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
| 688 |
+
"model.layers.24.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
|
| 689 |
+
"model.layers.24.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
| 690 |
+
"model.layers.24.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
|
| 691 |
+
"model.layers.24.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
| 692 |
+
"model.layers.24.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
| 693 |
+
"model.layers.24.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
| 694 |
+
"model.layers.24.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
| 695 |
+
"model.layers.24.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 696 |
+
"model.layers.24.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 697 |
+
"model.layers.25.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
| 698 |
+
"model.layers.25.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
|
| 699 |
+
"model.layers.25.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
| 700 |
+
"model.layers.25.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
|
| 701 |
+
"model.layers.25.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
| 702 |
+
"model.layers.25.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
|
| 703 |
+
"model.layers.25.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
| 704 |
+
"model.layers.25.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
| 705 |
+
"model.layers.25.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
| 706 |
+
"model.layers.25.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
| 707 |
+
"model.layers.25.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 708 |
+
"model.layers.25.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
| 709 |
+
"model.layers.26.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
| 710 |
+
"model.layers.26.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
|
| 711 |
+
"model.layers.26.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
| 712 |
+
"model.layers.26.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
|
| 713 |
+
"model.layers.26.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
| 714 |
+
"model.layers.26.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
|
| 715 |
+
"model.layers.26.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
| 716 |
+
"model.layers.26.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
| 717 |
+
"model.layers.26.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
| 718 |
+
"model.layers.26.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
|
| 719 |
+
"model.layers.26.input_layernorm.weight": "model-00004-of-00004.safetensors",
|
| 720 |
+
"model.layers.26.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
|
| 721 |
+
"model.layers.27.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
|
| 722 |
+
"model.layers.27.self_attn.q_proj.bias": "model-00004-of-00004.safetensors",
|
| 723 |
+
"model.layers.27.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
|
| 724 |
+
"model.layers.27.self_attn.k_proj.bias": "model-00004-of-00004.safetensors",
|
| 725 |
+
"model.layers.27.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
|
| 726 |
+
"model.layers.27.self_attn.v_proj.bias": "model-00004-of-00004.safetensors",
|
| 727 |
+
"model.layers.27.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
|
| 728 |
+
"model.layers.27.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
|
| 729 |
+
"model.layers.27.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
|
| 730 |
+
"model.layers.27.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
|
| 731 |
+
"model.layers.27.input_layernorm.weight": "model-00004-of-00004.safetensors",
|
| 732 |
+
"model.layers.27.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
|
| 733 |
+
"model.norm.weight": "model-00004-of-00004.safetensors",
|
| 734 |
+
"lm_head.weight": "model-00004-of-00004.safetensors"
|
| 735 |
+
}
|
| 736 |
+
}
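Note on the index above: `weight_map` maps each tensor name to the shard file that stores it, and a single layer can straddle two shards (layer 16 above is split between shards 2 and 3). A minimal sketch, assuming the index has already been parsed into a dict; the helper name `tensors_by_shard` is hypothetical and the sample entries are a small excerpt of the entries listed above:

```python
from collections import defaultdict


def tensors_by_shard(index: dict) -> dict:
    """Group tensor names by the shard file that stores them (hypothetical helper)."""
    shards = defaultdict(list)
    for name, shard in index["weight_map"].items():
        shards[shard].append(name)
    return dict(shards)


# Small excerpt of the weight_map shown above, not the full 736-line index.
example_index = {
    "weight_map": {
        "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
        "model.layers.16.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
        "model.norm.weight": "model-00004-of-00004.safetensors",
        "lm_head.weight": "model-00004-of-00004.safetensors",
    }
}

grouped = tensors_by_shard(example_index)
for shard_name, tensor_names in sorted(grouped.items()):
    print(shard_name, len(tensor_names))
```

With the real file, the same function could be applied to the result of `json.load` on `text_encoder/model.safetensors.index.json`.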
|
text_encoder/preprocessor_config.json
ADDED
|
@@ -0,0 +1,19 @@
|
| 1 |
+
{
|
| 2 |
+
"min_pixels": 3136,
|
| 3 |
+
"max_pixels": 12845056,
|
| 4 |
+
"patch_size": 14,
|
| 5 |
+
"temporal_patch_size": 2,
|
| 6 |
+
"merge_size": 2,
|
| 7 |
+
"image_mean": [
|
| 8 |
+
0.48145466,
|
| 9 |
+
0.4578275,
|
| 10 |
+
0.40821073
|
| 11 |
+
],
|
| 12 |
+
"image_std": [
|
| 13 |
+
0.26862954,
|
| 14 |
+
0.26130258,
|
| 15 |
+
0.27577711
|
| 16 |
+
],
|
| 17 |
+
"image_processor_type": "Qwen2VLImageProcessor",
|
| 18 |
+
"processor_class": "Qwen2_5_VLProcessor"
|
| 19 |
+
}
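Note on the pixel bounds above: both `min_pixels` and `max_pixels` are exact multiples of `(patch_size * merge_size)^2`. A rough sketch of that arithmetic, assuming (this is not stated in the diff) that each merged visual token covers a square of `patch_size * merge_size` pixels per side; the helper name `token_budget` is hypothetical:

```python
# Values copied from text_encoder/preprocessor_config.json above.
config = {
    "min_pixels": 3136,
    "max_pixels": 12845056,
    "patch_size": 14,
    "merge_size": 2,
}


def token_budget(cfg: dict) -> tuple:
    """Return (min, max) merged-token counts implied by the pixel bounds.

    Assumption: one merged token spans (patch_size * merge_size) pixels per side.
    """
    side = cfg["patch_size"] * cfg["merge_size"]  # 28 pixels
    area = side * side                            # 784 pixels per merged token
    return cfg["min_pixels"] // area, cfg["max_pixels"] // area


min_tokens, max_tokens = token_budget(config)
print(min_tokens, max_tokens)  # 4 16384
```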
|
text_encoder/tokenizer.json
ADDED
|
The diff for this file is too large to render.
|
text_encoder/tokenizer_config.json
ADDED
|
@@ -0,0 +1,207 @@
|
| 1 |
+
{
|
| 2 |
+
"add_prefix_space": false,
|
| 3 |
+
"added_tokens_decoder": {
|
| 4 |
+
"151643": {
|
| 5 |
+
"content": "<|endoftext|>",
|
| 6 |
+
"lstrip": false,
|
| 7 |
+
"normalized": false,
|
| 8 |
+
"rstrip": false,
|
| 9 |
+
"single_word": false,
|
| 10 |
+
"special": true
|
| 11 |
+
},
|
| 12 |
+
"151644": {
|
| 13 |
+
"content": "<|im_start|>",
|
| 14 |
+
"lstrip": false,
|
| 15 |
+
"normalized": false,
|
| 16 |
+
"rstrip": false,
|
| 17 |
+
"single_word": false,
|
| 18 |
+
"special": true
|
| 19 |
+
},
|
| 20 |
+
"151645": {
|
| 21 |
+
"content": "<|im_end|>",
|
| 22 |
+
"lstrip": false,
|
| 23 |
+
"normalized": false,
|
| 24 |
+
"rstrip": false,
|
| 25 |
+
"single_word": false,
|
| 26 |
+
"special": true
|
| 27 |
+
},
|
| 28 |
+
"151646": {
|
| 29 |
+
"content": "<|object_ref_start|>",
|
| 30 |
+
"lstrip": false,
|
| 31 |
+
"normalized": false,
|
| 32 |
+
"rstrip": false,
|
| 33 |
+
"single_word": false,
|
| 34 |
+
"special": true
|
| 35 |
+
},
|
| 36 |
+
"151647": {
|
| 37 |
+
"content": "<|object_ref_end|>",
|
| 38 |
+
"lstrip": false,
|
| 39 |
+
"normalized": false,
|
| 40 |
+
"rstrip": false,
|
| 41 |
+
"single_word": false,
|
| 42 |
+
"special": true
|
| 43 |
+
},
|
| 44 |
+
"151648": {
|
| 45 |
+
"content": "<|box_start|>",
|
| 46 |
+
"lstrip": false,
|
| 47 |
+
"normalized": false,
|
| 48 |
+
"rstrip": false,
|
| 49 |
+
"single_word": false,
|
| 50 |
+
"special": true
|
| 51 |
+
},
|
| 52 |
+
"151649": {
|
| 53 |
+
"content": "<|box_end|>",
|
| 54 |
+
"lstrip": false,
|
| 55 |
+
"normalized": false,
|
| 56 |
+
"rstrip": false,
|
| 57 |
+
"single_word": false,
|
| 58 |
+
"special": true
|
| 59 |
+
},
|
| 60 |
+
"151650": {
|
| 61 |
+
"content": "<|quad_start|>",
|
| 62 |
+
"lstrip": false,
|
| 63 |
+
"normalized": false,
|
| 64 |
+
"rstrip": false,
|
| 65 |
+
"single_word": false,
|
| 66 |
+
"special": true
|
| 67 |
+
},
|
| 68 |
+
"151651": {
|
| 69 |
+
"content": "<|quad_end|>",
|
| 70 |
+
"lstrip": false,
|
| 71 |
+
"normalized": false,
|
| 72 |
+
"rstrip": false,
|
| 73 |
+
"single_word": false,
|
| 74 |
+
"special": true
|
| 75 |
+
},
|
| 76 |
+
"151652": {
|
| 77 |
+
"content": "<|vision_start|>",
|
| 78 |
+
"lstrip": false,
|
| 79 |
+
"normalized": false,
|
| 80 |
+
"rstrip": false,
|
| 81 |
+
"single_word": false,
|
| 82 |
+
"special": true
|
| 83 |
+
},
|
| 84 |
+
"151653": {
|
| 85 |
+
"content": "<|vision_end|>",
|
| 86 |
+
"lstrip": false,
|
| 87 |
+
"normalized": false,
|
| 88 |
+
"rstrip": false,
|
| 89 |
+
"single_word": false,
|
| 90 |
+
"special": true
|
| 91 |
+
},
|
| 92 |
+
"151654": {
|
| 93 |
+
"content": "<|vision_pad|>",
|
| 94 |
+
"lstrip": false,
|
| 95 |
+
"normalized": false,
|
| 96 |
+
"rstrip": false,
|
| 97 |
+
"single_word": false,
|
| 98 |
+
"special": true
|
| 99 |
+
},
|
| 100 |
+
"151655": {
|
| 101 |
+
"content": "<|image_pad|>",
|
| 102 |
+
"lstrip": false,
|
| 103 |
+
"normalized": false,
|
| 104 |
+
"rstrip": false,
|
| 105 |
+
"single_word": false,
|
| 106 |
+
"special": true
|
| 107 |
+
},
|
| 108 |
+
"151656": {
|
| 109 |
+
"content": "<|video_pad|>",
|
| 110 |
+
"lstrip": false,
|
| 111 |
+
"normalized": false,
|
| 112 |
+
"rstrip": false,
|
| 113 |
+
"single_word": false,
|
| 114 |
+
"special": true
|
| 115 |
+
},
|
| 116 |
+
"151657": {
|
| 117 |
+
"content": "<tool_call>",
|
| 118 |
+
"lstrip": false,
|
| 119 |
+
"normalized": false,
|
| 120 |
+
"rstrip": false,
|
| 121 |
+
"single_word": false,
|
| 122 |
+
"special": false
|
| 123 |
+
},
|
| 124 |
+
"151658": {
|
| 125 |
+
"content": "</tool_call>",
|
| 126 |
+
"lstrip": false,
|
| 127 |
+
"normalized": false,
|
| 128 |
+
"rstrip": false,
|
| 129 |
+
"single_word": false,
|
| 130 |
+
"special": false
|
| 131 |
+
},
|
| 132 |
+
"151659": {
|
| 133 |
+
"content": "<|fim_prefix|>",
|
| 134 |
+
"lstrip": false,
|
| 135 |
+
"normalized": false,
|
| 136 |
+
"rstrip": false,
|
| 137 |
+
"single_word": false,
|
| 138 |
+
"special": false
|
| 139 |
+
},
|
| 140 |
+
"151660": {
|
| 141 |
+
"content": "<|fim_middle|>",
|
| 142 |
+
"lstrip": false,
|
| 143 |
+
"normalized": false,
|
| 144 |
+
"rstrip": false,
|
| 145 |
+
"single_word": false,
|
| 146 |
+
"special": false
|
| 147 |
+
},
|
| 148 |
+
"151661": {
|
| 149 |
+
"content": "<|fim_suffix|>",
|
| 150 |
+
"lstrip": false,
|
| 151 |
+
"normalized": false,
|
| 152 |
+
"rstrip": false,
|
| 153 |
+
"single_word": false,
|
| 154 |
+
"special": false
|
| 155 |
+
},
|
| 156 |
+
"151662": {
|
| 157 |
+
"content": "<|fim_pad|>",
|
| 158 |
+
"lstrip": false,
|
| 159 |
+
"normalized": false,
|
| 160 |
+
"rstrip": false,
|
| 161 |
+
"single_word": false,
|
| 162 |
+
"special": false
|
| 163 |
+
},
|
| 164 |
+
"151663": {
|
| 165 |
+
"content": "<|repo_name|>",
|
| 166 |
+
"lstrip": false,
|
| 167 |
+
"normalized": false,
|
| 168 |
+
"rstrip": false,
|
| 169 |
+
"single_word": false,
|
| 170 |
+
"special": false
|
| 171 |
+
},
|
| 172 |
+
"151664": {
|
| 173 |
+
"content": "<|file_sep|>",
|
| 174 |
+
"lstrip": false,
|
| 175 |
+
"normalized": false,
|
| 176 |
+
"rstrip": false,
|
| 177 |
+
"single_word": false,
|
| 178 |
+
"special": false
|
| 179 |
+
}
|
| 180 |
+
},
|
| 181 |
+
"additional_special_tokens": [
|
| 182 |
+
"<|im_start|>",
|
| 183 |
+
"<|im_end|>",
|
| 184 |
+
"<|object_ref_start|>",
|
| 185 |
+
"<|object_ref_end|>",
|
| 186 |
+
"<|box_start|>",
|
| 187 |
+
"<|box_end|>",
|
| 188 |
+
"<|quad_start|>",
|
| 189 |
+
"<|quad_end|>",
|
| 190 |
+
"<|vision_start|>",
|
| 191 |
+
"<|vision_end|>",
|
| 192 |
+
"<|vision_pad|>",
|
| 193 |
+
"<|image_pad|>",
|
| 194 |
+
"<|video_pad|>"
|
| 195 |
+
],
|
| 196 |
+
"bos_token": null,
|
| 197 |
+
"chat_template": "{% set image_count = namespace(value=0) %}{% set video_count = namespace(value=0) %}{% for message in messages %}{% if loop.first and message['role'] != 'system' %}<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n{% endif %}<|im_start|>{{ message['role'] }}\n{% if message['content'] is string %}{{ message['content'] }}<|im_end|>\n{% else %}{% for content in message['content'] %}{% if content['type'] == 'image' or 'image' in content or 'image_url' in content %}{% set image_count.value = image_count.value + 1 %}{% if add_vision_id %}Picture {{ image_count.value }}: {% endif %}<|vision_start|><|image_pad|><|vision_end|>{% elif content['type'] == 'video' or 'video' in content %}{% set video_count.value = video_count.value + 1 %}{% if add_vision_id %}Video {{ video_count.value }}: {% endif %}<|vision_start|><|video_pad|><|vision_end|>{% elif 'text' in content %}{{ content['text'] }}{% endif %}{% endfor %}<|im_end|>\n{% endif %}{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}",
|
| 198 |
+
"clean_up_tokenization_spaces": false,
|
| 199 |
+
"eos_token": "<|im_end|>",
|
| 200 |
+
"errors": "replace",
|
| 201 |
+
"model_max_length": 131072,
|
| 202 |
+
"pad_token": "<|endoftext|>",
|
| 203 |
+
"split_special_tokens": false,
|
| 204 |
+
"tokenizer_class": "Qwen2Tokenizer",
|
| 205 |
+
"unk_token": null,
|
| 206 |
+
"add_bos_token": false
|
| 207 |
+
}
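Note on the tokenizer config above: `eos_token` and `pad_token` are given as strings, and their integer IDs can be recovered from `added_tokens_decoder`. A minimal sketch using only a subset of the entries shown above; the helper name `token_id` is hypothetical:

```python
# Subset of text_encoder/tokenizer_config.json as shown above.
tokenizer_config = {
    "added_tokens_decoder": {
        "151643": {"content": "<|endoftext|>", "special": True},
        "151644": {"content": "<|im_start|>", "special": True},
        "151645": {"content": "<|im_end|>", "special": True},
        "151657": {"content": "<tool_call>", "special": False},
    },
    "eos_token": "<|im_end|>",
    "pad_token": "<|endoftext|>",
}


def token_id(cfg: dict, token: str):
    """Look up the integer ID of an added token by its content (hypothetical helper)."""
    for raw_id, entry in cfg["added_tokens_decoder"].items():
        if entry["content"] == token:
            return int(raw_id)
    return None  # token is not among the added tokens


eos_id = token_id(tokenizer_config, tokenizer_config["eos_token"])
pad_id = token_id(tokenizer_config, tokenizer_config["pad_token"])
print(eos_id, pad_id)  # 151645 151643
```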
|
transformer/model-00001-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:b7432669646c68871d54a475e49c127cecfb27f4bfd55b4c8fe53e1f40c18976
|
| 3 |
+
size 1993039688
|
transformer/model-00002-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:9bc1fcb536aa921d7d83078080459d95acd242f6d1ff813099a8509395545916
|
| 3 |
+
size 1908416128
|
transformer/model-00003-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:b376154e1d8afe82ebe661d0368953472f287d465d7206bbb1222b5929957807
|
| 3 |
+
size 1971331304
|
transformer/model-00004-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:4ffbf5e0e7a6bce8b77a95eb57d7425e77e8a230a82129eaabbe89082f492d40
|
| 3 |
+
size 1960845048
|
transformer/model-00005-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:612cd03a53f2f21ab611f5a19672377c159b0d4bd976e55063749348323448d3
|
| 3 |
+
size 1971331328
|
transformer/model-00006-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:f79eb7a489c5bb8a8261a4c422099c91e4685294da944fe43d2d4ad968f41cf2
|
| 3 |
+
size 1960845096
|
transformer/model-00007-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:4e24afef14ca2a08ad4c4f99a7e42dc9dbfbc6f0f6ca86b86a7497575cffc708
|
| 3 |
+
size 1971331360
|
transformer/model-00008-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:8adb489c255c38f6fd40b6f349dd38e6646db29fce513ed7b679b089f8bdf427
|
| 3 |
+
size 1960845096
|
transformer/model-00009-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:3a027cbbbb13471efee51f190f4ba6d0379f233515aac26aaeb0783a31f7a62d
|
| 3 |
+
size 1971331360
|
transformer/model-00010-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:12e4f717894e9d6fa7384b6228fea6e2bc3d425609a30fe84c3229b45a96588e
|
| 3 |
+
size 1960845096
|
transformer/model-00011-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:30fc8d64fb84c15ebe6266cd411dd695ea84058d35801d0263ebb4e17f37d206
|
| 3 |
+
size 1971331360
|
transformer/model-00012-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c4b1adc8233ab848a6d3c151a6a05aa82f3429885217dbcc70cfa72e0e9351d4
|
| 3 |
+
size 1960845096
|
transformer/model-00013-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:fc3fca0bf2f0161e90aa5d046f58f0a6c857e38c1cdf49508b34f2c5c21c7c6a
|
| 3 |
+
size 1971331360
|
transformer/model-00014-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:855247fd083449e444ec360fbfcb66c91934df9a39ea7d1f0aa4e915affedf76
|
| 3 |
+
size 1960845096
|
transformer/model-00015-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:ed1e93be3b15a1104a9e743bb14a4fb3fb4b17210d74832aec23618f96b49c15
|
| 3 |
+
size 1371776064
|
transformer/model-00016-of-00016.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:fdf7d7ec95be6c4195973dd847398fd8b1762f50f6d338040593504fbc9d2d45
|
| 3 |
+
size 629176560
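Note on the `transformer/model-*.safetensors` entries above: each is a three-line Git LFS pointer (`version`, `oid`, `size`) rather than the weights themselves. A minimal sketch of reading one, using the last pointer above as input; the helper name `parse_lfs_pointer` is hypothetical and the parser assumes every line has the form `key value`:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields (hypothetical helper)."""
    # Each line is "<key> <value>", split on the first space only.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer for transformer/model-00016-of-00016.safetensors, copied from above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:fdf7d7ec95be6c4195973dd847398fd8b1762f50f6d338040593504fbc9d2d45
size 629176560
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["algo"], pointer["size"])
```

After downloading a shard, the `size` and `digest` fields could be compared against the local file length and its SHA-256 hash.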
|
transformer/model.safetensors.index.json
ADDED
|
@@ -0,0 +1,742 @@
|
| 1 |
+
{
|
| 2 |
+
"metadata": {
|
| 3 |
+
"total_size": 29495384064
|
| 4 |
+
},
|
| 5 |
+
"weight_map": {
|
| 6 |
+
"x_embedder.proj.1.weight": "model-00001-of-00016.safetensors",
|
| 7 |
+
"t_embedder.1.linear_1.weight": "model-00001-of-00016.safetensors",
|
| 8 |
+
"t_embedder.1.linear_2.weight": "model-00001-of-00016.safetensors",
|
| 9 |
+
"blocks.0.self_attn.q_proj.weight": "model-00001-of-00016.safetensors",
|
| 10 |
+
"blocks.0.self_attn.q_norm.weight": "model-00001-of-00016.safetensors",
|
| 11 |
+
"blocks.0.self_attn.k_proj.weight": "model-00001-of-00016.safetensors",
|
| 12 |
+
"blocks.0.self_attn.k_norm.weight": "model-00001-of-00016.safetensors",
|
| 13 |
+
"blocks.0.self_attn.v_proj.weight": "model-00001-of-00016.safetensors",
|
| 14 |
+
"blocks.0.self_attn.output_proj.weight": "model-00001-of-00016.safetensors",
|
| 15 |
+
"blocks.0.cross_attn.q_proj.weight": "model-00001-of-00016.safetensors",
|
| 16 |
+
"blocks.0.cross_attn.q_norm.weight": "model-00001-of-00016.safetensors",
|
| 17 |
+
"blocks.0.cross_attn.k_proj.weight": "model-00001-of-00016.safetensors",
|
| 18 |
+
"blocks.0.cross_attn.k_norm.weight": "model-00001-of-00016.safetensors",
|
| 19 |
+
"blocks.0.cross_attn.v_proj.weight": "model-00001-of-00016.safetensors",
|
| 20 |
+
"blocks.0.cross_attn.output_proj.weight": "model-00001-of-00016.safetensors",
|
| 21 |
+
"blocks.0.mlp.layer1.weight": "model-00001-of-00016.safetensors",
|
| 22 |
+
"blocks.0.mlp.layer2.weight": "model-00001-of-00016.safetensors",
|
| 23 |
+
"blocks.0.adaln_modulation_self_attn.1.weight": "model-00001-of-00016.safetensors",
|
| 24 |
+
"blocks.0.adaln_modulation_self_attn.2.weight": "model-00001-of-00016.safetensors",
|
| 25 |
+
"blocks.0.adaln_modulation_cross_attn.1.weight": "model-00001-of-00016.safetensors",
|
| 26 |
+
"blocks.0.adaln_modulation_cross_attn.2.weight": "model-00001-of-00016.safetensors",
|
| 27 |
+
"blocks.0.adaln_modulation_mlp.1.weight": "model-00001-of-00016.safetensors",
|
| 28 |
+
"blocks.0.adaln_modulation_mlp.2.weight": "model-00001-of-00016.safetensors",
|
| 29 |
+
"blocks.1.self_attn.q_proj.weight": "model-00001-of-00016.safetensors",
|
| 30 |
+
"blocks.1.self_attn.q_norm.weight": "model-00001-of-00016.safetensors",
|
| 31 |
+
"blocks.1.self_attn.k_proj.weight": "model-00001-of-00016.safetensors",
|
| 32 |
+
"blocks.1.self_attn.k_norm.weight": "model-00001-of-00016.safetensors",
|
| 33 |
+
"blocks.1.self_attn.v_proj.weight": "model-00001-of-00016.safetensors",
|
| 34 |
+
"blocks.1.self_attn.output_proj.weight": "model-00001-of-00016.safetensors",
|
| 35 |
+
"blocks.1.cross_attn.q_proj.weight": "model-00001-of-00016.safetensors",
|
| 36 |
+
"blocks.1.cross_attn.q_norm.weight": "model-00001-of-00016.safetensors",
|
| 37 |
+
"blocks.1.cross_attn.k_proj.weight": "model-00001-of-00016.safetensors",
|
| 38 |
+
"blocks.1.cross_attn.k_norm.weight": "model-00001-of-00016.safetensors",
|
| 39 |
+
"blocks.1.cross_attn.v_proj.weight": "model-00001-of-00016.safetensors",
|
| 40 |
+
"blocks.1.cross_attn.output_proj.weight": "model-00001-of-00016.safetensors",
|
| 41 |
+
"blocks.1.mlp.layer1.weight": "model-00001-of-00016.safetensors",
|
| 42 |
+
"blocks.1.mlp.layer2.weight": "model-00001-of-00016.safetensors",
|
| 43 |
+
"blocks.1.adaln_modulation_self_attn.1.weight": "model-00001-of-00016.safetensors",
|
| 44 |
+
"blocks.1.adaln_modulation_self_attn.2.weight": "model-00001-of-00016.safetensors",
|
| 45 |
+
"blocks.1.adaln_modulation_cross_attn.1.weight": "model-00001-of-00016.safetensors",
|
| 46 |
+
"blocks.1.adaln_modulation_cross_attn.2.weight": "model-00001-of-00016.safetensors",
|
| 47 |
+
"blocks.1.adaln_modulation_mlp.1.weight": "model-00001-of-00016.safetensors",
|
| 48 |
+
"blocks.1.adaln_modulation_mlp.2.weight": "model-00001-of-00016.safetensors",
|
| 49 |
+
"blocks.2.self_attn.q_proj.weight": "model-00001-of-00016.safetensors",
|
| 50 |
+
"blocks.2.self_attn.q_norm.weight": "model-00001-of-00016.safetensors",
|
| 51 |
+
"blocks.2.self_attn.k_proj.weight": "model-00001-of-00016.safetensors",
|
| 52 |
+
"blocks.2.self_attn.k_norm.weight": "model-00001-of-00016.safetensors",
|
| 53 |
+
"blocks.2.self_attn.v_proj.weight": "model-00001-of-00016.safetensors",
|
| 54 |
+
"blocks.2.self_attn.output_proj.weight": "model-00001-of-00016.safetensors",
|
| 55 |
+
"blocks.2.cross_attn.q_proj.weight": "model-00002-of-00016.safetensors",
|
| 56 |
+
"blocks.2.cross_attn.q_norm.weight": "model-00002-of-00016.safetensors",
|
| 57 |
+
"blocks.2.cross_attn.k_proj.weight": "model-00002-of-00016.safetensors",
|
| 58 |
+
"blocks.2.cross_attn.k_norm.weight": "model-00002-of-00016.safetensors",
|
| 59 |
+
"blocks.2.cross_attn.v_proj.weight": "model-00002-of-00016.safetensors",
|
| 60 |
+
"blocks.2.cross_attn.output_proj.weight": "model-00002-of-00016.safetensors",
|
| 61 |
+
"blocks.2.mlp.layer1.weight": "model-00002-of-00016.safetensors",
|
| 62 |
+
"blocks.2.mlp.layer2.weight": "model-00002-of-00016.safetensors",
|
| 63 |
+
"blocks.2.adaln_modulation_self_attn.1.weight": "model-00002-of-00016.safetensors",
|
| 64 |
+
"blocks.2.adaln_modulation_self_attn.2.weight": "model-00002-of-00016.safetensors",
|
| 65 |
+
"blocks.2.adaln_modulation_cross_attn.1.weight": "model-00002-of-00016.safetensors",
|
| 66 |
+
"blocks.2.adaln_modulation_cross_attn.2.weight": "model-00002-of-00016.safetensors",
|
| 67 |
+
"blocks.2.adaln_modulation_mlp.1.weight": "model-00002-of-00016.safetensors",
|
| 68 |
+
"blocks.2.adaln_modulation_mlp.2.weight": "model-00002-of-00016.safetensors",
|
| 69 |
+
"blocks.3.self_attn.q_proj.weight": "model-00002-of-00016.safetensors",
|
| 70 |
+
"blocks.3.self_attn.q_norm.weight": "model-00002-of-00016.safetensors",
|
| 71 |
+
"blocks.3.self_attn.k_proj.weight": "model-00002-of-00016.safetensors",
|
| 72 |
+
"blocks.3.self_attn.k_norm.weight": "model-00002-of-00016.safetensors",
|
| 73 |
+
"blocks.3.self_attn.v_proj.weight": "model-00002-of-00016.safetensors",
|
| 74 |
+
"blocks.3.self_attn.output_proj.weight": "model-00002-of-00016.safetensors",
|
| 75 |
+
"blocks.3.cross_attn.q_proj.weight": "model-00002-of-00016.safetensors",
|
| 76 |
+
"blocks.3.cross_attn.q_norm.weight": "model-00002-of-00016.safetensors",
|
| 77 |
+
"blocks.3.cross_attn.k_proj.weight": "model-00002-of-00016.safetensors",
|
| 78 |
+
"blocks.3.cross_attn.k_norm.weight": "model-00002-of-00016.safetensors",
|
| 79 |
+
"blocks.3.cross_attn.v_proj.weight": "model-00002-of-00016.safetensors",
|
| 80 |
+
"blocks.3.cross_attn.output_proj.weight": "model-00002-of-00016.safetensors",
|
| 81 |
+
"blocks.3.mlp.layer1.weight": "model-00002-of-00016.safetensors",
|
| 82 |
+
"blocks.3.mlp.layer2.weight": "model-00002-of-00016.safetensors",
|
| 83 |
+
"blocks.3.adaln_modulation_self_attn.1.weight": "model-00002-of-00016.safetensors",
|
| 84 |
+
"blocks.3.adaln_modulation_self_attn.2.weight": "model-00002-of-00016.safetensors",
|
| 85 |
+
"blocks.3.adaln_modulation_cross_attn.1.weight": "model-00002-of-00016.safetensors",
|
| 86 |
+
"blocks.3.adaln_modulation_cross_attn.2.weight": "model-00002-of-00016.safetensors",
|
| 87 |
+
"blocks.3.adaln_modulation_mlp.1.weight": "model-00002-of-00016.safetensors",
|
| 88 |
+
"blocks.3.adaln_modulation_mlp.2.weight": "model-00002-of-00016.safetensors",
|
| 89 |
+
"blocks.4.self_attn.q_proj.weight": "model-00002-of-00016.safetensors",
|
| 90 |
+
"blocks.4.self_attn.q_norm.weight": "model-00002-of-00016.safetensors",
|
| 91 |
+
"blocks.4.self_attn.k_proj.weight": "model-00002-of-00016.safetensors",
|
| 92 |
+
"blocks.4.self_attn.k_norm.weight": "model-00002-of-00016.safetensors",
|
| 93 |
+
"blocks.4.self_attn.v_proj.weight": "model-00002-of-00016.safetensors",
|
| 94 |
+
"blocks.4.self_attn.output_proj.weight": "model-00002-of-00016.safetensors",
|
| 95 |
+
"blocks.4.cross_attn.q_proj.weight": "model-00002-of-00016.safetensors",
|
| 96 |
+
"blocks.4.cross_attn.q_norm.weight": "model-00002-of-00016.safetensors",
|
| 97 |
+
"blocks.4.cross_attn.k_proj.weight": "model-00002-of-00016.safetensors",
|
| 98 |
+
"blocks.4.cross_attn.k_norm.weight": "model-00002-of-00016.safetensors",
|
| 99 |
+
"blocks.4.cross_attn.v_proj.weight": "model-00002-of-00016.safetensors",
|
| 100 |
+
"blocks.4.cross_attn.output_proj.weight": "model-00002-of-00016.safetensors",
|
| 101 |
+
"blocks.4.mlp.layer1.weight": "model-00002-of-00016.safetensors",
|
| 102 |
+
"blocks.4.mlp.layer2.weight": "model-00003-of-00016.safetensors",
|
| 103 |
+
"blocks.4.adaln_modulation_self_attn.1.weight": "model-00003-of-00016.safetensors",
|
| 104 |
+
"blocks.4.adaln_modulation_self_attn.2.weight": "model-00003-of-00016.safetensors",
|
| 105 |
+
"blocks.4.adaln_modulation_cross_attn.1.weight": "model-00003-of-00016.safetensors",
|
| 106 |
+
"blocks.4.adaln_modulation_cross_attn.2.weight": "model-00003-of-00016.safetensors",
|
| 107 |
+
"blocks.4.adaln_modulation_mlp.1.weight": "model-00003-of-00016.safetensors",
|
| 108 |
+
"blocks.4.adaln_modulation_mlp.2.weight": "model-00003-of-00016.safetensors",
|
| 109 |
+
"blocks.5.self_attn.q_proj.weight": "model-00003-of-00016.safetensors",
|
| 110 |
+
"blocks.5.self_attn.q_norm.weight": "model-00003-of-00016.safetensors",
|
| 111 |
+
"blocks.5.self_attn.k_proj.weight": "model-00003-of-00016.safetensors",
|
| 112 |
+
"blocks.5.self_attn.k_norm.weight": "model-00003-of-00016.safetensors",
|
| 113 |
+
"blocks.5.self_attn.v_proj.weight": "model-00003-of-00016.safetensors",
|
| 114 |
+
"blocks.5.self_attn.output_proj.weight": "model-00003-of-00016.safetensors",
|
| 115 |
+
"blocks.5.cross_attn.q_proj.weight": "model-00003-of-00016.safetensors",
|
| 116 |
+
"blocks.5.cross_attn.q_norm.weight": "model-00003-of-00016.safetensors",
|
| 117 |
+
"blocks.5.cross_attn.k_proj.weight": "model-00003-of-00016.safetensors",
|
| 118 |
+
"blocks.5.cross_attn.k_norm.weight": "model-00003-of-00016.safetensors",
|
| 119 |
+
"blocks.5.cross_attn.v_proj.weight": "model-00003-of-00016.safetensors",
|
| 120 |
+
"blocks.5.cross_attn.output_proj.weight": "model-00003-of-00016.safetensors",
|
| 121 |
+
"blocks.5.mlp.layer1.weight": "model-00003-of-00016.safetensors",
|
| 122 |
+
"blocks.5.mlp.layer2.weight": "model-00003-of-00016.safetensors",
|
| 123 |
+
"blocks.5.adaln_modulation_self_attn.1.weight": "model-00003-of-00016.safetensors",
|
| 124 |
+
"blocks.5.adaln_modulation_self_attn.2.weight": "model-00003-of-00016.safetensors",
|
| 125 |
+
"blocks.5.adaln_modulation_cross_attn.1.weight": "model-00003-of-00016.safetensors",
|
| 126 |
+
"blocks.5.adaln_modulation_cross_attn.2.weight": "model-00003-of-00016.safetensors",
|
| 127 |
+
"blocks.5.adaln_modulation_mlp.1.weight": "model-00003-of-00016.safetensors",
|
| 128 |
+
"blocks.5.adaln_modulation_mlp.2.weight": "model-00003-of-00016.safetensors",
|
| 129 |
+
"blocks.6.self_attn.q_proj.weight": "model-00003-of-00016.safetensors",
|
| 130 |
+
"blocks.6.self_attn.q_norm.weight": "model-00003-of-00016.safetensors",
|
| 131 |
+
"blocks.6.self_attn.k_proj.weight": "model-00003-of-00016.safetensors",
|
| 132 |
+
"blocks.6.self_attn.k_norm.weight": "model-00003-of-00016.safetensors",
|
| 133 |
+
"blocks.6.self_attn.v_proj.weight": "model-00003-of-00016.safetensors",
|
| 134 |
+
"blocks.6.self_attn.output_proj.weight": "model-00003-of-00016.safetensors",
|
| 135 |
+
"blocks.6.cross_attn.q_proj.weight": "model-00003-of-00016.safetensors",
|
| 136 |
+
"blocks.6.cross_attn.q_norm.weight": "model-00003-of-00016.safetensors",
|
| 137 |
+
"blocks.6.cross_attn.k_proj.weight": "model-00003-of-00016.safetensors",
|
| 138 |
+
"blocks.6.cross_attn.k_norm.weight": "model-00003-of-00016.safetensors",
|
| 139 |
+
"blocks.6.cross_attn.v_proj.weight": "model-00003-of-00016.safetensors",
|
| 140 |
+
"blocks.6.cross_attn.output_proj.weight": "model-00003-of-00016.safetensors",
|
| 141 |
+
"blocks.6.mlp.layer1.weight": "model-00003-of-00016.safetensors",
|
| 142 |
+
"blocks.6.mlp.layer2.weight": "model-00003-of-00016.safetensors",
|
| 143 |
+
"blocks.6.adaln_modulation_self_attn.1.weight": "model-00003-of-00016.safetensors",
|
| 144 |
+
"blocks.6.adaln_modulation_self_attn.2.weight": "model-00003-of-00016.safetensors",
|
| 145 |
+
"blocks.6.adaln_modulation_cross_attn.1.weight": "model-00003-of-00016.safetensors",
|
| 146 |
+
"blocks.6.adaln_modulation_cross_attn.2.weight": "model-00003-of-00016.safetensors",
|
| 147 |
+
"blocks.6.adaln_modulation_mlp.1.weight": "model-00003-of-00016.safetensors",
|
| 148 |
+
"blocks.6.adaln_modulation_mlp.2.weight": "model-00003-of-00016.safetensors",
|
| 149 |
+
"blocks.7.self_attn.q_proj.weight": "model-00003-of-00016.safetensors",
|
| 150 |
+
"blocks.7.self_attn.q_norm.weight": "model-00003-of-00016.safetensors",
|
| 151 |
+
"blocks.7.self_attn.k_proj.weight": "model-00003-of-00016.safetensors",
|
| 152 |
+
"blocks.7.self_attn.k_norm.weight": "model-00003-of-00016.safetensors",
|
| 153 |
+
"blocks.7.self_attn.v_proj.weight": "model-00003-of-00016.safetensors",
|
| 154 |
+
"blocks.7.self_attn.output_proj.weight": "model-00004-of-00016.safetensors",
|
| 155 |
+
"blocks.7.cross_attn.q_proj.weight": "model-00004-of-00016.safetensors",
|
| 156 |
+
"blocks.7.cross_attn.q_norm.weight": "model-00004-of-00016.safetensors",
|
| 157 |
+
"blocks.7.cross_attn.k_proj.weight": "model-00004-of-00016.safetensors",
|
| 158 |
+
"blocks.7.cross_attn.k_norm.weight": "model-00004-of-00016.safetensors",
|
| 159 |
+
"blocks.7.cross_attn.v_proj.weight": "model-00004-of-00016.safetensors",
|
| 160 |
+
"blocks.7.cross_attn.output_proj.weight": "model-00004-of-00016.safetensors",
|
| 161 |
+
"blocks.7.mlp.layer1.weight": "model-00004-of-00016.safetensors",
|
| 162 |
+
"blocks.7.mlp.layer2.weight": "model-00004-of-00016.safetensors",
|
| 163 |
+
"blocks.7.adaln_modulation_self_attn.1.weight": "model-00004-of-00016.safetensors",
|
| 164 |
+
"blocks.7.adaln_modulation_self_attn.2.weight": "model-00004-of-00016.safetensors",
|
| 165 |
+
"blocks.7.adaln_modulation_cross_attn.1.weight": "model-00004-of-00016.safetensors",
|
| 166 |
+
"blocks.7.adaln_modulation_cross_attn.2.weight": "model-00004-of-00016.safetensors",
|
| 167 |
+
"blocks.7.adaln_modulation_mlp.1.weight": "model-00004-of-00016.safetensors",
|
| 168 |
+
"blocks.7.adaln_modulation_mlp.2.weight": "model-00004-of-00016.safetensors",
|
| 169 |
+
"blocks.8.self_attn.q_proj.weight": "model-00004-of-00016.safetensors",
|
| 170 |
+
"blocks.8.self_attn.q_norm.weight": "model-00004-of-00016.safetensors",
|
| 171 |
+
"blocks.8.self_attn.k_proj.weight": "model-00004-of-00016.safetensors",
|
| 172 |
+
"blocks.8.self_attn.k_norm.weight": "model-00004-of-00016.safetensors",
|
| 173 |
+
"blocks.8.self_attn.v_proj.weight": "model-00004-of-00016.safetensors",
|
| 174 |
+
"blocks.8.self_attn.output_proj.weight": "model-00004-of-00016.safetensors",
|
| 175 |
+
"blocks.8.cross_attn.q_proj.weight": "model-00004-of-00016.safetensors",
|
| 176 |
+
"blocks.8.cross_attn.q_norm.weight": "model-00004-of-00016.safetensors",
|
| 177 |
+
"blocks.8.cross_attn.k_proj.weight": "model-00004-of-00016.safetensors",
|
| 178 |
+
"blocks.8.cross_attn.k_norm.weight": "model-00004-of-00016.safetensors",
|
| 179 |
+
"blocks.8.cross_attn.v_proj.weight": "model-00004-of-00016.safetensors",
|
| 180 |
+
"blocks.8.cross_attn.output_proj.weight": "model-00004-of-00016.safetensors",
|
| 181 |
+
"blocks.8.mlp.layer1.weight": "model-00004-of-00016.safetensors",
|
| 182 |
+
"blocks.8.mlp.layer2.weight": "model-00004-of-00016.safetensors",
|
| 183 |
+
"blocks.8.adaln_modulation_self_attn.1.weight": "model-00004-of-00016.safetensors",
|
| 184 |
+
"blocks.8.adaln_modulation_self_attn.2.weight": "model-00004-of-00016.safetensors",
|
| 185 |
+
"blocks.8.adaln_modulation_cross_attn.1.weight": "model-00004-of-00016.safetensors",
|
| 186 |
+
"blocks.8.adaln_modulation_cross_attn.2.weight": "model-00004-of-00016.safetensors",
|
| 187 |
+
"blocks.8.adaln_modulation_mlp.1.weight": "model-00004-of-00016.safetensors",
|
| 188 |
+
"blocks.8.adaln_modulation_mlp.2.weight": "model-00004-of-00016.safetensors",
|
| 189 |
+
"blocks.9.self_attn.q_proj.weight": "model-00004-of-00016.safetensors",
|
| 190 |
+
"blocks.9.self_attn.q_norm.weight": "model-00004-of-00016.safetensors",
|
| 191 |
+
"blocks.9.self_attn.k_proj.weight": "model-00004-of-00016.safetensors",
|
| 192 |
+
"blocks.9.self_attn.k_norm.weight": "model-00004-of-00016.safetensors",
|
| 193 |
+
"blocks.9.self_attn.v_proj.weight": "model-00004-of-00016.safetensors",
|
| 194 |
+
"blocks.9.self_attn.output_proj.weight": "model-00004-of-00016.safetensors",
|
| 195 |
+
"blocks.9.cross_attn.q_proj.weight": "model-00004-of-00016.safetensors",
|
| 196 |
+
"blocks.9.cross_attn.q_norm.weight": "model-00004-of-00016.safetensors",
|
| 197 |
+
"blocks.9.cross_attn.k_proj.weight": "model-00004-of-00016.safetensors",
|
| 198 |
+
"blocks.9.cross_attn.k_norm.weight": "model-00004-of-00016.safetensors",
|
| 199 |
+
"blocks.9.cross_attn.v_proj.weight": "model-00004-of-00016.safetensors",
|
| 200 |
+
"blocks.9.cross_attn.output_proj.weight": "model-00004-of-00016.safetensors",
|
| 201 |
+
"blocks.9.mlp.layer1.weight": "model-00004-of-00016.safetensors",
|
| 202 |
+
"blocks.9.mlp.layer2.weight": "model-00005-of-00016.safetensors",
|
| 203 |
+
"blocks.9.adaln_modulation_self_attn.1.weight": "model-00005-of-00016.safetensors",
|
| 204 |
+
"blocks.9.adaln_modulation_self_attn.2.weight": "model-00005-of-00016.safetensors",
|
| 205 |
+
"blocks.9.adaln_modulation_cross_attn.1.weight": "model-00005-of-00016.safetensors",
|
| 206 |
+
"blocks.9.adaln_modulation_cross_attn.2.weight": "model-00005-of-00016.safetensors",
|
| 207 |
+
"blocks.9.adaln_modulation_mlp.1.weight": "model-00005-of-00016.safetensors",
|
| 208 |
+
"blocks.9.adaln_modulation_mlp.2.weight": "model-00005-of-00016.safetensors",
|
| 209 |
+
"blocks.10.self_attn.q_proj.weight": "model-00005-of-00016.safetensors",
|
| 210 |
+
"blocks.10.self_attn.q_norm.weight": "model-00005-of-00016.safetensors",
|
| 211 |
+
"blocks.10.self_attn.k_proj.weight": "model-00005-of-00016.safetensors",
|
| 212 |
+
"blocks.10.self_attn.k_norm.weight": "model-00005-of-00016.safetensors",
|
| 213 |
+
"blocks.10.self_attn.v_proj.weight": "model-00005-of-00016.safetensors",
|
| 214 |
+
"blocks.10.self_attn.output_proj.weight": "model-00005-of-00016.safetensors",
|
| 215 |
+
"blocks.10.cross_attn.q_proj.weight": "model-00005-of-00016.safetensors",
|
| 216 |
+
"blocks.10.cross_attn.q_norm.weight": "model-00005-of-00016.safetensors",
|
| 217 |
+
"blocks.10.cross_attn.k_proj.weight": "model-00005-of-00016.safetensors",
|
| 218 |
+
"blocks.10.cross_attn.k_norm.weight": "model-00005-of-00016.safetensors",
|
| 219 |
+
"blocks.10.cross_attn.v_proj.weight": "model-00005-of-00016.safetensors",
|
| 220 |
+
"blocks.10.cross_attn.output_proj.weight": "model-00005-of-00016.safetensors",
|
| 221 |
+
"blocks.10.mlp.layer1.weight": "model-00005-of-00016.safetensors",
|
| 222 |
+
"blocks.10.mlp.layer2.weight": "model-00005-of-00016.safetensors",
|
| 223 |
+
"blocks.10.adaln_modulation_self_attn.1.weight": "model-00005-of-00016.safetensors",
|
| 224 |
+
"blocks.10.adaln_modulation_self_attn.2.weight": "model-00005-of-00016.safetensors",
|
| 225 |
+
"blocks.10.adaln_modulation_cross_attn.1.weight": "model-00005-of-00016.safetensors",
|
| 226 |
+
"blocks.10.adaln_modulation_cross_attn.2.weight": "model-00005-of-00016.safetensors",
|
| 227 |
+
"blocks.10.adaln_modulation_mlp.1.weight": "model-00005-of-00016.safetensors",
|
| 228 |
+
"blocks.10.adaln_modulation_mlp.2.weight": "model-00005-of-00016.safetensors",
|
| 229 |
+
"blocks.11.self_attn.q_proj.weight": "model-00005-of-00016.safetensors",
|
| 230 |
+
"blocks.11.self_attn.q_norm.weight": "model-00005-of-00016.safetensors",
|
| 231 |
+
"blocks.11.self_attn.k_proj.weight": "model-00005-of-00016.safetensors",
|
| 232 |
+
"blocks.11.self_attn.k_norm.weight": "model-00005-of-00016.safetensors",
|
| 233 |
+
"blocks.11.self_attn.v_proj.weight": "model-00005-of-00016.safetensors",
|
| 234 |
+
"blocks.11.self_attn.output_proj.weight": "model-00005-of-00016.safetensors",
|
| 235 |
+
"blocks.11.cross_attn.q_proj.weight": "model-00005-of-00016.safetensors",
|
| 236 |
+
"blocks.11.cross_attn.q_norm.weight": "model-00005-of-00016.safetensors",
|
| 237 |
+
"blocks.11.cross_attn.k_proj.weight": "model-00005-of-00016.safetensors",
|
| 238 |
+
"blocks.11.cross_attn.k_norm.weight": "model-00005-of-00016.safetensors",
|
| 239 |
+
"blocks.11.cross_attn.v_proj.weight": "model-00005-of-00016.safetensors",
|
| 240 |
+
"blocks.11.cross_attn.output_proj.weight": "model-00005-of-00016.safetensors",
|
| 241 |
+
"blocks.11.mlp.layer1.weight": "model-00005-of-00016.safetensors",
|
| 242 |
+
"blocks.11.mlp.layer2.weight": "model-00005-of-00016.safetensors",
|
| 243 |
+
"blocks.11.adaln_modulation_self_attn.1.weight": "model-00005-of-00016.safetensors",
|
| 244 |
+
"blocks.11.adaln_modulation_self_attn.2.weight": "model-00005-of-00016.safetensors",
|
| 245 |
+
"blocks.11.adaln_modulation_cross_attn.1.weight": "model-00005-of-00016.safetensors",
|
| 246 |
+
"blocks.11.adaln_modulation_cross_attn.2.weight": "model-00005-of-00016.safetensors",
|
| 247 |
+
"blocks.11.adaln_modulation_mlp.1.weight": "model-00005-of-00016.safetensors",
|
| 248 |
+
"blocks.11.adaln_modulation_mlp.2.weight": "model-00005-of-00016.safetensors",
|
| 249 |
+
"blocks.12.self_attn.q_proj.weight": "model-00005-of-00016.safetensors",
|
| 250 |
+
"blocks.12.self_attn.q_norm.weight": "model-00005-of-00016.safetensors",
|
| 251 |
+
"blocks.12.self_attn.k_proj.weight": "model-00005-of-00016.safetensors",
|
| 252 |
+
"blocks.12.self_attn.k_norm.weight": "model-00005-of-00016.safetensors",
|
| 253 |
+
"blocks.12.self_attn.v_proj.weight": "model-00005-of-00016.safetensors",
|
| 254 |
+
"blocks.12.self_attn.output_proj.weight": "model-00006-of-00016.safetensors",
|
| 255 |
+
"blocks.12.cross_attn.q_proj.weight": "model-00006-of-00016.safetensors",
|
| 256 |
+
"blocks.12.cross_attn.q_norm.weight": "model-00006-of-00016.safetensors",
|
| 257 |
+
"blocks.12.cross_attn.k_proj.weight": "model-00006-of-00016.safetensors",
|
| 258 |
+
"blocks.12.cross_attn.k_norm.weight": "model-00006-of-00016.safetensors",
|
| 259 |
+
"blocks.12.cross_attn.v_proj.weight": "model-00006-of-00016.safetensors",
|
| 260 |
+
"blocks.12.cross_attn.output_proj.weight": "model-00006-of-00016.safetensors",
|
| 261 |
+
"blocks.12.mlp.layer1.weight": "model-00006-of-00016.safetensors",
|
| 262 |
+
"blocks.12.mlp.layer2.weight": "model-00006-of-00016.safetensors",
|
| 263 |
+
"blocks.12.adaln_modulation_self_attn.1.weight": "model-00006-of-00016.safetensors",
|
| 264 |
+
"blocks.12.adaln_modulation_self_attn.2.weight": "model-00006-of-00016.safetensors",
|
| 265 |
+
"blocks.12.adaln_modulation_cross_attn.1.weight": "model-00006-of-00016.safetensors",
|
| 266 |
+
"blocks.12.adaln_modulation_cross_attn.2.weight": "model-00006-of-00016.safetensors",
|
| 267 |
+
"blocks.12.adaln_modulation_mlp.1.weight": "model-00006-of-00016.safetensors",
|
| 268 |
+
"blocks.12.adaln_modulation_mlp.2.weight": "model-00006-of-00016.safetensors",
|
| 269 |
+
"blocks.13.self_attn.q_proj.weight": "model-00006-of-00016.safetensors",
|
| 270 |
+
"blocks.13.self_attn.q_norm.weight": "model-00006-of-00016.safetensors",
|
| 271 |
+
"blocks.13.self_attn.k_proj.weight": "model-00006-of-00016.safetensors",
|
| 272 |
+
"blocks.13.self_attn.k_norm.weight": "model-00006-of-00016.safetensors",
|
| 273 |
+
"blocks.13.self_attn.v_proj.weight": "model-00006-of-00016.safetensors",
|
| 274 |
+
"blocks.13.self_attn.output_proj.weight": "model-00006-of-00016.safetensors",
|
| 275 |
+
"blocks.13.cross_attn.q_proj.weight": "model-00006-of-00016.safetensors",
|
| 276 |
+
"blocks.13.cross_attn.q_norm.weight": "model-00006-of-00016.safetensors",
|
| 277 |
+
"blocks.13.cross_attn.k_proj.weight": "model-00006-of-00016.safetensors",
|
| 278 |
+
"blocks.13.cross_attn.k_norm.weight": "model-00006-of-00016.safetensors",
|
| 279 |
+
"blocks.13.cross_attn.v_proj.weight": "model-00006-of-00016.safetensors",
|
| 280 |
+
"blocks.13.cross_attn.output_proj.weight": "model-00006-of-00016.safetensors",
|
| 281 |
+
"blocks.13.mlp.layer1.weight": "model-00006-of-00016.safetensors",
|
| 282 |
+
"blocks.13.mlp.layer2.weight": "model-00006-of-00016.safetensors",
|
| 283 |
+
"blocks.13.adaln_modulation_self_attn.1.weight": "model-00006-of-00016.safetensors",
|
| 284 |
+
"blocks.13.adaln_modulation_self_attn.2.weight": "model-00006-of-00016.safetensors",
|
| 285 |
+
"blocks.13.adaln_modulation_cross_attn.1.weight": "model-00006-of-00016.safetensors",
|
| 286 |
+
"blocks.13.adaln_modulation_cross_attn.2.weight": "model-00006-of-00016.safetensors",
|
| 287 |
+
"blocks.13.adaln_modulation_mlp.1.weight": "model-00006-of-00016.safetensors",
|
| 288 |
+
"blocks.13.adaln_modulation_mlp.2.weight": "model-00006-of-00016.safetensors",
|
| 289 |
+
"blocks.14.self_attn.q_proj.weight": "model-00006-of-00016.safetensors",
|
| 290 |
+
"blocks.14.self_attn.q_norm.weight": "model-00006-of-00016.safetensors",
|
| 291 |
+
"blocks.14.self_attn.k_proj.weight": "model-00006-of-00016.safetensors",
|
| 292 |
+
"blocks.14.self_attn.k_norm.weight": "model-00006-of-00016.safetensors",
|
| 293 |
+
"blocks.14.self_attn.v_proj.weight": "model-00006-of-00016.safetensors",
|
| 294 |
+
"blocks.14.self_attn.output_proj.weight": "model-00006-of-00016.safetensors",
|
| 295 |
+
"blocks.14.cross_attn.q_proj.weight": "model-00006-of-00016.safetensors",
|
| 296 |
+
"blocks.14.cross_attn.q_norm.weight": "model-00006-of-00016.safetensors",
|
| 297 |
+
"blocks.14.cross_attn.k_proj.weight": "model-00006-of-00016.safetensors",
|
| 298 |
+
"blocks.14.cross_attn.k_norm.weight": "model-00006-of-00016.safetensors",
|
| 299 |
+
"blocks.14.cross_attn.v_proj.weight": "model-00006-of-00016.safetensors",
|
| 300 |
+
"blocks.14.cross_attn.output_proj.weight": "model-00006-of-00016.safetensors",
|
| 301 |
+
"blocks.14.mlp.layer1.weight": "model-00006-of-00016.safetensors",
|
| 302 |
+
"blocks.14.mlp.layer2.weight": "model-00007-of-00016.safetensors",
|
| 303 |
+
"blocks.14.adaln_modulation_self_attn.1.weight": "model-00007-of-00016.safetensors",
|
| 304 |
+
"blocks.14.adaln_modulation_self_attn.2.weight": "model-00007-of-00016.safetensors",
|
| 305 |
+
"blocks.14.adaln_modulation_cross_attn.1.weight": "model-00007-of-00016.safetensors",
|
| 306 |
+
"blocks.14.adaln_modulation_cross_attn.2.weight": "model-00007-of-00016.safetensors",
|
| 307 |
+
"blocks.14.adaln_modulation_mlp.1.weight": "model-00007-of-00016.safetensors",
|
| 308 |
+
"blocks.14.adaln_modulation_mlp.2.weight": "model-00007-of-00016.safetensors",
|
| 309 |
+
"blocks.15.self_attn.q_proj.weight": "model-00007-of-00016.safetensors",
|
| 310 |
+
"blocks.15.self_attn.q_norm.weight": "model-00007-of-00016.safetensors",
|
| 311 |
+
"blocks.15.self_attn.k_proj.weight": "model-00007-of-00016.safetensors",
|
| 312 |
+
"blocks.15.self_attn.k_norm.weight": "model-00007-of-00016.safetensors",
|
| 313 |
+
"blocks.15.self_attn.v_proj.weight": "model-00007-of-00016.safetensors",
|
| 314 |
+
"blocks.15.self_attn.output_proj.weight": "model-00007-of-00016.safetensors",
|
| 315 |
+
"blocks.15.cross_attn.q_proj.weight": "model-00007-of-00016.safetensors",
|
| 316 |
+
"blocks.15.cross_attn.q_norm.weight": "model-00007-of-00016.safetensors",
|
| 317 |
+
"blocks.15.cross_attn.k_proj.weight": "model-00007-of-00016.safetensors",
|
| 318 |
+
"blocks.15.cross_attn.k_norm.weight": "model-00007-of-00016.safetensors",
|
| 319 |
+
"blocks.15.cross_attn.v_proj.weight": "model-00007-of-00016.safetensors",
|
| 320 |
+
"blocks.15.cross_attn.output_proj.weight": "model-00007-of-00016.safetensors",
|
| 321 |
+
"blocks.15.mlp.layer1.weight": "model-00007-of-00016.safetensors",
|
| 322 |
+
"blocks.15.mlp.layer2.weight": "model-00007-of-00016.safetensors",
|
| 323 |
+
"blocks.15.adaln_modulation_self_attn.1.weight": "model-00007-of-00016.safetensors",
|
| 324 |
+
"blocks.15.adaln_modulation_self_attn.2.weight": "model-00007-of-00016.safetensors",
|
| 325 |
+
"blocks.15.adaln_modulation_cross_attn.1.weight": "model-00007-of-00016.safetensors",
|
| 326 |
+
"blocks.15.adaln_modulation_cross_attn.2.weight": "model-00007-of-00016.safetensors",
|
| 327 |
+
"blocks.15.adaln_modulation_mlp.1.weight": "model-00007-of-00016.safetensors",
|
| 328 |
+
"blocks.15.adaln_modulation_mlp.2.weight": "model-00007-of-00016.safetensors",
|
| 329 |
+
"blocks.16.self_attn.q_proj.weight": "model-00007-of-00016.safetensors",
|
| 330 |
+
"blocks.16.self_attn.q_norm.weight": "model-00007-of-00016.safetensors",
|
| 331 |
+
"blocks.16.self_attn.k_proj.weight": "model-00007-of-00016.safetensors",
|
| 332 |
+
"blocks.16.self_attn.k_norm.weight": "model-00007-of-00016.safetensors",
|
| 333 |
+
"blocks.16.self_attn.v_proj.weight": "model-00007-of-00016.safetensors",
|
| 334 |
+
"blocks.16.self_attn.output_proj.weight": "model-00007-of-00016.safetensors",
|
| 335 |
+
"blocks.16.cross_attn.q_proj.weight": "model-00007-of-00016.safetensors",
|
| 336 |
+
"blocks.16.cross_attn.q_norm.weight": "model-00007-of-00016.safetensors",
|
| 337 |
+
"blocks.16.cross_attn.k_proj.weight": "model-00007-of-00016.safetensors",
|
| 338 |
+
"blocks.16.cross_attn.k_norm.weight": "model-00007-of-00016.safetensors",
|
| 339 |
+
"blocks.16.cross_attn.v_proj.weight": "model-00007-of-00016.safetensors",
|
| 340 |
+
"blocks.16.cross_attn.output_proj.weight": "model-00007-of-00016.safetensors",
|
| 341 |
+
"blocks.16.mlp.layer1.weight": "model-00007-of-00016.safetensors",
|
| 342 |
+
"blocks.16.mlp.layer2.weight": "model-00007-of-00016.safetensors",
|
| 343 |
+
"blocks.16.adaln_modulation_self_attn.1.weight": "model-00007-of-00016.safetensors",
|
| 344 |
+
"blocks.16.adaln_modulation_self_attn.2.weight": "model-00007-of-00016.safetensors",
|
| 345 |
+
"blocks.16.adaln_modulation_cross_attn.1.weight": "model-00007-of-00016.safetensors",
|
| 346 |
+
"blocks.16.adaln_modulation_cross_attn.2.weight": "model-00007-of-00016.safetensors",
|
| 347 |
+
"blocks.16.adaln_modulation_mlp.1.weight": "model-00007-of-00016.safetensors",
|
| 348 |
+
"blocks.16.adaln_modulation_mlp.2.weight": "model-00007-of-00016.safetensors",
+    "blocks.17.self_attn.q_proj.weight": "model-00007-of-00016.safetensors",
+    "blocks.17.self_attn.q_norm.weight": "model-00007-of-00016.safetensors",
+    "blocks.17.self_attn.k_proj.weight": "model-00007-of-00016.safetensors",
+    "blocks.17.self_attn.k_norm.weight": "model-00007-of-00016.safetensors",
+    "blocks.17.self_attn.v_proj.weight": "model-00007-of-00016.safetensors",
+    "blocks.17.self_attn.output_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.cross_attn.q_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.cross_attn.q_norm.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.cross_attn.k_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.cross_attn.k_norm.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.cross_attn.v_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.cross_attn.output_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.mlp.layer1.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.mlp.layer2.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.adaln_modulation_self_attn.1.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.adaln_modulation_self_attn.2.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.adaln_modulation_cross_attn.1.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.adaln_modulation_cross_attn.2.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.adaln_modulation_mlp.1.weight": "model-00008-of-00016.safetensors",
+    "blocks.17.adaln_modulation_mlp.2.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.self_attn.q_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.self_attn.q_norm.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.self_attn.k_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.self_attn.k_norm.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.self_attn.v_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.self_attn.output_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.cross_attn.q_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.cross_attn.q_norm.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.cross_attn.k_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.cross_attn.k_norm.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.cross_attn.v_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.cross_attn.output_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.mlp.layer1.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.mlp.layer2.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.adaln_modulation_self_attn.1.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.adaln_modulation_self_attn.2.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.adaln_modulation_cross_attn.1.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.adaln_modulation_cross_attn.2.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.adaln_modulation_mlp.1.weight": "model-00008-of-00016.safetensors",
+    "blocks.18.adaln_modulation_mlp.2.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.self_attn.q_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.self_attn.q_norm.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.self_attn.k_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.self_attn.k_norm.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.self_attn.v_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.self_attn.output_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.cross_attn.q_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.cross_attn.q_norm.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.cross_attn.k_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.cross_attn.k_norm.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.cross_attn.v_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.cross_attn.output_proj.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.mlp.layer1.weight": "model-00008-of-00016.safetensors",
+    "blocks.19.mlp.layer2.weight": "model-00009-of-00016.safetensors",
+    "blocks.19.adaln_modulation_self_attn.1.weight": "model-00009-of-00016.safetensors",
+    "blocks.19.adaln_modulation_self_attn.2.weight": "model-00009-of-00016.safetensors",
+    "blocks.19.adaln_modulation_cross_attn.1.weight": "model-00009-of-00016.safetensors",
+    "blocks.19.adaln_modulation_cross_attn.2.weight": "model-00009-of-00016.safetensors",
+    "blocks.19.adaln_modulation_mlp.1.weight": "model-00009-of-00016.safetensors",
+    "blocks.19.adaln_modulation_mlp.2.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.self_attn.q_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.self_attn.q_norm.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.self_attn.k_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.self_attn.k_norm.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.self_attn.v_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.self_attn.output_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.cross_attn.q_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.cross_attn.q_norm.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.cross_attn.k_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.cross_attn.k_norm.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.cross_attn.v_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.cross_attn.output_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.mlp.layer1.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.mlp.layer2.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.adaln_modulation_self_attn.1.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.adaln_modulation_self_attn.2.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.adaln_modulation_cross_attn.1.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.adaln_modulation_cross_attn.2.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.adaln_modulation_mlp.1.weight": "model-00009-of-00016.safetensors",
+    "blocks.20.adaln_modulation_mlp.2.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.self_attn.q_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.self_attn.q_norm.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.self_attn.k_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.self_attn.k_norm.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.self_attn.v_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.self_attn.output_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.cross_attn.q_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.cross_attn.q_norm.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.cross_attn.k_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.cross_attn.k_norm.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.cross_attn.v_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.cross_attn.output_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.mlp.layer1.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.mlp.layer2.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.adaln_modulation_self_attn.1.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.adaln_modulation_self_attn.2.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.adaln_modulation_cross_attn.1.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.adaln_modulation_cross_attn.2.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.adaln_modulation_mlp.1.weight": "model-00009-of-00016.safetensors",
+    "blocks.21.adaln_modulation_mlp.2.weight": "model-00009-of-00016.safetensors",
+    "blocks.22.self_attn.q_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.22.self_attn.q_norm.weight": "model-00009-of-00016.safetensors",
+    "blocks.22.self_attn.k_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.22.self_attn.k_norm.weight": "model-00009-of-00016.safetensors",
+    "blocks.22.self_attn.v_proj.weight": "model-00009-of-00016.safetensors",
+    "blocks.22.self_attn.output_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.cross_attn.q_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.cross_attn.q_norm.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.cross_attn.k_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.cross_attn.k_norm.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.cross_attn.v_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.cross_attn.output_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.mlp.layer1.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.mlp.layer2.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.adaln_modulation_self_attn.1.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.adaln_modulation_self_attn.2.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.adaln_modulation_cross_attn.1.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.adaln_modulation_cross_attn.2.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.adaln_modulation_mlp.1.weight": "model-00010-of-00016.safetensors",
+    "blocks.22.adaln_modulation_mlp.2.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.self_attn.q_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.self_attn.q_norm.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.self_attn.k_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.self_attn.k_norm.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.self_attn.v_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.self_attn.output_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.cross_attn.q_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.cross_attn.q_norm.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.cross_attn.k_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.cross_attn.k_norm.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.cross_attn.v_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.cross_attn.output_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.mlp.layer1.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.mlp.layer2.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.adaln_modulation_self_attn.1.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.adaln_modulation_self_attn.2.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.adaln_modulation_cross_attn.1.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.adaln_modulation_cross_attn.2.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.adaln_modulation_mlp.1.weight": "model-00010-of-00016.safetensors",
+    "blocks.23.adaln_modulation_mlp.2.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.self_attn.q_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.self_attn.q_norm.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.self_attn.k_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.self_attn.k_norm.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.self_attn.v_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.self_attn.output_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.cross_attn.q_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.cross_attn.q_norm.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.cross_attn.k_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.cross_attn.k_norm.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.cross_attn.v_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.cross_attn.output_proj.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.mlp.layer1.weight": "model-00010-of-00016.safetensors",
+    "blocks.24.mlp.layer2.weight": "model-00011-of-00016.safetensors",
+    "blocks.24.adaln_modulation_self_attn.1.weight": "model-00011-of-00016.safetensors",
+    "blocks.24.adaln_modulation_self_attn.2.weight": "model-00011-of-00016.safetensors",
+    "blocks.24.adaln_modulation_cross_attn.1.weight": "model-00011-of-00016.safetensors",
+    "blocks.24.adaln_modulation_cross_attn.2.weight": "model-00011-of-00016.safetensors",
+    "blocks.24.adaln_modulation_mlp.1.weight": "model-00011-of-00016.safetensors",
+    "blocks.24.adaln_modulation_mlp.2.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.self_attn.q_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.self_attn.q_norm.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.self_attn.k_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.self_attn.k_norm.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.self_attn.v_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.self_attn.output_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.cross_attn.q_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.cross_attn.q_norm.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.cross_attn.k_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.cross_attn.k_norm.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.cross_attn.v_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.cross_attn.output_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.mlp.layer1.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.mlp.layer2.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.adaln_modulation_self_attn.1.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.adaln_modulation_self_attn.2.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.adaln_modulation_cross_attn.1.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.adaln_modulation_cross_attn.2.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.adaln_modulation_mlp.1.weight": "model-00011-of-00016.safetensors",
+    "blocks.25.adaln_modulation_mlp.2.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.self_attn.q_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.self_attn.q_norm.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.self_attn.k_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.self_attn.k_norm.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.self_attn.v_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.self_attn.output_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.cross_attn.q_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.cross_attn.q_norm.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.cross_attn.k_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.cross_attn.k_norm.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.cross_attn.v_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.cross_attn.output_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.mlp.layer1.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.mlp.layer2.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.adaln_modulation_self_attn.1.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.adaln_modulation_self_attn.2.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.adaln_modulation_cross_attn.1.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.adaln_modulation_cross_attn.2.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.adaln_modulation_mlp.1.weight": "model-00011-of-00016.safetensors",
+    "blocks.26.adaln_modulation_mlp.2.weight": "model-00011-of-00016.safetensors",
+    "blocks.27.self_attn.q_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.27.self_attn.q_norm.weight": "model-00011-of-00016.safetensors",
+    "blocks.27.self_attn.k_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.27.self_attn.k_norm.weight": "model-00011-of-00016.safetensors",
+    "blocks.27.self_attn.v_proj.weight": "model-00011-of-00016.safetensors",
+    "blocks.27.self_attn.output_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.cross_attn.q_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.cross_attn.q_norm.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.cross_attn.k_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.cross_attn.k_norm.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.cross_attn.v_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.cross_attn.output_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.mlp.layer1.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.mlp.layer2.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.adaln_modulation_self_attn.1.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.adaln_modulation_self_attn.2.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.adaln_modulation_cross_attn.1.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.adaln_modulation_cross_attn.2.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.adaln_modulation_mlp.1.weight": "model-00012-of-00016.safetensors",
+    "blocks.27.adaln_modulation_mlp.2.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.self_attn.q_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.self_attn.q_norm.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.self_attn.k_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.self_attn.k_norm.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.self_attn.v_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.self_attn.output_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.cross_attn.q_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.cross_attn.q_norm.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.cross_attn.k_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.cross_attn.k_norm.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.cross_attn.v_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.cross_attn.output_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.mlp.layer1.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.mlp.layer2.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.adaln_modulation_self_attn.1.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.adaln_modulation_self_attn.2.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.adaln_modulation_cross_attn.1.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.adaln_modulation_cross_attn.2.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.adaln_modulation_mlp.1.weight": "model-00012-of-00016.safetensors",
+    "blocks.28.adaln_modulation_mlp.2.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.self_attn.q_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.self_attn.q_norm.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.self_attn.k_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.self_attn.k_norm.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.self_attn.v_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.self_attn.output_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.cross_attn.q_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.cross_attn.q_norm.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.cross_attn.k_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.cross_attn.k_norm.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.cross_attn.v_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.cross_attn.output_proj.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.mlp.layer1.weight": "model-00012-of-00016.safetensors",
+    "blocks.29.mlp.layer2.weight": "model-00013-of-00016.safetensors",
+    "blocks.29.adaln_modulation_self_attn.1.weight": "model-00013-of-00016.safetensors",
+    "blocks.29.adaln_modulation_self_attn.2.weight": "model-00013-of-00016.safetensors",
+    "blocks.29.adaln_modulation_cross_attn.1.weight": "model-00013-of-00016.safetensors",
+    "blocks.29.adaln_modulation_cross_attn.2.weight": "model-00013-of-00016.safetensors",
+    "blocks.29.adaln_modulation_mlp.1.weight": "model-00013-of-00016.safetensors",
+    "blocks.29.adaln_modulation_mlp.2.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.self_attn.q_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.self_attn.q_norm.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.self_attn.k_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.self_attn.k_norm.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.self_attn.v_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.self_attn.output_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.cross_attn.q_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.cross_attn.q_norm.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.cross_attn.k_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.cross_attn.k_norm.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.cross_attn.v_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.cross_attn.output_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.mlp.layer1.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.mlp.layer2.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.adaln_modulation_self_attn.1.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.adaln_modulation_self_attn.2.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.adaln_modulation_cross_attn.1.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.adaln_modulation_cross_attn.2.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.adaln_modulation_mlp.1.weight": "model-00013-of-00016.safetensors",
+    "blocks.30.adaln_modulation_mlp.2.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.self_attn.q_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.self_attn.q_norm.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.self_attn.k_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.self_attn.k_norm.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.self_attn.v_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.self_attn.output_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.cross_attn.q_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.cross_attn.q_norm.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.cross_attn.k_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.cross_attn.k_norm.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.cross_attn.v_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.cross_attn.output_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.mlp.layer1.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.mlp.layer2.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.adaln_modulation_self_attn.1.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.adaln_modulation_self_attn.2.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.adaln_modulation_cross_attn.1.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.adaln_modulation_cross_attn.2.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.adaln_modulation_mlp.1.weight": "model-00013-of-00016.safetensors",
+    "blocks.31.adaln_modulation_mlp.2.weight": "model-00013-of-00016.safetensors",
+    "blocks.32.self_attn.q_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.32.self_attn.q_norm.weight": "model-00013-of-00016.safetensors",
+    "blocks.32.self_attn.k_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.32.self_attn.k_norm.weight": "model-00013-of-00016.safetensors",
+    "blocks.32.self_attn.v_proj.weight": "model-00013-of-00016.safetensors",
+    "blocks.32.self_attn.output_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.cross_attn.q_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.cross_attn.q_norm.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.cross_attn.k_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.cross_attn.k_norm.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.cross_attn.v_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.cross_attn.output_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.mlp.layer1.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.mlp.layer2.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.adaln_modulation_self_attn.1.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.adaln_modulation_self_attn.2.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.adaln_modulation_cross_attn.1.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.adaln_modulation_cross_attn.2.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.adaln_modulation_mlp.1.weight": "model-00014-of-00016.safetensors",
+    "blocks.32.adaln_modulation_mlp.2.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.self_attn.q_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.self_attn.q_norm.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.self_attn.k_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.self_attn.k_norm.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.self_attn.v_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.self_attn.output_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.cross_attn.q_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.cross_attn.q_norm.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.cross_attn.k_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.cross_attn.k_norm.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.cross_attn.v_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.cross_attn.output_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.mlp.layer1.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.mlp.layer2.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.adaln_modulation_self_attn.1.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.adaln_modulation_self_attn.2.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.adaln_modulation_cross_attn.1.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.adaln_modulation_cross_attn.2.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.adaln_modulation_mlp.1.weight": "model-00014-of-00016.safetensors",
+    "blocks.33.adaln_modulation_mlp.2.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.self_attn.q_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.self_attn.q_norm.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.self_attn.k_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.self_attn.k_norm.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.self_attn.v_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.self_attn.output_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.cross_attn.q_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.cross_attn.q_norm.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.cross_attn.k_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.cross_attn.k_norm.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.cross_attn.v_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.cross_attn.output_proj.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.mlp.layer1.weight": "model-00014-of-00016.safetensors",
+    "blocks.34.mlp.layer2.weight": "model-00015-of-00016.safetensors",
+    "blocks.34.adaln_modulation_self_attn.1.weight": "model-00015-of-00016.safetensors",
+    "blocks.34.adaln_modulation_self_attn.2.weight": "model-00015-of-00016.safetensors",
+    "blocks.34.adaln_modulation_cross_attn.1.weight": "model-00015-of-00016.safetensors",
+    "blocks.34.adaln_modulation_cross_attn.2.weight": "model-00015-of-00016.safetensors",
+    "blocks.34.adaln_modulation_mlp.1.weight": "model-00015-of-00016.safetensors",
+    "blocks.34.adaln_modulation_mlp.2.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.self_attn.q_proj.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.self_attn.q_norm.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.self_attn.k_proj.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.self_attn.k_norm.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.self_attn.v_proj.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.self_attn.output_proj.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.cross_attn.q_proj.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.cross_attn.q_norm.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.cross_attn.k_proj.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.cross_attn.k_norm.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.cross_attn.v_proj.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.cross_attn.output_proj.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.mlp.layer1.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.mlp.layer2.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.adaln_modulation_self_attn.1.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.adaln_modulation_self_attn.2.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.adaln_modulation_cross_attn.1.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.adaln_modulation_cross_attn.2.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.adaln_modulation_mlp.1.weight": "model-00015-of-00016.safetensors",
+    "blocks.35.adaln_modulation_mlp.2.weight": "model-00015-of-00016.safetensors",
|
| 729 |
+
"final_layer.linear.weight": "model-00015-of-00016.safetensors",
|
| 730 |
+
"final_layer.adaln_modulation.1.weight": "model-00015-of-00016.safetensors",
|
| 731 |
+
"final_layer.adaln_modulation.2.weight": "model-00015-of-00016.safetensors",
|
| 732 |
+
"t_embedding_norm.weight": "model-00015-of-00016.safetensors",
|
| 733 |
+
"action_embedder_B_D.fc1.weight": "model-00015-of-00016.safetensors",
|
| 734 |
+
"action_embedder_B_D.fc1.bias": "model-00015-of-00016.safetensors",
|
| 735 |
+
"action_embedder_B_D.fc2.weight": "model-00015-of-00016.safetensors",
|
| 736 |
+
"action_embedder_B_D.fc2.bias": "model-00015-of-00016.safetensors",
|
| 737 |
+
"action_embedder_B_3D.fc1.weight": "model-00015-of-00016.safetensors",
|
| 738 |
+
"action_embedder_B_3D.fc1.bias": "model-00015-of-00016.safetensors",
|
| 739 |
+
"action_embedder_B_3D.fc2.weight": "model-00016-of-00016.safetensors",
|
| 740 |
+
"action_embedder_B_3D.fc2.bias": "model-00016-of-00016.safetensors"
|
| 741 |
+
}
|
| 742 |
+
}
|
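The index entries above map each parameter name to the shard file that stores it. As a minimal sketch of how such a map can be inverted to see which parameters live in which shard, the snippet below uses two entries copied from the listing; the helper name `group_by_shard` is hypothetical and not part of any library.

```python
from collections import defaultdict

# Two entries copied from the weight map above; a full index lists every parameter.
weight_map = {
    "action_embedder_B_3D.fc1.bias": "model-00015-of-00016.safetensors",
    "action_embedder_B_3D.fc2.weight": "model-00016-of-00016.safetensors",
}


def group_by_shard(mapping):
    """Invert a name -> shard map into shard -> list of parameter names."""
    shards = defaultdict(list)
    for name, shard in mapping.items():
        shards[shard].append(name)
    return dict(shards)


shards = group_by_shard(weight_map)
for shard, names in sorted(shards.items()):
    print(shard, names)
```

Grouping by shard first means each shard file only has to be opened once when loading a subset of parameters.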
vae/config.json
ADDED
@@ -0,0 +1,56 @@
+{
+  "_class_name": "AutoencoderKLWan",
+  "_diffusers_version": "0.33.0.dev0",
+  "attn_scales": [],
+  "base_dim": 96,
+  "dim_mult": [
+    1,
+    2,
+    4,
+    4
+  ],
+  "dropout": 0.0,
+  "latents_mean": [
+    -0.7571,
+    -0.7089,
+    -0.9113,
+    0.1075,
+    -0.1745,
+    0.9653,
+    -0.1517,
+    1.5508,
+    0.4134,
+    -0.0715,
+    0.5517,
+    -0.3632,
+    -0.1922,
+    -0.9497,
+    0.2503,
+    -0.2921
+  ],
+  "latents_std": [
+    2.8184,
+    1.4541,
+    2.3275,
+    2.6558,
+    1.2196,
+    1.7708,
+    2.6052,
+    2.0743,
+    3.2687,
+    2.1526,
+    2.8652,
+    1.5579,
+    1.6382,
+    1.1253,
+    2.8251,
+    1.916
+  ],
+  "num_res_blocks": 2,
+  "temperal_downsample": [
+    false,
+    true,
+    true
+  ],
+  "z_dim": 16
+}
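The config above carries one `latents_mean` and one `latents_std` value per latent channel (`z_dim` is 16). A minimal sketch of how such statistics are typically applied follows, assuming the common per-channel convention `(z - mean) / std` before the denoiser and the inverse before decoding. The function names are hypothetical, and the repository's own pipeline code is not shown here, so treat the convention as an assumption.

```python
# Values copied from vae/config.json above.
latents_mean = [-0.7571, -0.7089, -0.9113, 0.1075, -0.1745, 0.9653, -0.1517, 1.5508,
                0.4134, -0.0715, 0.5517, -0.3632, -0.1922, -0.9497, 0.2503, -0.2921]
latents_std = [2.8184, 1.4541, 2.3275, 2.6558, 1.2196, 1.7708, 2.6052, 2.0743,
               3.2687, 2.1526, 2.8652, 1.5579, 1.6382, 1.1253, 2.8251, 1.916]


def normalize(z):
    """Assumed convention: shift and scale each of the 16 channels."""
    return [(v - m) / s for v, m, s in zip(z, latents_mean, latents_std)]


def denormalize(z):
    """Inverse of normalize, applied before handing latents to the VAE decoder."""
    return [v * s + m for v, m, s in zip(z, latents_mean, latents_std)]


raw = [0.0] * 16               # one hypothetical latent vector, one value per channel
normalized = normalize(raw)
restored = denormalize(normalized)
```

In a real pipeline the same arithmetic would be broadcast over the channel axis of a latent tensor instead of a flat list.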
vae/diffusion_pytorch_model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d6e524b3fffede1787a74e81b30976dce5400c4439ba64222168e607ed19e793
+size 507591892
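The three added lines are a Git LFS pointer: the repository stores this small text file, while the roughly 507 MB weight file lives in LFS storage. A minimal sketch of reading such a pointer is shown below; `parse_lfs_pointer` is a hypothetical helper, not part of Git LFS or `huggingface_hub`.

```python
# Pointer text copied from the diff above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:d6e524b3fffede1787a74e81b30976dce5400c4439ba64222168e607ed19e793
size 507591892
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line, then separate the hash algorithm from the digest."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = parse_lfs_pointer(pointer_text)
print(pointer["algo"], pointer["size"])
```

After downloading the actual file, its byte size and SHA-256 digest can be compared against these fields to confirm the download is complete.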