Update README.md
README.md
CHANGED
@@ -15,6 +15,9 @@ datasets:
 
 # Llama-2-13b-deepspeed-visualchat
 
+> [!NOTE]
+> ATTENTION: this encoder needs QwenCLIP model
+
 DeepSpeed-VisualChat is a scalable, efficient, and user-friendly multi-modal training pipeline that leverages a novel multi-modal causal attention mechanism for better alignment of visual and text features. It uses data blending techniques to address the scarcity of interleaved text-and-image inputs in datasets.
 
 