Commit cbb4b96 (1 parent: cdef571), committed by xiaoxia-microsoft
Update README.md
README.md
CHANGED
@@ -20,4 +20,4 @@ datasets:
 DeepSpeed-VisualChat is a scalable, efficient, and user-friendly multi-modal training pipeline that leverages a novel multi-modal causal attention mechanism for better alignment of visual and text features. It uses data blending techniques to address the scarcity of interleaved text-and-image inputs in datasets.
 
 
-The framework trains using a 2B visual encoder from QWen-VL and a 70B language decoder from LLaMA-2, showcasing its extraordinary scalability. DeepSpeed-VisualChat is now open-sourced and encourages community contributions and collaborations. Visit the GitHub page to get started.
+The framework trains using a 2B visual encoder from QWen-VL and a 13B-70B language decoder from LLaMA-2, showcasing its extraordinary scalability. DeepSpeed-VisualChat is now open-sourced and encourages community contributions and collaborations. Visit the GitHub page to get started.