docs: add demos

Files changed (4)

README.md CHANGED

@@ -13,7 +13,10 @@ tags:
 ### Model Description
-blip2zh-chatglm-6b is a Chinese multimodal chat model trained with blip2.
 - **blip2 base model**: [bert-base-chinese](https://huggingface.co/bert-base-chinese)
 - **Vision encoder**: eva-clip-vit-g
@@ -27,7 +30,7 @@ blip2zh-chatglm-6b is a Chinese multimodal chat model trained with blip2.
 ## Uses
-The model weights include the image encoder and blip2 but not the chatglm weights. Download [chatglm](https://huggingface.co/THUDM/chatglm-6b)([commit](https://huggingface.co/THUDM/chatglm-6b/commit/9324de70a93207c9a310cf99d5d6261791489691)) beforehand and install its dependencies.
 For loading the model and running inference, see the [api](https://github.com/XiPotatonium/chatbot-api/blob/main/src/model/blip2chatglm/__init__.py) implementation.
@@ -43,7 +46,7 @@ blip2zh-chatglm-6b is a Chinese multimodal chat model trained with blip2.
 ### Training Data
-* [laion-2b-chinese](https://huggingface.co/datasets/IDEA-CCNL/laion2B-multi-chinese-subset): we kept only 670k image-text pairs with relatively high clip scores.
 * [coco-zh](https://github.com/li-xirong/coco-cn)
 * [flickr8k-zh](http://lixirong.net/datasets/flickr8kcn)
@@ -51,6 +54,8 @@ blip2zh-chatglm-6b is a Chinese multimodal chat model trained with blip2.
 Follows the two-stage training method of blip2.
-## Evaluation
-TODO

 ### Model Description
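+blip2zh-chatglm-6b is a Chinese multimodal chat model trained with blip2. It has basic image understanding capabilities.
+Because blip2 training does not fine-tune the language model, its behavior in text-only conversations stays consistent with the original chatglm.
+Note: the model has so far only gone through the two-stage blip2 image-text alignment pre-training, with no training on specific downstream tasks such as vqa or instruction tuning, so it is still prone to generating unexpected content.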
 - **blip2 base model**: [bert-base-chinese](https://huggingface.co/bert-base-chinese)
 - **Vision encoder**: eva-clip-vit-g
 ## Uses
+The model weights include the image encoder, blip2, and chatglm-6b.
 For loading the model and running inference, see the [api](https://github.com/XiPotatonium/chatbot-api/blob/main/src/model/blip2chatglm/__init__.py) implementation.
 ### Training Data
+* [laion-2b-chinese](https://huggingface.co/datasets/IDEA-CCNL/laion2B-multi-chinese-subset): we kept only 670k image-text pairs with relatively high clip scores and trained on a sample of them.
 * [coco-zh](https://github.com/li-xirong/coco-cn)
 * [flickr8k-zh](http://lixirong.net/datasets/flickr8kcn)
 Follows the two-stage training method of blip2.
+## Demos
+![](imgs/demo1.png)
+![](imgs/demo2.png)
+![](imgs/demo3.png)

imgs/demo1.png ADDED

imgs/demo2.png ADDED

imgs/demo3.png ADDED