luodian committed on
Commit 922bffe · 1 Parent(s): 15554b9

Update README.md

Files changed (1)
  1. README.md +12 -95
README.md CHANGED
@@ -3,37 +3,27 @@ license: other
  ---
 
  <p align="center" width="100%">
- <img src="https://i.postimg.cc/CLPnPvZW/title.png" width="80%" height="80%">
  </p>
 
 
  <div>
  <div align="center">
- <a href='https://brianboli.com/' target='_blank'>Bo Li*</a>&emsp;
- <a href='https://zhangyuanhan-ai.github.io/' target='_blank'>Yuanhan Zhang*</a>&emsp;
- <a href='https://cliangyu.com/' target='_blank'>Liangyu Chen*</a>&emsp;
- <a href='https://king159.github.io/' target='_blank'>Jinghao Wang*</a>&emsp;
  </br>
- <a href='https://jingkang50.github.io/' target='_blank'>Jingkang Yang</a>&emsp;
- <a href='https://liuziwei7.github.io/' target='_blank'>Ziwei Liu</a>
  </div>
  <div>
  <div align="center">
- S-Lab, Nanyang Technological University
  </div>
-
- -----------------
-
- ![](https://img.shields.io/badge/otter-v0.1-darkcyan)
- ![](https://img.shields.io/github/stars/luodian/otter?style=social)
- [![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2FLuodian%2Fotter&count_bg=%23FFA500&title_bg=%23555555&icon=&icon_color=%23E7E7E7&title=visitors&edge_flat=false)](https://hits.seeyoufarm.com)
- <!-- [![](https://img.shields.io/badge/demo-online-orange)](https://otter.cliangyu.com) -->
- ![](https://black.readthedocs.io/en/stable/_static/license.svg)
- ![](https://img.shields.io/badge/code%20style-black-000000.svg)
-
- [Otter-9B (Huggingface Models)](https://huggingface.co/luodian/otter-9b-hf) | [Youtube Video](https://youtu.be/r-YM4DGGAdE) | [Bilibili Video](https://www.bilibili.com/video/BV1iL411h7HZ/?share_source=copy_web&vd_source=477facaaaa60694f67a784f5eaa905ad)
-
- [Live Demo (soon)](https://otter.cliangyu.com/) | [Paper (soon)]()
 
  ## 🦦 Simple Code For Otter-9B
 
@@ -91,77 +81,4 @@ generated_text = model.generate(
  )
 
  print("Generated text: ", model.text_tokenizer.decode(generated_text[0]))
- ```
-
- ## 🦦 Overview
-
- <div style="text-align:center">
- <img src="https://i.postimg.cc/Z5fkydMP/teaser.png" width="100%" height="100%">
- </div>
-
- Recent research highlights the importance of instruction tuning for empowering large language models (LLMs) to follow natural language instructions and accomplish real-world tasks effectively, as in the step from GPT-3 to ChatGPT. Flamingo is considered the GPT-3 moment of the multimodal domain.
-
- In our project, we propose 🦦 Otter, an instruction-tuned model built upon OpenFlamingo and adapted for in-context instruction tuning. We improve its conversational skills with a carefully crafted multimodal instruction-tuning dataset: each sample pairs an image-specific instruction with multiple multimodal instruction examples for that context, also known as multimodal in-context learning examples.
-
- By utilizing high-quality data, we trained 🦦 Otter with the limited resources in our lab (4x RTX-3090-24G GPUs). Remarkably, it surpassed the performance of OpenFlamingo. While Otter may not be the most advanced model and may occasionally get confused, we are committed to steadily enhancing its capabilities by adding more types of training data and a larger model. In the current era of expansive foundation models, we firmly believe that anyone should have the opportunity to train their own models, even with scarce data and resources, and cultivate their models' intelligence.
-
- ## 🦦 Examples
-
- <div style="text-align:center">
- <img src="https://i.postimg.cc/KYqmWG7j/example-description2.png" width="100%" height="100%">
- </div>
-
- ---
-
- <div style="text-align:center">
- <img src="https://i.postimg.cc/FRYh5MGZ/example-description.png" width="100%" height="100%">
- </div>
-
- ---
-
- <div style="text-align:center">
- <img src="https://i.postimg.cc/YSqp8GWT/example-understanding.png" width="100%" height="100%">
- </div>
-
- ---
-
- <div style="text-align:center">
- <img src="https://i.postimg.cc/FzjKJbjJ/examples-ict.png" width="100%" height="100%">
- </div>
-
- ---
-
- <div style="text-align:center">
- <img src="https://i.postimg.cc/JnBrfwzL/examples-ict2.png" width="100%" height="100%">
- </div>
-
- ## 🗂️ Environments
-
- You may install via `conda env create -f environment.yml`. In particular, make sure that `transformers>=4.28.0` and `accelerate>=0.18.0`.
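As a quick sanity check on those version pins, a small helper can compare dotted version strings numerically; this is a hypothetical snippet for illustration, not part of the Otter repository.

```python
# Hypothetical helper to verify the pinned minimums (transformers>=4.28.0,
# accelerate>=0.18.0); not part of the Otter repo.
def meets_min(installed: str, minimum: str) -> bool:
    """Compare dotted version strings numerically, e.g. '4.28.1' >= '4.28.0'."""
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(installed) >= parse(minimum)

print(meets_min("4.28.1", "4.28.0"))  # True
print(meets_min("0.17.0", "0.18.0"))  # False
```

Numeric comparison matters here: a plain string comparison would wrongly rank "4.9.0" above "4.28.0".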
-
- ## 🤗 Hugging Face Model
-
- You can use the 🦩 Flamingo model / 🦦 Otter model as a 🤗 Hugging Face model with only a few lines! Model configs and weights are downloaded automatically on first use.
-
- ```python
- from flamingo import FlamingoModel
- flamingo_model = FlamingoModel.from_pretrained("luodian/openflamingo-9b-hf", device_map="auto")
-
- from otter import OtterModel
- otter_model = OtterModel.from_pretrained("luodian/otter-9b-hf", device_map="auto")
- ```
-
- The original [OpenFlamingo](https://github.com/mlfoundations/open_flamingo) was developed with [DistributedDataParallel](https://pytorch.org/docs/stable/nn.html#torch.nn.parallel.DistributedDataParallel) (DDP) on an A100 cluster. Loading OpenFlamingo-9B onto a single GPU requires **at least 33G of GPU memory**, which is only available on A100 GPUs.
-
- To allow more researchers without access to A100 machines to try training OpenFlamingo, we wrap the OpenFlamingo model into a 🤗 Hugging Face model ([Jinghao](https://king159.github.io/) has submitted a [PR](https://github.com/huggingface/transformers/pull/23063) to huggingface/transformers!). With `device_map="auto"`, the large model is sharded across multiple GPUs during loading and training. This helps researchers without A100-80G GPUs achieve similar training throughput on 4x RTX-3090-24G GPUs and deploy the model on 2x RTX-3090-24G GPUs. Specific numbers are below (they may vary with CPU and disk performance, as we trained on different machines).
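To illustrate the idea behind `device_map="auto"`: sharding amounts to assigning layers to GPUs under a per-device memory budget. The toy sketch below is our own simplification with made-up layer sizes, not `accelerate`'s actual algorithm.

```python
def greedy_device_map(layer_sizes, gpu_capacities):
    """Assign each layer (in order) to the first GPU with room left.

    Toy illustration of memory-budgeted model sharding; NOT the real
    logic used by accelerate/transformers.
    """
    remaining = list(gpu_capacities)  # free memory (GB) per GPU
    device_map = {}
    for name, size in layer_sizes:
        for gpu, free in enumerate(remaining):
            if size <= free:
                device_map[name] = gpu
                remaining[gpu] -= size
                break
        else:
            raise MemoryError(f"layer {name} does not fit on any GPU")
    return device_map

# Hypothetical layer sizes in GB, split across two 24G GPUs.
layers = [("embed", 6), ("block0", 10), ("block1", 10), ("head", 6)]
print(greedy_device_map(layers, [24, 24]))
# {'embed': 0, 'block0': 0, 'block1': 1, 'head': 0}
```

The real implementation also accounts for tied weights and CPU/disk offload, but the core trade-off is the same: once one device's budget is exhausted, subsequent layers spill onto the next.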
-
- <div style="text-align:center">
- <img src="https://i.postimg.cc/LsNs55zG/table.png" width="100%" height="100%">
- </div>
-
- ---
- <div style="text-align:center">
- <img src="https://i.postimg.cc/tTcCdcv5/efficiency.png" width="100%" height="100%">
- </div>
-
- Our Otter model is developed in the same way and is deployed on the 🤗 Hugging Face model hub. It can be hosted on two RTX-3090-24G GPUs at a speed similar to a single A100-80G machine.
 
  ---
 
  <p align="center" width="100%">
+ <img src="https://i.postimg.cc/MKmyP9wH/new-banner.png" width="80%" height="80%">
  </p>
 
 
  <div>
  <div align="center">
+ <a href='https://brianboli.com/' target='_blank'>Bo Li*<sup>1</sup></a>&emsp;
+ <a href='https://zhangyuanhan-ai.github.io/' target='_blank'>Yuanhan Zhang*<sup>,1</sup></a>&emsp;
+ <a href='https://cliangyu.com/' target='_blank'>Liangyu Chen*<sup>,1</sup></a>&emsp;
+ <a href='https://king159.github.io/' target='_blank'>Jinghao Wang*<sup>,1</sup></a>&emsp;
+ <a href='https://pufanyi.github.io/' target='_blank'>Fanyi Pu*<sup>,1</sup></a>&emsp;
  </br>
+ <a href='https://jingkang50.github.io/' target='_blank'>Jingkang Yang<sup>1</sup></a>&emsp;
+ <a href='https://chunyuan.li/' target='_blank'>Chunyuan Li<sup>2</sup></a>&emsp;
+ <a href='https://liuziwei7.github.io/' target='_blank'>Ziwei Liu<sup>1</sup></a>
  </div>
  <div>
  <div align="center">
+ <sup>1</sup>S-Lab, Nanyang Technological University&emsp;
+ <sup>2</sup>Microsoft Research, Redmond
  </div>
 
  ## 🦦 Simple Code For Otter-9B
 
  )
 
  print("Generated text: ", model.text_tokenizer.decode(generated_text[0]))
+ ```