This is a model based on InternLM-XComposer,
```bibtex
@misc{zhang2023internlmxcomposer,
title={InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition},
author={Pan Zhang and Xiaoyi Dong and Bin Wang and Yuhang Cao and Chao Xu and Linke Ouyang and Zhiyuan Zhao and Shuangrui Ding and Songyang Zhang and Haodong Duan and Wenwei Zhang and Hang Yan and Xinyue Zhang and Wei Li and Jingwen Li and Kai Chen and Conghui He and Xingcheng Zhang and Yu Qiao and Dahua Lin and Jiaqi Wang},
year={2023},
eprint={2309.15112},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
with additional Q-Instruct data, to specialize in low-level visual perception tasks:
```bibtex
@misc{wu2023qinstruct,
title={Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models},
author={Haoning Wu and Zicheng Zhang and Erli Zhang and Chaofeng Chen and Liang Liao and Annan Wang and Kaixin Xu and Chunyi Li and Jingwen Hou and Guangtao Zhai and Geng Xue and Wenxiu Sun and Qiong Yan and Weisi Lin},
year={2023},
eprint={2311.06783},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```