  journal={Transactions on Machine Learning Research},
  year={2024}
}
```

## Acknowledgement

We extend our sincere gratitude to the **AIAK team** of the [**Baige AI computing platform**](https://cloud.baidu.com/product/aihc.html) from **Baidu AI Cloud** for providing an exceptional training framework. The outstanding capabilities of AIAK-Training-LLM and AIAK-Megatron significantly accelerated our training process, and these cutting-edge frameworks were instrumental in achieving our research goals. To get full AIAK support, you can contact Baidu AI Cloud.

We acknowledge the support of [Synvo AI](https://synvo.ai/) for contributing partial data annotation to this work, and we also thank the maintainers and contributors of the following open-source projects, whose work greatly inspired and supported our research:

- LLaVA: Large Language-and-Vision Assistant → [LLaVA](https://github.com/haotian-liu/LLaVA)
- LLaVA-NeXT: Next-generation multi-modal assistant → [LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT)
- lmms-eval: A standardized evaluation framework for Large Multimodal Models → [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval)
- Megatron-LM: Efficient, scalable training for large language models → [Megatron-LM](https://github.com/NVIDIA/Megatron-LM)
- Qwen2.5-VL: Strong vision-language foundation model → [Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL)
- InternVL: Open-source large-scale vision-language foundation model → [InternVL](https://github.com/OpenGVLab/InternVL)
- Qwen3: Next-generation Qwen LLM → [Qwen](https://github.com/QwenLM/Qwen)
- MetaCLIP: Scalable contrastive pretraining → [MetaCLIP](https://github.com/facebookresearch/MetaCLIP)
- FineVision: Open Data Is All You Need → [FineVision](https://huggingface.co/spaces/HuggingFaceM4/FineVision)