AdaptLLM committed
Commit 34da571 · verified · 1 Parent(s): 6eb426c

Update README.md

Files changed (1)
README.md +16 -5
README.md CHANGED
@@ -1,5 +1,16 @@
- ---
- license: other
- license_name: bigai
- license_link: LICENSE
- ---
+ # Adapting Multimodal Large Language Models to Domains via Post-Training
+
+ This repository contains the implementation preview of our paper `On Domain-Specific Post-Training for Multimodal Large Language Models`.
+
+ We investigate domain adaptation of MLLMs through post-training, focusing on data synthesis, training pipelines, and task evaluation.
+ (1) **Data Synthesis**: Using open-source models, we develop a visual instruction synthesizer that effectively generates diverse visual instruction tasks from domain-specific image-caption pairs. Our synthetic tasks surpass those generated by manual rules, GPT-4, and GPT-4V in enhancing the domain-specific performance of MLLMs.
+ (2) **Training Pipeline**: While two-stage training (first on image-caption pairs, then on visual instruction tasks) is commonly adopted for developing general MLLMs, we apply a single-stage training pipeline to enhance task diversity for domain-specific post-training; a sketch follows this list.
+ (3) **Task Evaluation**: We conduct experiments in two domains, biomedicine and food, by post-training MLLMs of different sources and scales (Qwen2-VL-2B, LLaVA-v1.6-8B, Llama-3.2-11B) and then evaluating their performance on various domain-specific tasks.
+ To support further research in MLLM domain adaptation, we will open-source our implementations at [https://github.com/bigai-ai](https://github.com/bigai-ai).
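+
+ To make (1) and (2) concrete, here is a minimal sketch of how synthesized instruction tasks and the original caption pairs might be interleaved into a single-stage training mix. The `synthesizer.generate` interface, the `Q:`/`A:` output format, and all function names are illustrative assumptions, not our released implementation.
+
+ ```python
+ import random
+ import re
+
+ def parse_tasks(raw_output: str):
+     """Parse 'Q: ... A: ...' pairs out of the synthesizer's raw text."""
+     return re.findall(r"Q:\s*(.+?)\s*A:\s*(.+?)(?=\nQ:|\Z)", raw_output, re.S)
+
+ def build_single_stage_dataset(pairs, synthesizer, seed=0):
+     """Mix captioning and synthesized instruction tasks into one shuffled
+     training set, instead of the usual two-stage schedule."""
+     dataset = []
+     for image, caption in pairs:
+         # Keep the original caption as a plain captioning task.
+         dataset.append({"image": image,
+                         "instruction": "Describe the image.",
+                         "response": caption})
+         # `synthesizer.generate` is an assumed interface: given an image and
+         # a text prompt, it returns free-form text containing Q/A pairs.
+         raw = synthesizer.generate(
+             image, f"Caption: {caption}\nGenerate diverse visual instruction tasks.")
+         for question, answer in parse_tasks(raw):
+             dataset.append({"image": image,
+                             "instruction": question,
+                             "response": answer})
+     random.Random(seed).shuffle(dataset)  # interleave both task sources
+     return dataset
+ ```
+
+ Shuffling both task sources into one mix is what distinguishes the single-stage recipe from training on captions first and instructions second.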
+
+ <p align='center'>
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/iklQIKW_6TyCT13BMq5-d.png" width="600">
+ </p>
+
+ ******* **Update** *********
+ - [2024/11/28] Released our paper.