---
base_model:
  - llava-hf/llama3-llava-next-8b-hf
  - openbmb/MiniCPM-V-2_6
  - microsoft/Phi-3-vision-128k-instruct
  - Qwen/Qwen2.5-VL-7B-Instruct
license: mit
metrics:
  - accuracy
pipeline_tag: image-text-to-text
library_name: transformers
---
The following models were obtained via supervised fine-tuning (SFT) on the ECD-10k-Images dataset (URL) proposed in our ICCV 2025 paper, "Effective Training Data Synthesis for Improving MLLM Chart Understanding" (Code).
We compare the four fine-tuned MLLMs on six test sets: CharXiv, ChartQA, ReachQA, ChartBench, ChartX, and ECDBench.
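
Since the card declares `library_name: transformers` and `pipeline_tag: image-text-to-text`, the fine-tuned checkpoints can be run with the Transformers `image-text-to-text` pipeline. Below is a minimal usage sketch, assuming a recent transformers release; the repository ID and image URL are placeholders, not actual checkpoint names from this card.

```python
from transformers import pipeline

# Placeholder repo ID: replace with one of the fine-tuned checkpoints from this card.
pipe = pipeline("image-text-to-text", model="<org>/<ecd-sft-checkpoint>")

# Chat-style input: one user turn containing a chart image plus a question.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/chart.png"},  # placeholder image
            {"type": "text", "text": "What is the highest value shown in this chart?"},
        ],
    }
]

# Generate an answer; return only the newly generated text, not the prompt.
out = pipe(text=messages, max_new_tokens=128, return_full_text=False)
print(out[0]["generated_text"])
```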

Citation:
If our work is helpful to your research, please cite our paper as follows:

@inproceedings{yang2025effective,
  title={Effective Training Data Synthesis for Improving MLLM Chart Understanding},
  author={Yang, Yuwei and Zhang, Zeyu and Hou, Yunzhong and Li, Zhuowan and Liu, Gaowen and Payani, Ali and Ting, Yuan-Sen and Zheng, Liang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}
