---
base_model:
  - llava-hf/llama3-llava-next-8b-hf
  - openbmb/MiniCPM-V-2_6
  - microsoft/Phi-3-vision-128k-instruct
  - Qwen/Qwen2.5-VL-7B-Instruct
license: mit
metrics:
  - accuracy
pipeline_tag: image-text-to-text
library_name: transformers
---
The following models were obtained via supervised fine-tuning (SFT) on the ECD-10k-Images dataset (URL) proposed in our ICCV 2025 paper, "Effective Training Data Synthesis for Improving MLLM Chart Understanding" (Code).
We compare the four fine-tuned MLLMs on six test sets: CharXiv, ChartQA, ReachQA, ChartBench, ChartX, and ECDBench.
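
Since the card declares `library_name: transformers` and `pipeline_tag: image-text-to-text`, the fine-tuned checkpoints can be run with the Transformers `image-text-to-text` pipeline. Below is a minimal usage sketch, assuming a recent transformers release; the repository ID and image URL are placeholders, not actual checkpoint names from this card.

```python
from transformers import pipeline

# Placeholder repo ID: replace with one of the fine-tuned checkpoints from this card.
pipe = pipeline("image-text-to-text", model="<org>/<ecd-sft-checkpoint>")

# Chat-style input: one user turn containing a chart image plus a question.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/chart.png"},  # placeholder image
            {"type": "text", "text": "What is the highest value shown in this chart?"},
        ],
    }
]

# Generate an answer; return only the newly generated text, not the prompt.
out = pipe(text=messages, max_new_tokens=128, return_full_text=False)
print(out[0]["generated_text"])
```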

Citation:
If our work is helpful to your research, please cite our paper as follows:

@inproceedings{yang2025effective,
  title={Effective Training Data Synthesis for Improving MLLM Chart Understanding},
  author={Yang, Yuwei and Zhang, Zeyu and Hou, Yunzhong and Li, Zhuowan and Liu, Gaowen and Payani, Ali and Ting, Yuan-Sen and Zheng, Liang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}
