metadata
license: mit
datasets:
- YuukiAsuna/VietnameseTableVQA
language:
- vi
base_model:
- 5CD-AI/Vintern-1B-v2
pipeline_tag: document-question-answering
library_name: transformers
Model Card for Vintern-1B-v2-ViTable-docvqa
Vintern-1B-v2-ViTable-docvqa is a fine-tuned version of the 5CD-AI/Vintern-1B-v2 multimodal model for Vietnamese document visual question answering (DocVQA) on table data.
Benchmarks
Benchmark results will be added later.
Quickstart
An official quickstart will be added later.
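Until an official quickstart is published, the following is a minimal sketch of how the model might be queried. It assumes the fine-tuned checkpoint keeps the InternVL-style `chat` method that the base model 5CD-AI/Vintern-1B-v2 exposes through remote code (it is not part of the `transformers` API itself). `MODEL_ID` is a hypothetical placeholder, and `build_question`, `load_single_tile`, and `answer_table_question` are illustrative helper names, not part of this repository. The single 448x448 tile preprocessing is a simplification of the base model's multi-tile loader and may lose detail on dense tables.

```python
# Hypothetical placeholder: replace with the actual Hub repository ID of this model.
MODEL_ID = "<your-namespace>/Vintern-1B-v2-ViTable-docvqa"

# ImageNet statistics, as commonly used by InternVL-style vision encoders (assumption).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def build_question(question: str) -> str:
    """Prefix the question with the image placeholder used by InternVL-style chat."""
    return "<image>\n" + question.strip()


def load_single_tile(image_path: str, size: int = 448):
    """Load an image as one normalized tile of shape (1, 3, size, size)."""
    # Heavy imports are kept local so the helpers above stay importable without them.
    import torchvision.transforms as T
    from PIL import Image

    transform = T.Compose([
        T.Resize((size, size), interpolation=T.InterpolationMode.BICUBIC),
        T.ToTensor(),
        T.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
    ])
    image = Image.open(image_path).convert("RGB")
    return transform(image).unsqueeze(0)


def answer_table_question(image_path: str, question: str, model_id: str = MODEL_ID) -> str:
    """Ask one question about a table image (assumes the checkpoint provides `chat`)."""
    import torch
    from transformers import AutoModel, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = AutoModel.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True,
        trust_remote_code=True,  # required: the chat method lives in the repo's own code
    ).eval().to(device)
    tokenizer = AutoTokenizer.from_pretrained(
        model_id, trust_remote_code=True, use_fast=False
    )

    pixel_values = load_single_tile(image_path).to(torch.bfloat16).to(device)
    generation_config = dict(max_new_tokens=512, do_sample=False)
    # Assumed signature, following the base model's remote code.
    return model.chat(tokenizer, pixel_values, build_question(question), generation_config)
```

A call such as `answer_table_question("table.png", "Tổng doanh thu là bao nhiêu?")` would then return the model's answer as a string, provided `MODEL_ID` points at a real checkpoint and `table.png` exists locally.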
Citation:
@misc{doan2024vintern1befficientmultimodallarge,
title={Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese},
author={Khang T. Doan and Bao G. Huynh and Dung T. Hoang and Thuc D. Pham and Nhat H. Pham and Quan T. M. Nguyen and Bang Q. Vo and Suong N. Hoang},
year={2024},
eprint={2408.12480},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2408.12480},
}