YuukiAsuna/Vintern-1B-v2-ViTable-docvqa

	---
	license: mit
	datasets:
	- YuukiAsuna/VietnameseTableVQA
	language:
	- vi
	base_model:
	- 5CD-AI/Vintern-1B-v2
	pipeline_tag: document-question-answering
	library_name: transformers
	---
	# Model Card for Vintern-1B-v2-ViTable-docvqa

	<!-- Provide a quick summary of what the model is/does. -->
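	Vintern-1B-v2-ViTable-docvqa is a fine-tuned version of the 5CD-AI/Vintern-1B-v2 multimodal model for Vietnamese document visual question answering (DocVQA) on table data. The model card metadata lists YuukiAsuna/VietnameseTableVQA as the associated dataset.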


	## Benchmarks

	To be developed later

	## Quickstart

	To be developed later
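
In the meantime, here is a minimal, unofficial sketch of how inference might look. It assumes this checkpoint keeps the InternVL-style remote code of its base model, 5CD-AI/Vintern-1B-v2, and therefore exposes a `model.chat(tokenizer, pixel_values, question, generation_config)` method and expects an `<image>` placeholder in the prompt. Neither point is confirmed by this model card. The helper names (`build_question`, `load_single_tile`, `answer`), the single 448x448 tile preprocessing (the base model's own example may tile the image dynamically), and the generation settings are illustrative assumptions.

```python
MODEL_ID = "YuukiAsuna/Vintern-1B-v2-ViTable-docvqa"

# Standard ImageNet normalization statistics (assumed to match the base model).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def build_question(question: str) -> str:
    """Prefix the question with the <image> placeholder (assumed prompt format)."""
    return "<image>\n" + question.strip()


def load_single_tile(image_path: str, input_size: int = 448):
    """Load an image as one normalized tile of shape (1, 3, input_size, input_size).

    Simplification: the image is resized to a single square tile rather than
    being split into several tiles.
    """
    import numpy as np
    import torch
    from PIL import Image

    image = Image.open(image_path).convert("RGB")
    image = image.resize((input_size, input_size), Image.BICUBIC)
    arr = np.asarray(image, dtype=np.float32) / 255.0
    mean = np.array(IMAGENET_MEAN, dtype=np.float32)
    std = np.array(IMAGENET_STD, dtype=np.float32)
    arr = (arr - mean) / std
    # HWC -> CHW, then add a batch dimension.
    return torch.from_numpy(arr).permute(2, 0, 1).unsqueeze(0)


def answer(image_path: str, question: str, max_new_tokens: int = 512) -> str:
    """Ask one question about a table image (hypothetical helper)."""
    import torch
    from transformers import AutoModel, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # trust_remote_code=True runs the repository's own modeling code.
    model = AutoModel.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, trust_remote_code=True
    ).eval().to(device)
    tokenizer = AutoTokenizer.from_pretrained(
        MODEL_ID, trust_remote_code=True, use_fast=False
    )
    pixel_values = load_single_tile(image_path).to(torch.bfloat16).to(device)
    generation_config = dict(max_new_tokens=max_new_tokens, do_sample=False)
    # `chat` is assumed to be provided by the InternVL-style remote code.
    return model.chat(
        tokenizer, pixel_values, build_question(question), generation_config
    )
```

With a local image (the file name `table.png` is a placeholder), a call such as `answer("table.png", "Tổng doanh thu là bao nhiêu?")` would return the model's answer as a string, provided the assumptions above hold.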

	## Citation

	```bibtex
	@misc{doan2024vintern1befficientmultimodallarge,
	title={Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese},
	author={Khang T. Doan and Bao G. Huynh and Dung T. Hoang and Thuc D. Pham and Nhat H. Pham and Quan T. M. Nguyen and Bang Q. Vo and Suong N. Hoang},
	year={2024},
	eprint={2408.12480},
	archivePrefix={arXiv},
	primaryClass={cs.LG},
	url={https://arxiv.org/abs/2408.12480},
	}
	```