---
language:
- ko
pipeline_tag: image-to-text
---

# **deplot_kr**

deplot_kr is an image-to-data (text) model based on Google's pix2struct architecture. It was fine-tuned from [DePlot](https://huggingface.co/google/deplot) on a dataset of 300,000 Korean chart image-text pairs.

## **How to use**

To run a prediction, pass a chart image to the model. The model predicts the chart's underlying data table in text form.

```python
from transformers import Pix2StructForConditionalGeneration, AutoProcessor
from PIL import Image

processor = AutoProcessor.from_pretrained("brainventures/deplot_kr")
model = Pix2StructForConditionalGeneration.from_pretrained("brainventures/deplot_kr")

image_path = "IMAGE_PATH"
image = Image.open(image_path)

# The processor output holds the `flattened_patches` and `attention_mask` tensors
inputs = processor(images=image, return_tensors="pt")
pred = model.generate(
    flattened_patches=inputs["flattened_patches"],
    attention_mask=inputs["attention_mask"],
    max_length=1024,
)
# Decode the generated token ids back into text
print(processor.batch_decode(pred, skip_special_tokens=True)[0])
```

**Model Input Image**

![model_input_image](./sample.jpg)

**Model Output - Prediction**

대상:
제목: 2011-2021 보건복지 분야 일자리의 증
유형: 단일형 일반 세로 대형
 | 보건(천 명) | 복지(천 명)
1분위 | 29.7 | 178.4
2분위 | 70.8 | 97.3
3분위 | 86.4 | 61.3
4분위 | 28.2 | 16.0
5분위 | 52.3 | 0.9

### **Preprocessing**

According to [Liu et al.(2023)](https://arxiv.org/pdf/2212.10505.pdf)...

- markdown format
- `|` : separates cells (columns)
- `\n` : separates rows

### **Train**

The model was trained in a TPU environment.
- num_warmup_steps : 1,000
- num_training_steps : 40,000

## **Evaluation Results**

This model achieves the following results:

| Metric | % |
|:---|---:|
| RNSS (Relative Number Set Similarity) | 98.1615 |
| RMS (Relative Mapping Similarity) Precision | 83.1615 |
| RMS Recall | 26.3549 |
| RMS F1 Score | 31.5633 |

## Contact

For questions and comments, please use the discussion tab or email gloria@brainventur.com
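
## **Example: parsing the prediction**

The Preprocessing section describes the output as rows separated by `\n` and cells separated by `|`. The following is a minimal sketch of turning such a string into Python structures. The function name `parse_prediction` is hypothetical and not part of this model or of `transformers`. Treating lines without `|` as `key: value` metadata is an assumption based on the sample output above.

```python
def parse_prediction(text):
    """Split a predicted string into metadata and table rows.

    Assumes rows are separated by newlines and cells by '|'.
    Lines without '|' that contain ':' are treated as 'key: value'
    metadata (an assumption based on the sample output).
    """
    metadata = {}
    rows = []
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue  # skip empty lines
        if "|" in line:
            # Table row: split into cells and trim whitespace
            rows.append([cell.strip() for cell in line.split("|")])
        elif ":" in line:
            # Metadata line: split on the first colon only
            key, _, value = line.partition(":")
            metadata[key.strip()] = value.strip()
    return metadata, rows


# Shortened version of the sample prediction shown above
sample = (
    "제목: 2011-2021 보건복지 분야 일자리의 증\n"
    " | 보건(천 명) | 복지(천 명)\n"
    "1분위 | 29.7 | 178.4\n"
    "2분위 | 70.8 | 97.3"
)
metadata, rows = parse_prediction(sample)
print(metadata)
print(rows)
```

Cells are kept as strings, so numeric columns still need to be converted (for example with `float`) before further analysis.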