---
license: apache-2.0
datasets:
- naver-clova-ix/synthdog-ko
language:
- ko
- en
base_model:
- Qwen/Qwen2-VL-7B-Instruct
tags:
- OCR
- Korea
- Korean
---

# developer0hye/synthdog-koQwen2-VL-7B-Instruct

- [Training Code - developer0hye/synthdog-koQwen2-VL-7B-Instruct ](https://github.com/developer0hye/synthdog-koQwen2-VL-7B-Instruct)
- [naver-clova-ix/synthdog-ko dataset](https://huggingface.co/datasets/naver-clova-ix/synthdog-ko) was used to teach the model the order in which Korean sentences should be read and how to recognize Korean characters.

![synthdog-ko image/png](https://cdn-uploads.huggingface.co/production/uploads/670181c33c2e742ab844b904/0EidvPMU3IEugrXaU4VMQ.png)

- Finetune Qwen2-VL-7B-Instruct model from this weights for Korean OCR with real image datasets such as [developer0hye/korocr](https://huggingface.co/datasets/developer0hye/korocr)

![korocr image/png](https://cdn-uploads.huggingface.co/production/uploads/670181c33c2e742ab844b904/HbpicUjH4mdigAIUAqI3m.png)

- Hmm... Honestly, I'm not sure if this model has the potential to be a good Korean OCR model. But let's try fine-tuning it anyway. If you get good results, email me at developer.0hye@gmail.com 😄

- 
# Quickstart

```python

```