--- license: apache-2.0 datasets: - naver-clova-ix/synthdog-ko language: - ko - en base_model: - Qwen/Qwen2-VL-7B-Instruct tags: - OCR - Korea - Korean --- # developer0hye/synthdog-koQwen2-VL-7B-Instruct - [Training Code - developer0hye/synthdog-koQwen2-VL-7B-Instruct ](https://github.com/developer0hye/synthdog-koQwen2-VL-7B-Instruct) - [naver-clova-ix/synthdog-ko dataset](https://huggingface.co/datasets/naver-clova-ix/synthdog-ko) was used to teach the model the order in which Korean sentences should be read and how to recognize Korean characters. ![synthdog-ko image/png](https://cdn-uploads.huggingface.co/production/uploads/670181c33c2e742ab844b904/0EidvPMU3IEugrXaU4VMQ.png) - Finetune Qwen2-VL-7B-Instruct model from this weights for Korean OCR with real image datasets such as [developer0hye/korocr](https://huggingface.co/datasets/developer0hye/korocr) ![korocr image/png](https://cdn-uploads.huggingface.co/production/uploads/670181c33c2e742ab844b904/HbpicUjH4mdigAIUAqI3m.png) - Hmm... Honestly, I'm not sure if this model has the potential to be a good Korean OCR model. But let's try fine-tuning it anyway. If you get good results, email me at developer.0hye@gmail.com 😄 - # Quickstart ```python ```