I've released hayai-ocr (https://github.com/NopeNopeGuy/hayai-ocr). Please use that instead, as it performs MUCH better. Note that that model is still undertrained, since I ran out of Kaggle hours, so it may be updated to perform MUCH better.
manga-ocr-finetuned
This model is a fine-tuned version of jzhang533/manga-ocr-base-2025.
Character error rate (CER) of the manga-ocr models on the evaluation set (lower is better):
| Model name | CER on full evaluation set (%) |
|---|---|
| kha-white/manga-ocr-base | 37.45 |
| jzhang533/manga-ocr-base-2025 | 37.38 |
| JustANormalTinkerer/manga-ocr-finetuned (old) | 26.25 |
| JustANormalTinkerer/manga-ocr-finetuned (new) | 15 |
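The card does not say how the CER figures above were computed. As a minimal sketch, assuming the common corpus-level definition (total character edit distance divided by total reference characters), a comparable number could be produced like this. The function names `edit_distance` and `corpus_cer` and the sample strings are illustrative, not part of the original evaluation.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def corpus_cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER in percent (assumed definition, see lead-in)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars


# Made-up example: one perfect Japanese SFX crop, one English crop with 2 errors.
refs = ["ドキドキ", "BOOM!"]
hyps = ["ドキドキ", "B0OM"]
print(f"CER: {corpus_cer(refs, hyps):.2f}%")
```

Averaging per-crop CER instead of pooling edits over the whole set gives different numbers when crop lengths vary, so the two conventions should not be mixed when comparing against the table.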
Intended uses & limitations
For manga OCR with better English and SFX (sound effect) support. It's still bad, but less bad.
Training and evaluation data
The model was trained on a private dataset of ~6k image crops from modern English-translated manga such as Kaguya-sama - Love Is War, Komi Can't Communicate, Nichijou, Akuyaku Reijou no Naka no Hito, Solo Leveling, and Witch Hat Atelier. It was also trained on a subset of the AnimeText dataset (~100k Japanese image crops and 10k English image crops) and on 45k image crops from Manga109-s COO.
It was further trained on a private dataset of ~80k image crops, pseudo-labeled by gemini-3.1-flash-lite, drawn from some volumes of Shonen Jump, Young Jump, Made In Abyss, Stand By Me, Choukadou Girl, Tsurezure Children, La Vie en Doll, Shadows House, Komi-san, Yuri Hime, Isekai Ojisan, and Kaguya-sama, plus some indie manga from Pixiv. It was trained for 3 epochs on this data.
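To make the data mix easier to see, the crop counts quoted above can be tallied in a few lines. This is only a sketch: the counts are the rounded figures from this card, the source labels are shorthand introduced here, and the tally pools both training stages even though the pseudo-labeled data was used in a later stage.

```python
# Approximate crop counts as stated in this card (most are rounded "~" values).
crops = {
    "English-translated manga (private)": 6_000,
    "AnimeText, Japanese": 100_000,
    "AnimeText, English": 10_000,
    "Manga109-s COO": 45_000,
    "Pseudo-labeled manga (private)": 80_000,
}

total = sum(crops.values())
# Share of each source in percent of all crops.
shares = {name: 100.0 * count / total for name, count in crops.items()}

for name, count in crops.items():
    print(f"{name:<36} {count:>8,} {shares[name]:5.1f}%")
print(f"{'Total':<36} {total:>8,}")
```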
Framework versions
- Transformers 5.0.0
- PyTorch 2.10.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.2
Citations
@inproceedings{baek2026mangav26,
author = {Baek, Jeonghun and Miyai, Atsuyuki and Onohara, Shota and Ikuta, Hikaru and Aizawa, Kiyoharu},
title = {Revisiting Manga109 Annotations for Modern Manga Understanding},
booktitle = {Culture × AI Workshop at ICML 2026},
year = {2026}
}
@article{aizawa2020building,
author = {Aizawa, Kiyoharu and Fujimoto, Azuma and Otsubo, Atsushi and Ogawa, Toru and Matsui, Yusuke and Tsubota, Koki and Ikuta, Hikaru},
title = {Building a Manga Dataset ``Manga109'' with Annotations for Multimedia Applications},
journal = {IEEE MultiMedia},
year = {2020},
volume = {27},
number = {2},
pages = {8--18},
doi = {10.1109/mmul.2020.2987895}
}
@article{matsui2017sketch,
author = {Matsui, Yusuke and Ito, Kota and Aramaki, Yuji and Fujimoto, Azuma and Ogawa, Toru and Yamasaki, Toshihiko and Aizawa, Kiyoharu},
title = {Sketch-based Manga Retrieval using Manga109 Dataset},
journal = {Multimedia Tools and Applications},
year = {2017},
volume = {76},
number = {20},
pages = {21811--21838},
doi = {10.1007/s11042-016-4020-z}
}
@inproceedings{baek2022coo,
author = {Baek, Jeonghun and Matsui, Yusuke and Aizawa, Kiyoharu},
title = {COO: Comic Onomatopoeia Dataset for Recognizing Arbitrary or Truncated Texts},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2022}
}