PP-OCRv6 One Piece Bubble Line Recognition

Browser-ready OCR package for Projet Poneglyph. The pipeline runs in four steps: it detects text lines inside a manga bubble with YOLO26n, stitches the detected lines into a single horizontal crop, decodes that crop with a fine-tuned PP-OCRv6 recognizer, and then applies conservative spacing and case postprocessing rules, derived from the train split, that are loaded from onnx/browser_manifest.json.
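The stitching step can be pictured with a minimal sketch of the layout computation. It is an illustration only: the reading order (top to bottom, then left to right), the 48-pixel target height, the 8-pixel gap, and the names plan_strip and Placement are all assumptions, not values or identifiers taken from this package.

```python
from typing import List, NamedTuple, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in bubble pixel coordinates


class Placement(NamedTuple):
    box: Box       # source line box inside the bubble image
    x_offset: int  # left edge of the line inside the stitched strip
    width: int     # line width after scaling to the target height


def plan_strip(
    boxes: List[Box], target_height: int = 48, gap: int = 8
) -> Tuple[List[Placement], int]:
    """Lay detected line boxes out left to right in one horizontal strip.

    Hypothetical helper: the ordering rule and default sizes are assumptions.
    Returns the placements and the total strip width in pixels.
    """
    # Assumed reading order: sort by top edge, then by left edge.
    ordered = sorted(boxes, key=lambda b: (b[1], b[0]))
    placements: List[Placement] = []
    x = 0
    for box in ordered:
        x1, y1, x2, y2 = box
        w, h = x2 - x1, y2 - y1
        if w <= 0 or h <= 0:
            continue  # skip degenerate boxes
        # Scale the width so the aspect ratio is kept at the target height.
        scaled_w = max(1, round(w * target_height / h))
        if placements:
            x += gap  # blank spacer between consecutive lines
        placements.append(Placement(box, x, scaled_w))
        x += scaled_w
    return placements, x


placements, strip_width = plan_strip([(10, 50, 110, 70), (20, 10, 80, 30)])
```

An empty result (no placements, width 0) corresponds to a bubble without detected lines, which a caller would need to handle before running recognition.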

Current browser result

  • Hugging Face repo: Remidesbois/pp-ocrv6-one-piece-bubble-line-rec
  • Validation CER: 0.017155317360235393 (1.7155%)
  • Validation exact match: 0.7514693534844669 (75.15%)
  • Test CER: 0.014505395907584179 (1.4505%)
  • Test exact match: 0.7596390484003281 (75.96%)
  • Exported bubble images: 7821
  • Bubbles without detected lines: 22
  • Duplicate line-box bubbles: 0

These metrics are computed from the official 2026-06-29 release prediction CSVs after applying the browser postprocess rules, which were derived from the train split only. Before postprocessing, the same release scored 1.9204% CER / 71.62% exact match on validation and 1.7097% CER / 71.21% exact match on test.
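For readers who want to recompute figures of this kind, here is a minimal sketch of character error rate (CER) and exact match over (reference, prediction) pairs. It assumes corpus-level CER (total edit distance divided by total reference characters) and no text normalization; the release may aggregate or normalize differently, and the function names are illustrative.

```python
from typing import Dict, Iterable, Tuple


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # delete a reference character
                cur[j - 1] + 1,          # insert a predicted character
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def score(pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Corpus-level CER and exact-match rate (aggregation is an assumption)."""
    pairs = list(pairs)
    edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    chars = sum(len(ref) for ref, _ in pairs)
    exact = sum(ref == hyp for ref, hyp in pairs)
    return {
        "cer": edits / chars if chars else 0.0,
        "exact_match": exact / len(pairs) if pairs else 0.0,
    }


# Toy pairs for illustration only, not taken from the prediction CSVs.
result = score([("LUFFY", "LUFFY"), ("ONE PIECE", "ONEPIECE")])
```

The second toy pair shows why a spacing postprocess can matter: a single missing space costs one edit and also turns an otherwise correct line into an exact-match failure.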

Model and detector evidence

  • YOLO mAP50: 0.98917488275913
  • YOLO mAP50-95: 0.8715335950608416
  • Line detector mAP50: 0.995
  • Line detector mAP50-95: 0.8926965538059898
  • ONNX parity text match: true

Files

  • onnx/bubble_line_detector_yolo26n.onnx
  • onnx/ppocrv6_bubble_line_rec.onnx
  • onnx/ppocrv6_bubble_line_rec_webgpu.onnx
  • onnx/browser_manifest.json
  • onnx/pipeline_manifest.json
  • onnx/ppocrv6_postprocess_rules.json
  • postprocess_official_metrics.json
  • training_metrics.json
  • yolo_metrics.json
  • validation_predictions.csv
  • test_predictions.csv

The frontend model key is ppocrv6Line.
