sbintuitions/sarashina2.2-ocr

sarashina2_vision

text-generation

document-understanding

vision-language


tkmtakada-sbint committed on Mar 31

Commit eafb8d4 · verified · 1 Parent(s): 7d9d23f

Update README.md

Files changed (1)

README.md +2 -2


@@ -91,7 +91,7 @@ VJRODa evaluates OCR capabilities for Japanese documents, particularly focusing
 | Model | CER(↓) | BLEU(↑) |
 | - | - | - |
 | gpt-5-mini-2025-08-07 | 72.4 | 23.6 |
-| Qwen3.5-VL-4B-Instruct | 86.1 | 47.8 |
+| Qwen3.5-4B(non-thinking) | 86.1 | 47.8 |
 | KARAKURI VL 32B Instruct 2507 | 280 | 14.1 |
 | LightOnOCR-2-1B | 158 | 28.9 |
 | dots.ocr | 40.1 | 71.5 |
@@ -268,7 +268,7 @@ The following image visualizes the output bounding boxes in red:
 ```
 @misc{sarashinaOCR2026,
   title  = {Sarashina2.2-OCR: End-to-end OCR Model for Japanese Document Parsing},
-  author = {Takumi Takada and Toshiyuki Tanaka and Kohei Uehara and Mikihiro Tanaka and Alexis Vallet and Aman Jain},
+  author = {Takumi Takada and Toshiyuki Tanaka and Kohei Uehara and Mikihiro Tanaka and Alexis Vallet and Aman Jain and Ryuichiro Hataya and Seitaro Shinagawa and Yuto Imai and Teppei Suzuki},
   year   = {2026},
   url    = {https://huggingface.co/sbintuitions/sarashina2.2-ocr}
 }