metadata
license: creativeml-openrail-m

Model Card for Kanji_ETL9G

Summary:

ETL9G
 : 607200 samples
 : 3036 classes (hiragana and kanji)
 : 200 samples per class
 : record_length: 8199 bytes (raw ETL9G record)
 : image_width: 64px (after resizing)
 : image_height: 64px (after resizing)
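
The figures above can be checked against the raw file format. The following is a minimal sketch of reading one 8199-byte record using only the Python standard library. The field layout (a 64-byte header with the JIS code in bytes 3-4, then 8128 bytes of 4-bit pixels for a 128x127 image, then 7 trailing bytes) and the high-nibble-first pixel order are assumptions based on the published ETL9G specification and should be verified against it. The names `parse_record` and `jis_to_char` are hypothetical helpers, not part of this repository.

```python
import struct

RECORD_LENGTH = 8199      # bytes per raw ETL9G record
WIDTH, HEIGHT = 128, 127  # raw image size, before resizing to 64x64

# Assumed layout: sheet number (2 bytes), JIS code (2), reading (8),
# serial number (4), 48 bytes of other metadata (skipped),
# 8128 bytes of packed 4-bit pixels, 7 trailing bytes (skipped).
RECORD_FORMAT = ">HH8sI48x8128s7x"


def jis_to_char(jis_code):
    """Convert a JIS X 0208 code such as 0x2422 to a Unicode character."""
    raw = b"\x1b$B" + jis_code.to_bytes(2, "big") + b"\x1b(B"
    return raw.decode("iso2022_jp")


def parse_record(record):
    """Return (character, rows) where rows is a 127x128 grid of 0..15 levels."""
    if len(record) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} bytes, got {len(record)}")
    _sheet, jis_code, _reading, _serial, packed = struct.unpack(RECORD_FORMAT, record)
    pixels = []
    for byte in packed:
        pixels.append(byte >> 4)    # assumed: high nibble is the left pixel
        pixels.append(byte & 0x0F)  # low nibble is the right pixel
    rows = [pixels[r * WIDTH:(r + 1) * WIDTH] for r in range(HEIGHT)]
    return jis_to_char(jis_code), rows
```

A data file could then be read in `RECORD_LENGTH`-sized chunks, calling `parse_record` on each chunk. The sample count is consistent with the class count: 3036 classes x 200 samples = 607200.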

Model Details

  • Model Name: Kanji_ETL9G
  • Version: 1.0.0
  • Model Type: Neural Network
  • Framework: PyTorch

Model Description

This model is trained on a dataset derived from ETL9G to recognize handwritten Japanese characters (kanji and hiragana) from 64x64 grayscale images. Its primary use case is optical character recognition (OCR) of handwritten Kanji characters.

Intended Use

The model is intended for OCR tasks that recognize handwritten Kanji characters in images. Possible extensions include smart dictionary lookup and handwriting-based user authentication.

Limitations

This model might have limitations regarding:

  • Handwriting styles not represented in the training set (only 200 samples per character/class were used).
  • Noise and artifacts in input images.
  • Characters written in unconventional ways.

Data Details

Training Data:

  • Dataset: Derived from the ETL9G dataset (http://etlcdb.db.aist.go.jp/specification-of-etl-9)
  • Size: 607200 samples
  • Data Type: 64x64 grayscale images of handwritten Kanji characters (downscaled from the original 128x127 images because of technical limitations)
  • Labels: 3036 unique characters (classes)
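
The card does not say which resizing method was used, so the following is only an illustrative, dependency-free sketch of bringing a 128x127 image down to 64x64 with nearest-neighbor sampling. The function names, the nested-list image representation, and the scaling of 16 gray levels to the range 0..1 are all assumptions; an imaging library would normally handle this step, and inputs at inference time should match whatever preprocessing was actually used in training.

```python
def resize_nearest(rows, out_w=64, out_h=64):
    """Downscale a 2D grid (list of rows) by nearest-neighbor sampling."""
    in_h = len(rows)
    in_w = len(rows[0])
    return [
        [rows[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def to_unit_range(rows, max_level=15):
    """Scale integer gray levels 0..max_level to floats in 0.0..1.0 (assumed)."""
    return [[value / max_level for value in row] for row in rows]


# Example with a synthetic 128x127 gradient image (not real ETL9G data).
source = [[(x + y) % 16 for x in range(128)] for y in range(127)]
small = to_unit_range(resize_nearest(source))
```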

Model Files

  • PyTorch Model: Kanji_ETL9G.pth
  • ONNX Model: Kanji_ETL9G.onnx
  • CoreML Model: not yet available (planned as a next step)

Usage

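import torch

# Load the full pickled model on CPU. weights_only=False is needed on recent
# PyTorch releases to unpickle a whole model; only use it with trusted files.
model = torch.load('Kanji_ETL9G.pth', map_location='cpu', weights_only=False)
model.eval()

# Assuming `input_tensor` holds a single preprocessed 64x64 grayscale image
with torch.no_grad():  # inference only, no gradient tracking
    output = model(input_tensor)
predicted_label = torch.argmax(output).item()  # index of the highest-scoring class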
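
Since many handwritten characters look alike, it can help to inspect several candidates rather than only the top one. The sketch below ranks class indices by softmax probability using plain Python. It assumes the model returns raw logits for a single image, which this card does not confirm, and `top_k_probabilities` is a hypothetical helper. The mapping from class index to character is not documented here, so the sketch returns indices only.

```python
import math


def top_k_probabilities(logits, k=5):
    """Return the k most likely (class_index, probability) pairs."""
    peak = max(logits)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return [(i, exps[i] / total) for i in ranked[:k]]
```

With the PyTorch output above, the scores could be passed in as `top_k_probabilities(output.flatten().tolist())`.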