finetuned-MS-swin-doc-text-classifer

This model is a fine-tuned version of Microsoft’s Swin Transformer tiny-sized model microsoft/swin-tiny-patch4-window7-224 on the ernie-ai/image-text-examples-ar-cn-latin-notext dataset. It achieves the following results on the evaluation set:

  • Loss: 0.267
  • Accuracy: 0.882

Model description

It is an image classification model fine-tuned to predict whether an image contains text and, if so, whether that text is in Latin, Chinese, or Arabic script. Images without text are assigned to a separate non-text class.
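As a rough illustration of how such a classifier could be queried, here is a minimal sketch using the `image-classification` pipeline from the `transformers` library. The repository ID in `MODEL_ID` is a placeholder (this card does not state the full Hub namespace), and the helper names `top_prediction` and `classify_image` are hypothetical, introduced only for this example. The label strings returned depend on the model's configuration and are not listed here.

```python
from typing import Dict, List

# Assumption: placeholder repo ID. Replace "your-namespace" with the actual
# Hub namespace that hosts this model; the card does not state it.
MODEL_ID = "your-namespace/finetuned-MS-swin-doc-text-classifer"


def top_prediction(predictions: List[Dict]) -> Dict:
    """Return the entry with the highest score.

    `predictions` is expected in the pipeline's output shape:
    a list of {"label": str, "score": float} dictionaries.
    """
    return max(predictions, key=lambda p: p["score"])


def classify_image(image, model_id: str = MODEL_ID) -> Dict:
    """Classify one image given as a file path, URL, or PIL image.

    transformers is imported lazily so that this module can be loaded
    without downloading any model weights.
    """
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return top_prediction(classifier(image))
```

Calling `classify_image("scan.png")` (with a hypothetical local file) would download the weights on first use and return a single `{"label": ..., "score": ...}` dictionary. The pipeline already returns its predictions sorted by score, so `top_prediction` is mostly a safeguard that makes the selection explicit.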

Training and evaluation data

Dataset: ernie-ai/image-text-examples-ar-cn-latin-notext

Model Trained Using AutoTrain

  • Problem type: Multi-class Classification
  • Model ID: 3338392240
  • CO2 Emissions (in grams): 2.2267

Validation Metrics

  • Loss: 0.267
  • Accuracy: 0.882
  • Macro F1: 0.862
  • Micro F1: 0.882
  • Weighted F1: 0.880
  • Macro Precision: 0.877
  • Micro Precision: 0.882
  • Weighted Precision: 0.883
  • Macro Recall: 0.856
  • Micro Recall: 0.882
  • Weighted Recall: 0.882
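
To make the difference between the macro, micro, and weighted averages above concrete, the following sketch computes all three F1 variants from a confusion matrix in plain Python. The function name `f1_summaries` and the example matrix are hypothetical and are not this model's actual evaluation data. In single-label multi-class classification, micro F1 always equals accuracy, which is consistent with both being reported as 0.882 here.

```python
from typing import Dict, List


def f1_summaries(confusion: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and macro/micro/weighted F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    per_class_f1, supports = [], []
    tp_sum = fp_sum = fn_sum = 0

    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # true k, predicted other
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class_f1.append(f1)
        supports.append(tp + fn)
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn

    # Micro average: pool the counts over all classes, then compute F1 once.
    micro_p = tp_sum / (tp_sum + fp_sum)
    micro_r = tp_sum / (tp_sum + fn_sum)
    return {
        "accuracy": tp_sum / total,
        # Macro average: unweighted mean of per-class F1.
        "macro_f1": sum(per_class_f1) / n,
        "micro_f1": 2 * micro_p * micro_r / (micro_p + micro_r),
        # Weighted average: per-class F1 weighted by class support.
        "weighted_f1": sum(f * s for f, s in zip(per_class_f1, supports)) / total,
    }


# Hypothetical 4-class confusion matrix with imbalanced classes,
# made up for illustration only.
example = [
    [50, 0, 0, 0],
    [0, 30, 0, 0],
    [0, 0, 10, 0],
    [0, 0, 5, 5],
]
summary = f1_summaries(example)
```

In this made-up example the smallest classes are the ones confused with each other, so the macro F1 falls below the weighted and micro values. The reported metrics show the same ordering (macro 0.862, weighted 0.880, micro 0.882), which suggests, without proving it, that the less frequent classes are the harder ones for this model.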