DunnBC22/trocr-large-printed-e13b_tesseract_MICR_ocr

Image-Text-to-Text

vision-encoder-decoder

Generated from Trainer


DunnBC22 committed on Jul 28, 2023

Commit 980f798 · 1 Parent(s): 9e58f8a

Update README.md

Files changed (1)

README.md +19 -10


@@ -1,32 +1,41 @@
 ---
 tags:
 - generated_from_trainer
 model-index:
 - name: trocr-large-printed-e13b_tesseract_MICR_ocr
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # trocr-large-printed-e13b_tesseract_MICR_ocr
-This model is a fine-tuned version of [microsoft/trocr-large-printed](https://huggingface.co/microsoft/trocr-large-printed) on an unknown dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.2432
-- Cer: 0.0036
 ## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
@@ -43,7 +52,7 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Cer    |
 |:-------------:|:-----:|:----:|:---------------:|:------:|
 | 0.486         | 1.0   | 841  | 0.5168          | 0.0428 |
 | 0.2187        | 2.0   | 1682 | 0.2432          | 0.0036 |
@@ -54,4 +63,4 @@ The following hyperparameters were used during training:
 - Transformers 4.28.1
 - Pytorch 2.0.1
 - Datasets 2.13.1
-- Tokenizers 0.13.3

 ---
 tags:
 - generated_from_trainer
+- TrOCR
 model-index:
 - name: trocr-large-printed-e13b_tesseract_MICR_ocr
   results: []
+license: bsd-3-clause
+language:
+- en
+metrics:
+- cer
 ---
 # trocr-large-printed-e13b_tesseract_MICR_ocr
+This model is a fine-tuned version of [microsoft/trocr-large-printed](https://huggingface.co/microsoft/trocr-large-printed).
 It achieves the following results on the evaluation set:
 - Loss: 0.2432
+- CER: 0.0036
 ## Model description
+For more information on how it was created, check out the following link: https://github.com/DunnBC22/Vision_Audio_and_Multimodal_Projects/blob/main/Optical%20Character%20Recognition%20(OCR)/Tesseract%20MICR%20(E15B%20Dataset)/TrOCR-e13b%20-%20tesseractMICR.ipynb
 ## Intended uses & limitations
+This model is intended to demonstrate my ability to solve a complex problem using technology.
 ## Training and evaluation data
+Dataset Source: https://github.com/DoubangoTelecom/tesseractMICR/tree/master/datasets/e13b
+__Histogram of Label Character Lengths__
+![Histogram of Label Character Lengths](https://raw.githubusercontent.com/DunnBC22/Vision_Audio_and_Multimodal_Projects/main/Optical%20Character%20Recognition%20(OCR)/Tesseract%20MICR%20(E15B%20Dataset)/Images/Histogram%20of%20Label%20Character%20Length.png)
 ## Training procedure
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | CER    |
 |:-------------:|:-----:|:----:|:---------------:|:------:|
 | 0.486         | 1.0   | 841  | 0.5168          | 0.0428 |
 | 0.2187        | 2.0   | 1682 | 0.2432          | 0.0036 |
 - Transformers 4.28.1
 - Pytorch 2.0.1
 - Datasets 2.13.1
+- Tokenizers 0.13.3
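
The updated card reports a character error rate (CER) of 0.0036 but does not say how it was computed or how to run the model. The sketch below is one plausible reading, not the author's code. `character_error_rate` uses the common corpus-level definition (total edit distance divided by total reference characters), which is an assumption about how the reported figure was obtained. `transcribe` is a hypothetical helper that follows the standard TrOCR loading pattern in Transformers. It assumes the repository ships a processor config; if it does not, loading the processor from `microsoft/trocr-large-printed` may be needed instead. The digit strings in the usage note are made-up examples, not samples from the e13b dataset.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # drop a reference character
                current[j - 1] + 1,      # add a hypothesis character
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_error_rate(references, hypotheses) -> float:
    """Corpus-level CER (assumed definition): total edits / total reference chars."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    return total_edits / total_chars


def transcribe(
    image_path: str,
    model_id: str = "DunnBC22/trocr-large-printed-e13b_tesseract_MICR_ocr",
) -> str:
    """Hypothetical helper: run the fine-tuned checkpoint on one line image."""
    # Imported lazily so the CER helpers above work without these packages.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    # Assumption: the repo includes a processor config alongside the weights.
    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)

    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

For instance, `character_error_rate(["12345678"], ["12345670"])` gives 0.125, one wrong character out of eight. Under this definition, the reported CER of 0.0036 would correspond to roughly one character error in every 278 reference characters.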