DeepMount00 committed on
Commit
6117df9
1 Parent(s): 8a83356

Update README.md

Files changed (1)
  1. README.md +0 -6
README.md CHANGED
@@ -13,12 +13,6 @@ This model represents the first version of an experimental sequence-to-sequence
  - **Primary Use**: This model is intended for use in processing and correcting Italian text that has been digitized using OCR technology. It is particularly useful for texts scanned at low quality, where the OCR's error rate is noticeably high.
  - **Users**: It is designed for developers, researchers, and archivists working with Italian historical documents, books, and any digitized material where OCR errors are prevalent.
 
- ## Training Data
- The model was trained on a diverse dataset of Italian texts, which includes a wide range of sources such as books, newspapers, and documents that have been digitized using various OCR systems. This dataset was specifically curated to include examples with common OCR errors observed in Italian texts, allowing the model to learn and correct these mistakes effectively.
-
- ## Model Architecture
- The model is based on a sequence-to-sequence framework, leveraging the latest advancements in natural language processing to understand and correct text at the character and word levels. It incorporates attention mechanisms to focus on error-prone areas in the text, ensuring high accuracy in the correction output.
-
  ## Limitations
  - While the model corrects approximately 93% of OCR errors, there may be certain types of errors or specific contexts where its performance could be lower.
  - The model is specifically trained on Italian text and may not perform well on texts in other languages or texts that include significant amounts of non-Italian languages.
 