pranshh commited on
Commit
070515b
β€’
1 Parent(s): d04279c

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +67 -11
README.md CHANGED
@@ -1,12 +1,68 @@
1
- ---
2
- title: Ocr Assignment
3
- emoji: πŸ‘€
4
- colorFrom: pink
5
- colorTo: indigo
6
- sdk: gradio
7
- sdk_version: 4.44.0
8
- app_file: app.py
9
- pinned: false
10
- ---
11
-
12
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
2
+
3
+ # Hindi & English OCR with Keyword Search
4
+
5
+ This project implements a web-based prototype for Optical Character Recognition (OCR) on images containing text in both Hindi and English. It also includes a basic keyword search functionality based on the extracted text.
6
+
7
+ ## Features
8
+
9
+ - Upload and process images containing Hindi and English text
10
+ - Extract text from images using OCR
11
+ - Perform keyword search on the extracted text
12
+ - Web-based interface for easy interaction
13
+
14
+ ## Technology Stack
15
+
16
+ - Python
17
+ - Hugging Face Transformers (Qwen2-VL-2B-Instruct model)
18
+ - PyTorch
19
+ - Gradio (for web interface)
20
+
21
+ ## Setup and Installation
22
+
23
+ 1. Clone the repository:
24
+ ```
25
+ git clone [your-repo-url]
26
+ cd [your-repo-name]
27
+ ```
28
+
29
+ 2. Install the required dependencies:
30
+ ```
31
+ pip install transformers torch gradio Pillow
32
+ ```
33
+
34
+ 3. Download the Qwen2-VL-2B-Instruct model:
35
+ The model will be automatically downloaded when you run the application for the first time.
36
+
37
+ ## Usage
38
+
39
+ 1. Run the application:
40
+ ```
41
+ python app.py
42
+ ```
43
+
44
+ 2. Open the provided URL in your web browser.
45
+
46
+ 3. Upload an image containing Hindi and/or English text.
47
+
48
+ 4. (Optional) Enter a keyword to search within the extracted text.
49
+
50
+ 5. View the OCR results and any keyword matches.
51
+
52
+ ## Limitations
53
+
54
+ - The current implementation uses CPU for processing, which may be slower for large images.
55
+
56
+ ## Future Improvements
57
+
58
+ - Implement GPU support for faster processing
59
+ - Add support for multiple image uploads
60
+ - Enhance the user interface for better user experience
61
+
62
+ ## Link
63
+
64
+ https://huggingface.co/spaces/pranshh/ocr-assignment
65
+
66
+ ## Acknowledgements
67
+
68
+ This project uses the Qwen2-VL-2B-Instruct model from Hugging Face Transformers.