Nayana-cognitivelab/Nayana_base_OCR

@@ -11,23 +11,109 @@ tags:
 license: apache-2.0
 ---
-Nayana_base_combined_v1
 ```python
 from transformers import AutoModel, AutoTokenizer
-from peft import PeftModel, PeftConfig, AutoPeftModelForCausalLM
-from transformers import AutoModelForCausalLM
 import torch
-tokenizer = AutoTokenizer.from_pretrained('v1v1d/Nayana_base_combined_lora_64', trust_remote_code=True , torch_dtype=torch.float16)
-model = AutoModel.from_pretrained('v1v1d/Nayana_base_combined_lora_64', trust_remote_code=True, low_cpu_mem_usage=True, device_map='cuda', use_safetensors=True, pad_token_id=tokenizer.eos_token_id , torch_dtype=torch.float16)
 model = model.eval().cuda()
 image_file = 'hindi.png'
-res = model.chat(tokenizer, image_file, ocr_type='ocr' , render=True, stream_flag = True)
-print(res)
-```

 license: apache-2.0
 ---
+# Nayana OCR (Alpha)
+Nayana OCR is a model fine-tuned for document-level Optical Character Recognition (OCR) across **10 Indian languages**:
+**Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, Telugu**,
+while retaining its OCR capabilities in **English** and **Chinese**.
+This model is built on the **GOT OCR** base and offers multilingual OCR, document rendering, and GPU inference.
+We are training an improved model on much more data. Follow us for updates.
+For more information, see [Cognitivelab](https://cognitivelab.in).
+---
+## Key Features
+- **Multilingual OCR**: Supports OCR for 10 Indian languages alongside English and Chinese.
+- **Document-Level OCR**: Designed for extracting text from complex document layouts.
+- **Streamlined Deployment**: Optimized for GPU usage with support for safetensors.
+- **Customizable OCR Type**: Switch between OCR modes and enable rendering.
+---
+## Installation
+To use Nayana OCR, ensure you have the following prerequisites installed:
+1. Python 3.8+
+2. PyTorch (with GPU support)
+3. Transformers library
+4. PEFT library
+Install the required libraries using:
+```bash
+pip install torch transformers peft
+```
+---
+## Usage Example
+Here's a quick example of how to use Nayana OCR for extracting text from an image:
 ```python
 from transformers import AutoModel, AutoTokenizer
+from peft import PeftModel
 import torch
+# Load tokenizer and model
+tokenizer = AutoTokenizer.from_pretrained(
+    'Nayana-cognitivelab/Nayana_base_OCR',
+    trust_remote_code=True,
+    torch_dtype=torch.float16
+)
+model = AutoModel.from_pretrained(
+    'Nayana-cognitivelab/Nayana_base_OCR',
+    trust_remote_code=True,
+    low_cpu_mem_usage=True,
+    device_map='cuda',
+    use_safetensors=True,
+    pad_token_id=tokenizer.eos_token_id,
+    torch_dtype=torch.float16
+)
+# Prepare the model for inference
 model = model.eval().cuda()
+# Perform OCR on an image
 image_file = 'hindi.png'
+result = model.chat(
+    tokenizer,
+    image_file,
+    ocr_type='ocr',
+    render=True,
+    stream_flag=True
+)
+print(result)
+```
+---
+## Parameters
+| Parameter     | Description                                          | Default |
+|---------------|------------------------------------------------------|---------|
+| `ocr_type`    | Specify the type of OCR to use (`'ocr'`)             | `'ocr'` |
+| `render`      | Enable rendering of the extracted text on the image. | `True`  |
+| `stream_flag` | Stream results for larger or multi-page documents.   | `True`  |
+---
+## Base Model
+This model is fine-tuned from the **GOT OCR** base model and builds on its vision-language capabilities for OCR.
+---
+## License
+This project is licensed under the **Apache 2.0 License**. See the [LICENSE](LICENSE) file for details.
+---
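
The usage example added in this diff runs OCR on a single image. The same call could be wrapped to process several images, as in the sketch below. It assumes that `model.chat` accepts the arguments shown in the README example and returns the recognized text as a string, and that `render` and `stream_flag` can be set to `False`. The helper name `ocr_images` is hypothetical and is not part of the model's API.

```python
from pathlib import Path
from typing import Any, Dict, Iterable, Union


def ocr_images(
    model: Any,
    tokenizer: Any,
    image_files: Iterable[Union[str, Path]],
    ocr_type: str = "ocr",
) -> Dict[str, str]:
    """Run model.chat on each image and collect the results by file path.

    Hypothetical helper: it assumes model.chat takes the arguments used in
    the README example and returns the recognized text.
    """
    results: Dict[str, str] = {}
    for image_file in image_files:
        # Rendering and streaming are switched off here (an assumption) so
        # that each call only returns text; the README example enables both.
        results[str(image_file)] = model.chat(
            tokenizer,
            str(image_file),
            ocr_type=ocr_type,
            render=False,
            stream_flag=False,
        )
    return results
```

With `model` and `tokenizer` loaded as in the README example, `ocr_images(model, tokenizer, ['hindi.png'])` would return a dictionary mapping `'hindi.png'` to whatever `model.chat` returns for that file.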