Update README
README.md CHANGED
@@ -7,8 +7,51 @@ pipeline_tag: image-text-to-text
 library_name: transformers
 ---
 
-MinerU2.5 (Pre-release)
 
-
+# MinerU2.5
 
-
+We are releasing MinerU2.5, a 1.2B-parameter vision-language model specialized in OCR and document parsing, enabling more accurate and robust parsing of complex and diverse real-world documents.
+
+The model weights are stable and available for use; this release is primarily intended for internal development and demonstration purposes.
+
+> ⚠️ A full technical report, source code, and a comprehensive README will be released later this month. Stay tuned!
+
+## Quick Start
+For convenience, we provide a Python package named `mineru-vl-utils` that simplifies using the MinerU2.5 vision-language model. For more information and usage examples, please refer to [mineru-vl-utils](https://github.com/opendatalab/mineru-vl-utils/tree/main).
+
+Here is a simple example of using MinerU2.5 with 🤗 Transformers.
+
+### Install packages
+``` bash
+pip install "mineru-vl-utils[transformers]"
+```
+
+### Run with Transformers
+``` python
+from transformers import AutoProcessor, Qwen2VLForConditionalGeneration
+from PIL import Image
+from mineru_vl_utils import MinerUClient
+
+model_path = "opendatalab/MinerU2.5-2509-1.2B"
+
+model = Qwen2VLForConditionalGeneration.from_pretrained(
+    model_path,
+    dtype="auto",
+    device_map="auto"
+)
+
+processor = AutoProcessor.from_pretrained(
+    model_path,
+    use_fast=True
+)
+
+client = MinerUClient(
+    backend="transformers",
+    model=model,
+    processor=processor
+)
+
+image_path = '/path/to/your/image'
+image = Image.open(image_path)
+extracted_blocks = client.two_step_extract(image)
+```
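The Quick Start in this change ends at `extracted_blocks` and does not show how to consume the result. The following is a minimal sketch of one way to post-process it, not part of the commit. It assumes each extracted block behaves like a dict with `type`, `bbox`, and `content` keys; that shape is an assumption and should be checked against the `mineru-vl-utils` documentation. The `sample_blocks` data and the helper names `blocks_to_text` and `blocks_to_json` are hypothetical and exist only for illustration.

```python
import json

# Hypothetical stand-in for the value returned by
# client.two_step_extract(image). The real structure may differ;
# the dict keys used here are an assumption.
sample_blocks = [
    {"type": "title", "bbox": [0.1, 0.05, 0.9, 0.1], "content": "Quarterly Report"},
    {"type": "text", "bbox": [0.1, 0.15, 0.9, 0.3], "content": "Revenue grew in Q3."},
    {"type": "image", "bbox": [0.1, 0.35, 0.9, 0.6], "content": None},
]


def blocks_to_text(blocks):
    """Join the non-empty `content` fields, one block per paragraph."""
    parts = [block["content"] for block in blocks if block.get("content")]
    return "\n\n".join(parts)


def blocks_to_json(blocks):
    """Serialize blocks to a JSON string (assumes JSON-compatible values)."""
    return json.dumps(blocks, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    print(blocks_to_text(sample_blocks))
```

With a real model loaded, `sample_blocks` would be replaced by `extracted_blocks`, provided the returned objects match the assumed shape. Blocks without text content, such as the image block in the sample, are skipped by `blocks_to_text` but kept by `blocks_to_json`.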