---
license: apache-2.0
language:
- zh
- en
pipeline_tag: image-to-text
tags:
- ocr
- captcha
---

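## Introduction

**The captcha recognition model (ocr-captcha)** is built specifically to recognize common captchas. Two trained models are available:

1. **small**: trained on 700 MB of data (about 84,000 captcha images) for 27 epochs, reaching a final accuracy of nearly 100%. **This is the recommended model to download.**

2. **big**: trained on 11 GB of data (about 1.35 million captcha images) for 1 epoch, reaching a final accuracy of about 93.95% (limited compute resources ruled out longer training).
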
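The card does not say how the accuracy figures were measured. As a minimal sketch, assuming accuracy means case-sensitive exact match over the whole captcha string, you could score either model on your own labeled images as follows. The function name `exact_match_accuracy` and the sample strings are hypothetical.

```python
def exact_match_accuracy(predictions, labels):
    """Fraction of captchas whose predicted string equals the label exactly."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    # Comparison is case-sensitive: "12b4" and "12B4" count as different.
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical predictions and ground-truth labels, for illustration only.
preds = ["a3Kd", "91827", "QwErTy", "12b4"]
truth = ["a3Kd", "91827", "QwErTy", "12B4"]
print(exact_match_accuracy(preds, truth))  # 3 of 4 match -> 0.75
```

Exact match is a strict metric: a single wrong character makes the whole captcha count as a miss, which is usually what matters when the output is submitted to a form.
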
## Data Distribution

1. **Type**: (1) digits only; (2) digits and letters; (3) letters only (upper and lower case)

2. **Length**: 4, 5, or 6 characters

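The distribution above suggests a cheap sanity check on model output: a prediction outside the trained character set or length range is probably wrong. The sketch below assumes "digits" and "letters" mean ASCII `0-9`, `A-Z`, and `a-z`, which the card does not state explicitly. The names `CAPTCHA_PATTERN`, `looks_like_captcha`, and `captcha_type` are hypothetical.

```python
import re

# 4 to 6 characters, each an ASCII digit or an upper/lower-case letter (assumed).
CAPTCHA_PATTERN = re.compile(r"[0-9A-Za-z]{4,6}")


def looks_like_captcha(text):
    """Return True if text fits the character set and lengths listed above."""
    return CAPTCHA_PATTERN.fullmatch(text) is not None


def captcha_type(text):
    """Classify a valid captcha string into one of the three listed types."""
    if not looks_like_captcha(text):
        return None
    if text.isdigit():
        return "digits"
    if text.isalpha():
        return "letters"
    return "digits+letters"


print(captcha_type("a1B2c3"))
```

A prediction for which `looks_like_captcha` returns `False` could be retried with the other model or flagged for manual review.
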
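## Fine-tuning

1. **Base model**: the base model is the one released by DAMO Academy, [读光-文字识别-行识别模型-中英-通用领域](https://www.modelscope.cn/models/damo/cv_convnextTiny_ocr-recognition-general_damo/summary) (a general-domain Chinese and English text line recognition model).

2. **For the fine-tuning procedure, see the link above.**
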
## Demo Link

modelscope: [Captcha recognition model (ocr-captcha)](https://modelscope.cn/studios/xiaolv/ocr/summary)

## Individual Model Links (modelscope)

1. **[Captcha recognition model (small)](https://modelscope.cn/models/xiaolv/ocr_small/summary)**

2. **[Captcha recognition model (big)](https://modelscope.cn/models/xiaolv/ocr_big/summary)**

## Quickstart

The code is also available as a web version: `myself_train_model.py`

```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
import gradio as gr  # used by the web UI in myself_train_model.py
import os


class xiaolv_ocr_model():

    def __init__(self):
        # Local directories holding the downloaded small and big models
        model_small = r"./output_small"
        model_big = r"./output_big"
        self.ocr_recognition_small = pipeline(Tasks.ocr_recognition, model=model_small)
        self.ocr_recognition1_big = pipeline(Tasks.ocr_recognition, model=model_big)

    def run(self, pict_path, moshi="small", context=None):
        # pict_path is an uploaded file object; its .name is the temp file path
        pict_path = pict_path.name
        context = [pict_path]

        # moshi ("mode") selects which model to run
        if moshi == "small":
            result = self.ocr_recognition_small(pict_path)
        else:
            result = self.ocr_recognition1_big(pict_path)

        context += [str(result['text'][0])]
        # Pair each image path with its recognized text
        responses = [(u, b) for u, b in zip(context[::2], context[1::2])]
        print(f"Recognition result: {result}")
        # Delete the uploaded temporary file once it has been processed
        os.remove(pict_path)
        return responses, context


if __name__ == "__main__":
    pict_path = r"C:\Users\admin\Desktop\图片识别测试\企业微信截图_16895911221007.png"
    ocr_model = xiaolv_ocr_model()
    # run() expects an object with a .name attribute and deletes that file
    # afterwards, so a plain path string cannot be passed as-is.
    # ocr_model.run(pict_path)
```
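
The class above is written for the web UI: it expects an uploaded file object and deletes the file after recognition. For recognizing a single image from a script, a smaller sketch may be enough. It assumes the small model has been downloaded to `./output_small` as in the Quickstart, and that the pipeline result is a dict whose `'text'` entry is a list, as the Quickstart's `result['text'][0]` implies. The helper names `extract_text` and `recognize` are hypothetical.

```python
def extract_text(result):
    """Pull the recognized string out of a pipeline result dict.

    The Quickstart reads result['text'][0]; a bare string is also accepted
    here as a defensive assumption.
    """
    text = result["text"]
    if isinstance(text, (list, tuple)):
        return str(text[0]) if text else ""
    return str(text)


def recognize(image_path, model_dir="./output_small"):
    """Run one captcha image through a locally stored model."""
    # Imported lazily so extract_text stays usable without modelscope installed.
    from modelscope.pipelines import pipeline
    from modelscope.utils.constant import Tasks

    ocr = pipeline(Tasks.ocr_recognition, model=model_dir)
    return extract_text(ocr(image_path))
```

Calling `recognize("captcha.png")` (a hypothetical file name) would return the predicted string and leave the image on disk. Building the pipeline on every call is slow, so for batches it is better to create it once and reuse it, as the class above does.
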

## Contact Us

To leave a message for our R&D and product teams, please email us at 2240560729@qq.com.