Commit · 9cba22c
1 Parent(s): 98d3e21
Update readme and add license (#1018)
### What problem does this PR solve?
- Update readme
- Add license
### Type of change
- [x] Documentation Update
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
- README.md +2 -1
- README_ja.md +14 -12
- README_zh.md +14 -10
- deepdoc/parser/__init__.py +12 -1
- deepdoc/parser/docx_parser.py +13 -1
- deepdoc/parser/excel_parser.py +13 -1
- deepdoc/parser/pdf_parser.py +13 -1
- deepdoc/parser/ppt_parser.py +1 -0
- deepdoc/parser/resume/__init__.py +13 -0
- deepdoc/parser/resume/entities/corporations.py +13 -0
- deepdoc/parser/resume/entities/degrees.py +13 -0
- deepdoc/parser/resume/entities/industries.py +12 -0
- deepdoc/parser/resume/entities/regions.py +13 -0
- deepdoc/parser/resume/entities/schools.py +13 -1
- deepdoc/parser/resume/step_one.py +13 -1
- deepdoc/parser/resume/step_two.py +13 -1
- deepdoc/vision/__init__.py +13 -0
- deepdoc/vision/postprocess.py +13 -0
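Most of the `deepdoc` changes listed above add the same Apache 2.0 header to the top of each module. As a rough illustration only, that kind of change could be scripted along these lines. The helper name `ensure_license_header` is hypothetical and not part of this commit, the header text is copied from the diff as it appears here (the original spacing may differ), and the sketch does not handle shebang or encoding lines.

```python
# Hypothetical helper, not part of the commit: prepend the Apache 2.0 header
# shown in this diff to Python source text unless it is already there.
LICENSE_LINES = [
    '# Licensed under the Apache License, Version 2.0 (the "License");',
    "# you may not use this file except in compliance with the License.",
    "# You may obtain a copy of the License at",
    "#",
    "# http://www.apache.org/licenses/LICENSE-2.0",
    "#",
    "# Unless required by applicable law or agreed to in writing, software",
    '# distributed under the License is distributed on an "AS IS" BASIS,',
    "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.",
    "# See the License for the specific language governing permissions and",
    "# limitations under the License.",
    "#",
]
# Header block followed by one blank line before the module body.
LICENSE_HEADER = "\n".join(LICENSE_LINES) + "\n\n"


def ensure_license_header(source: str) -> str:
    """Return `source` with the header prepended, or unchanged if present."""
    if source.startswith(LICENSE_LINES[0]):
        return source
    # Drop leading blank lines so exactly one blank line follows the header.
    return LICENSE_HEADER + source.lstrip("\n")
```

Calling the function on text that already begins with the header returns it unchanged, so the sketch is safe to run repeatedly over the same files.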
README.md
CHANGED
@@ -180,7 +180,7 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
 > With default settings, you only need to enter `http://IP_OF_YOUR_MACHINE` (**sans** port number) as the default HTTP serving port `80` can be omitted when using the default configurations.
 6. In [service_conf.yaml](./docker/service_conf.yaml), select the desired LLM factory in `user_default_llm` and update the `API_KEY` field with the corresponding API key.
 
-> See [
+> See [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup) for more information.
 
 _The show is now on!_
 
@@ -326,6 +326,7 @@ See the [RAGFlow Roadmap 2024](https://github.com/infiniflow/ragflow/issues/162)
 
 - [Discord](https://discord.gg/4XxujFgUN7)
 - [Twitter](https://twitter.com/infiniflowai)
+- [GitHub Discussions](https://github.com/orgs/infiniflow/discussions)
 
 ## 🙌 Contributing
 
README_ja.md
CHANGED
@@ -24,6 +24,14 @@
 </a>
 </p>
 
+<h4 align="center">
+<a href="https://ragflow.io/docs/dev/">Document</a> |
+<a href="https://github.com/infiniflow/ragflow/issues/162">Roadmap</a> |
+<a href="https://twitter.com/infiniflowai">Twitter</a> |
+<a href="https://discord.gg/jEfRUwEYEV">Discord</a> |
+<a href="https://demo.ragflow.io">Demo</a>
+</h4>
+
 ## 💡 RAGFlow とは?
 
 [RAGFlow](https://ragflow.io/) は、深い文書理解に基づいたオープンソースの RAG (Retrieval-Augmented Generation) エンジンである。LLM(大規模言語モデル)を組み合わせることで、様々な複雑なフォーマットのデータから根拠のある引用に裏打ちされた、信頼できる質問応答機能を実現し、あらゆる規模のビジネスに適した RAG ワークフローを提供します。
@@ -40,15 +48,6 @@
 - 2024-05-21 ストリーミング出力とテキストチャンク取得APIをサポート。
 - 2024-05-15 OpenAI GPT-4oを統合しました。
 - 2024-05-08 LLM DeepSeek-V2を統合しました。
-- 2024-04-26 「ファイル管理」機能を追加しました。
-- 2024-04-19 会話 API をサポートします ([詳細](./docs/references/api.md))。
-- 2024-04-16 [BCEmbedding](https://github.com/netease-youdao/BCEmbedding) から埋め込みモデル「bce-embedding-base_v1」を追加します。
-- 2024-04-16 [FastEmbed](https://github.com/qdrant/fastembed) は、軽量かつ高速な埋め込み用に設計されています。
-- 2024-04-11 ローカル LLM デプロイメント用に [Xinference](./docs/guides/deploy_local_llm.md) をサポートします。
-- 2024-04-10 メソッド「Laws」に新しいレイアウト認識モデルを追加します。
-- 2024-04-08 [Ollama](./docs/guides/deploy_local_llm.md) を使用した大規模モデルのローカライズされたデプロイメントをサポートします。
-- 2024-04-07 中国語インターフェースをサポートします。
-
 
 ## 🌟 主な特徴
 
@@ -162,7 +161,7 @@
 > デフォルトの設定を使用する場合、デフォルトの HTTP サービングポート `80` は省略できるので、与えられたシナリオでは、`http://IP_OF_YOUR_MACHINE`(ポート番号は省略)だけを入力すればよい。
 6. [service_conf.yaml](./docker/service_conf.yaml) で、`user_default_llm` で希望の LLM ファクトリを選択し、`API_KEY` フィールドを対応する API キーで更新する。
 
-> 詳しくは [
+> 詳しくは [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup) を参照してください。
 
 _これで初期設定完了!ショーの開幕です!_
 
@@ -261,8 +260,10 @@ $ bash ./entrypoint.sh
 
 ## 📚 ドキュメンテーション
 
-- [Quickstart](
-- [
+- [Quickstart](https://ragflow.io/docs/dev/)
+- [User guide](https://ragflow.io/docs/dev/category/user-guides)
+- [Reference](https://ragflow.io/docs/dev/category/references)
+- [FAQ](https://ragflow.io/docs/dev/faq)
 
 ## 📜 ロードマップ
 
@@ -272,6 +273,7 @@ $ bash ./entrypoint.sh
 
 - [Discord](https://discord.gg/4XxujFgUN7)
 - [Twitter](https://twitter.com/infiniflowai)
+- [GitHub Discussions](https://github.com/orgs/infiniflow/discussions)
 
 ## 🙌 コントリビュート
 
README_zh.md
CHANGED
@@ -23,6 +23,14 @@
 </a>
 </p>
 
+<h4 align="center">
+<a href="https://ragflow.io/docs/dev/">Document</a> |
+<a href="https://github.com/infiniflow/ragflow/issues/162">Roadmap</a> |
+<a href="https://twitter.com/infiniflowai">Twitter</a> |
+<a href="https://discord.gg/jEfRUwEYEV">Discord</a> |
+<a href="https://demo.ragflow.io">Demo</a>
+</h4>
+
 ## 💡 RAGFlow 是什么?
 
 [RAGFlow](https://ragflow.io/) 是一款基于深度文档理解构建的开源 RAG(Retrieval-Augmented Generation)引擎。RAGFlow 可以为各种规模的企业及个人提供一套精简的 RAG 工作流程,结合大语言模型(LLM)针对用户各类不同的复杂格式数据提供可靠的问答以及有理有据的引用。
@@ -39,13 +47,6 @@
 - 2024-05-21 支持流式结果输出和文本块获取API。
 - 2024-05-15 集成大模型 OpenAI GPT-4o。
 - 2024-05-08 集成大模型 DeepSeek。
-- 2024-04-26 增添了'文件管理'功能。
-- 2024-04-19 支持对话 API ([更多](./docs/references/api.md))。
-- 2024-04-16 集成嵌入模型 [BCEmbedding](https://github.com/netease-youdao/BCEmbedding) 和 专为轻型和高速嵌入而设计的 [FastEmbed](https://github.com/qdrant/fastembed)。
-- 2024-04-11 支持用 [Xinference](./docs/guides/deploy_local_llm.md) 本地化部署大模型。
-- 2024-04-10 为‘Laws’版面分析增加了底层模型。
-- 2024-04-08 支持用 [Ollama](./docs/guides/deploy_local_llm.md) 本地化部署大模型。
-- 2024-04-07 支持中文界面。
 
 ## 🌟 主要功能
 
@@ -159,7 +160,7 @@
 > 上面这个例子中,您只需输入 http://IP_OF_YOUR_MACHINE 即可:未改动过配置则无需输入端口(默认的 HTTP 服务端口 80)。
 6. 在 [service_conf.yaml](./docker/service_conf.yaml) 文件的 `user_default_llm` 栏配置 LLM factory,并在 `API_KEY` 栏填写和你选择的大模型相对应的 API key。
 
-> 详见 [
+> 详见 [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup)。
 
 _好戏开始,接着奏乐接着舞!_
 
@@ -280,7 +281,9 @@ $ systemctl start nginx
 ## 📚 技术文档
 
-- [Quickstart](
-- [
+- [Quickstart](https://ragflow.io/docs/dev/)
+- [User guide](https://ragflow.io/docs/dev/category/user-guides)
+- [Reference](https://ragflow.io/docs/dev/category/references)
+- [FAQ](https://ragflow.io/docs/dev/faq)
 
 ## 📜 路线图
 
@@ -290,6 +293,7 @@ $ systemctl start nginx
 
 - [Discord](https://discord.gg/4XxujFgUN7)
 - [Twitter](https://twitter.com/infiniflowai)
+- [GitHub Discussions](https://github.com/orgs/infiniflow/discussions)
 
 ## 🙌 贡献指南
 
deepdoc/parser/__init__.py
CHANGED
@@ -1,4 +1,15 @@
-
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
 
 from .pdf_parser import RAGFlowPdfParser as PdfParser, PlainParser
 from .docx_parser import RAGFlowDocxParser as DocxParser
deepdoc/parser/docx_parser.py
CHANGED
@@ -1,4 +1,16 @@
-#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
 from docx import Document
 import re
 import pandas as pd
deepdoc/parser/excel_parser.py
CHANGED
@@ -1,4 +1,16 @@
-#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
 from openpyxl import load_workbook
 import sys
 from io import BytesIO
deepdoc/parser/pdf_parser.py
CHANGED
@@ -1,4 +1,16 @@
-#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
 import os
 import random
 
deepdoc/parser/ppt_parser.py
CHANGED
@@ -10,6 +10,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
+
 from io import BytesIO
 from pptx import Presentation
 
deepdoc/parser/resume/__init__.py
CHANGED
@@ -1,3 +1,16 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
 import datetime
 
 
deepdoc/parser/resume/entities/corporations.py
CHANGED
@@ -1,3 +1,16 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
 import re,json,os
 import pandas as pd
 from rag.nlp import rag_tokenizer
deepdoc/parser/resume/entities/degrees.py
CHANGED
@@ -1,3 +1,16 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
 TBL = {"94":"EMBA",
 "6":"MBA",
 "95":"MPA",
deepdoc/parser/resume/entities/industries.py
CHANGED
@@ -1,3 +1,15 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
 
 TBL = {"1":{"name":"IT/通信/电子","parent":"0"},
 "2":{"name":"互联网","parent":"0"},
deepdoc/parser/resume/entities/regions.py
CHANGED
@@ -1,3 +1,16 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
 TBL = {
 "2":{"name":"北京","parent":"1"},
 "3":{"name":"天津","parent":"1"},
deepdoc/parser/resume/entities/schools.py
CHANGED
@@ -1,4 +1,16 @@
-#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
 import os, json,re,copy
 import pandas as pd
 current_file_path = os.path.dirname(os.path.abspath(__file__))
deepdoc/parser/resume/step_one.py
CHANGED
@@ -1,4 +1,16 @@
-#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
 import json
 from deepdoc.parser.resume.entities import degrees, regions, industries
 
deepdoc/parser/resume/step_two.py
CHANGED
@@ -1,4 +1,16 @@
-#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
 import re, copy, time, datetime, demjson3, \
 traceback, signal
 import numpy as np
deepdoc/vision/__init__.py
CHANGED
@@ -1,3 +1,16 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
 import pdfplumber
 
 from .ocr import OCR
deepdoc/vision/postprocess.py
CHANGED
@@ -1,3 +1,16 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
 import copy
 import re
 import numpy as np