qingxu98 committed
Commit a1fe67d
1 Parent(s): cbee909
README.md CHANGED
@@ -39,7 +39,7 @@ To translate this project to an arbitrary language with GPT, read and run `multi_language.py`
 
 Feature (⭐ = recently added) | Description
 --- | ---
-⭐[Access new models](https://github.com/binary-husky/gpt_academic/wiki/%E5%A6%82%E4%BD%95%E5%88%87%E6%8D%A2%E6%A8%A1%E5%9E%8B)! | ⭐Alibaba DAMO Academy [Qwen (通义千问)](https://modelscope.cn/models/qwen/Qwen-7B-Chat/summary), Shanghai AI-Lab [InternLM (书生)](https://github.com/InternLM/InternLM), iFlytek [Spark (星火)](https://xinghuo.xfyun.cn/)
+⭐[Access new models](https://github.com/binary-husky/gpt_academic/wiki/%E5%A6%82%E4%BD%95%E5%88%87%E6%8D%A2%E6%A8%A1%E5%9E%8B)! | Baidu [Qianfan](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/Nlks5zkzu) and ERNIE Bot (文心一言), [Qwen (通义千问)](https://modelscope.cn/models/qwen/Qwen-7B-Chat/summary), Shanghai AI-Lab [InternLM (书生)](https://github.com/InternLM/InternLM), iFlytek [Spark (星火)](https://xinghuo.xfyun.cn/), [LLaMa2](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
 One-click polishing | One-click polishing and grammar checking of academic papers
 One-click Chinese-English translation | One-click translation between Chinese and English
 One-click code explanation | Display, explain, generate, and annotate code
@@ -178,7 +178,7 @@ docker run --rm -it -e WEB_PORT=50923 -p 50923:50923 gpt-academic
 ```
 P.S. For plugin features that depend on Latex, see the Wiki. Alternatively, use docker-compose directly for Latex support (edit docker-compose.yml: keep scheme 4 and delete the other schemes).
 
-2. ChatGPT + ChatGLM2 + MOSS (requires familiarity with Docker)
+2. ChatGPT + ChatGLM2 + MOSS + LLAMA2 + Qwen (requires familiarity with the [Nvidia Docker](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installing-on-ubuntu-and-debian) runtime)
 [![chatglm](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-chatglm.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-chatglm.yml)
 
 ``` sh
@@ -186,7 +186,7 @@ P.S. For plugin features that depend on Latex, see the Wiki
 docker-compose up
 ```
 
-3. ChatGPT + LLAMA + PanGu (盘古) + RWKV (requires familiarity with Docker)
+3. ChatGPT + LLAMA + PanGu (盘古) + RWKV (requires familiarity with the [Nvidia Docker](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installing-on-ubuntu-and-debian) runtime)
 [![jittorllms](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-jittorllms.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-jittorllms.yml)
 
 ``` sh
@@ -313,6 +313,8 @@ Tip: clicking `载入对话历史存档` without specifying a file shows the cached history html archives
 
 ### II: Versions:
 - version 3.5 (Todo): Invoke all of this project's function plugins via natural language (high priority)
+- version 3.49: Support the Baidu Qianfan platform and ERNIE Bot (文心一言)
+- version 3.48: Support Alibaba DAMO Academy Qwen, Shanghai AI-Lab InternLM, and iFlytek Spark
 - version 3.46: Real-time voice conversation with fully hands-free operation
 - version 3.45: Support custom fine-tuned ChatGLM2 models
 - version 3.44: Official Azure support; improved UI usability
app.py CHANGED
@@ -4,7 +4,7 @@ def main():
     import subprocess, sys
     subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'gradio-stable-fork'])
     import gradio as gr
-    if gr.__version__ not in ['3.28.3','3.32.3']: assert False, "需要特殊依赖,请务必用 pip install -r requirements.txt 指令安装依赖,详情信息见requirements.txt"
+    if gr.__version__ not in ['3.28.3','3.32.2']: assert False, "需要特殊依赖,请务必用 pip install -r requirements.txt 指令安装依赖,详情信息见requirements.txt"
     from request_llm.bridge_all import predict
     from toolbox import format_io, find_free_port, on_file_uploaded, on_report_generated, get_conf, ArgsGeneralWrapper, load_chat_cookies, DummyWith
     # It is recommended to copy config_private.py for your secrets (API keys, proxy URLs), to avoid accidentally pushing them to GitHub
config.py CHANGED
@@ -11,7 +11,11 @@
 API_KEY = "此处填API密钥"    # Multiple API keys may be provided, separated by English commas, e.g. API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey3,azure-apikey4"
 
 
-# [step 2]>> Set to True to enable the proxy; leave unchanged when deploying directly on an overseas server
+# [step 1]>> API_KEY = "sk-123456789xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx123456789". In rare cases you also need to fill in an organization (format like org-123456789abcdefghijklmno); scroll down to the API_ORG setting
+API_KEY = "此处填API密钥"    # Multiple API keys may be provided, separated by English commas, e.g. API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey3,azure-apikey4"
+
+
+# [step 2]>> Set to True to enable the proxy; leave unchanged when deploying directly on an overseas server, and no change is needed either when using local or region-unrestricted models
 USE_PROXY = False
 if USE_PROXY:
     """
@@ -69,7 +73,7 @@ MAX_RETRY = 2
 
 # OpenAI model selection (GPT-4 is currently only open to approved applicants)
 LLM_MODEL = "gpt-3.5-turbo" # "chatglm" is also an option
-AVAIL_LLM_MODELS = ["newbing-free", "gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "api2d-gpt-3.5-turbo"]
+AVAIL_LLM_MODELS = ["gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "api2d-gpt-3.5-turbo", "spark", "azure-gpt-3.5"]
 
 # ChatGLM(2) Finetune Model Path (to use a fine-tuned ChatGLM2 model, add "chatglmft" to AVAIL_LLM_MODELS)
 ChatGLM_PTUNING_CHECKPOINT = "" # e.g. "/home/hmp/ChatGLM2-6B/ptuning/output/6b-pt-128-1e-2/checkpoint-100"
@@ -147,3 +151,70 @@ ANTHROPIC_API_KEY = ""
 
 # Custom API key format
 CUSTOM_API_KEY_PATTERN = ""
+
+
+# HuggingFace token, used when downloading LLaMA: https://huggingface.co/docs/hub/security-tokens
+HUGGINGFACE_ACCESS_TOKEN = "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"
+
+
+# GROBID server addresses (provide several for load balancing), used for high-quality PDF parsing
+# How to obtain: duplicate the space https://huggingface.co/spaces/qingxu98/grobid, set it to public, then GROBID_URL = "https://(your hf username, e.g. qingxu98)-(your space name, e.g. grobid).hf.space"
+GROBID_URLS = [
+    "https://qingxu98-grobid.hf.space","https://qingxu98-grobid2.hf.space","https://qingxu98-grobid3.hf.space",
+    "https://shaocongma-grobid.hf.space","https://FBR123-grobid.hf.space",
+]
+
+
+
+"""
+Online LLM configuration dependency diagram
+
+├── "gpt-3.5-turbo" and other OpenAI models
+│   ├── API_KEY
+│   ├── CUSTOM_API_KEY_PATTERN (rarely used)
+│   ├── API_ORG (rarely used)
+│   └── API_URL_REDIRECT (rarely used)
+
+├── "azure-gpt-3.5" and other Azure models
+│   ├── API_KEY
+│   ├── AZURE_ENDPOINT
+│   ├── AZURE_API_KEY
+│   ├── AZURE_ENGINE
+│   └── API_URL_REDIRECT
+
+├── "spark" iFlytek Spark cognitive model
+│   ├── XFYUN_APPID
+│   ├── XFYUN_API_SECRET
+│   └── XFYUN_API_KEY
+
+├── "claude-1-100k" and other Claude models
+│   └── ANTHROPIC_API_KEY
+
+├── "stack-claude"
+│   ├── SLACK_CLAUDE_BOT_ID
+│   └── SLACK_CLAUDE_USER_TOKEN
+
+├── "qianfan" Baidu Qianfan model library
+│   ├── BAIDU_CLOUD_QIANFAN_MODEL
+│   ├── BAIDU_CLOUD_API_KEY
+│   └── BAIDU_CLOUD_SECRET_KEY
+
+├── "newbing" The Newbing API is no longer stable and is not recommended
+    ├── NEWBING_STYLE
+    └── NEWBING_COOKIES
+
+
+
+Plugin online-service configuration dependency diagram
+
+├── Voice features
+│   ├── ENABLE_AUDIO
+│   ├── ALIYUN_TOKEN
+│   ├── ALIYUN_APPKEY
+│   ├── ALIYUN_ACCESSKEY
+│   └── ALIYUN_SECRET
+
+├── High-accuracy PDF parsing
+│   └── GROBID_URLS
+
+"""
crazy_functional.py CHANGED
@@ -24,6 +24,7 @@ def get_crazy_functions():
     from crazy_functions.对话历史存档 import 对话历史存档
     from crazy_functions.对话历史存档 import 载入对话历史存档
     from crazy_functions.对话历史存档 import 删除所有本地对话历史记录
+    from crazy_functions.辅助功能 import 清除缓存
 
     from crazy_functions.批量Markdown翻译 import Markdown英译中
     function_plugins = {
@@ -40,7 +41,12 @@ def get_crazy_functions():
             "AsButton":False,
             "Function": HotReload(删除所有本地对话历史记录)
         },
-        "[测试功能] 解析Jupyter Notebook文件": {
+        "清除所有缓存文件(请谨慎操作)": {
+            "Color": "stop",
+            "AsButton": False,  # added to the drop-down menu
+            "Function": HotReload(清除缓存)
+        },
+        "解析Jupyter Notebook文件": {
             "Color": "stop",
             "AsButton":False,
             "Function": HotReload(解析ipynb文件),
@@ -328,7 +334,7 @@ def get_crazy_functions():
     try:
         from crazy_functions.Langchain知识库 import 知识库问答
         function_plugins.update({
-            "[功能尚不稳定] 构建知识库(请先上传文件素材)": {
+            "构建知识库(请先上传文件素材)": {
                 "Color": "stop",
                 "AsButton": False,
                 "AdvancedArgs": True,
@@ -342,7 +348,7 @@ def get_crazy_functions():
     try:
         from crazy_functions.Langchain知识库 import 读取知识库作答
         function_plugins.update({
-            "[功能尚不稳定] 知识库问答": {
+            "知识库问答": {
                 "Color": "stop",
                 "AsButton": False,
                 "AdvancedArgs": True,
@@ -352,6 +358,32 @@ def get_crazy_functions():
         })
     except:
         print('Load function plugin failed')
+
+    try:
+        from crazy_functions.交互功能函数模板 import 交互功能模板函数
+        function_plugins.update({
+            "交互功能模板函数": {
+                "Color": "stop",
+                "AsButton": False,
+                "Function": HotReload(交互功能模板函数)
+            }
+        })
+    except:
+        print('Load function plugin failed')
+
+    # try:
+    #     from crazy_functions.chatglm微调工具 import 微调数据集生成
+    #     function_plugins.update({
+    #         "黑盒模型学习: 微调数据集生成 (先上传数据集)": {
+    #             "Color": "stop",
+    #             "AsButton": False,
+    #             "AdvancedArgs": True,
+    #             "ArgsReminder": "针对数据集输入(如 绿帽子*深蓝色衬衫*黑色运动裤)给出指令,例如您可以将以下命令复制到下方: --llm_to_learn=azure-gpt-3.5 --prompt_prefix='根据下面的服装类型提示,想象一个穿着者,对这个人外貌、身处的环境、内心世界、过去经历进行描写。要求:100字以内,用第二人称。' --system_prompt=''",
+    #             "Function": HotReload(微调数据集生成)
+    #         }
+    #     })
+    # except:
+    #     print('Load function plugin failed')
 
     try:
         from crazy_functions.Latex输出PDF结果 import Latex英文纠错加PDF对比
@@ -366,7 +398,7 @@ def get_crazy_functions():
         })
         from crazy_functions.Latex输出PDF结果 import Latex翻译中文并重新编译PDF
         function_plugins.update({
-            "Arixv翻译(输入arxivID)[需Latex]": {
+            "Arixv论文精细翻译(输入arxivID)[需Latex]": {
                 "Color": "stop",
                 "AsButton": False,
                 "AdvancedArgs": True,
@@ -377,7 +409,7 @@ def get_crazy_functions():
         }
         })
         function_plugins.update({
-            "本地论文翻译(上传Latex压缩包)[需Latex]": {
+            "本地Latex论文精细翻译(上传Latex项目)[需Latex]": {
                 "Color": "stop",
                 "AsButton": False,
                 "AdvancedArgs": True,
crazy_functions/latex_fns/latex_toolbox.py CHANGED
@@ -281,9 +281,12 @@ def rm_comments(main_file):
 def find_tex_file_ignore_case(fp):
     dir_name = os.path.dirname(fp)
     base_name = os.path.basename(fp)
+    # If the given file path is already correct
+    if os.path.exists(pj(dir_name, base_name)): return pj(dir_name, base_name)
+    # If not, try appending a .tex suffix
     if not base_name.endswith('.tex'): base_name+='.tex'
     if os.path.exists(pj(dir_name, base_name)): return pj(dir_name, base_name)
-    # go case in-sensitive
+    # If it still cannot be found, drop the case restriction and try again
     import glob
     for f in glob.glob(dir_name+'/*.tex'):
         base_name_s = os.path.basename(fp)
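The lookup now cascades through three attempts: the exact path, the path with a `.tex` suffix, then a case-insensitive scan of the directory. Expected behavior, sketched with hypothetical paths:

```python
# assuming the directory /proj contains only main.tex
find_tex_file_ignore_case('/proj/main.tex')  # exact path hit
find_tex_file_ignore_case('/proj/main')      # found after appending .tex
find_tex_file_ignore_case('/proj/MAIN.tex')  # found by the case-insensitive glob
```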
crazy_functions/pdf_fns/parse_pdf.py ADDED
@@ -0,0 +1,25 @@
+import requests
+import random
+from functools import lru_cache
+class GROBID_OFFLINE_EXCEPTION(Exception): pass
+
+def get_avail_grobid_url():
+    from toolbox import get_conf
+    GROBID_URLS, = get_conf('GROBID_URLS')
+    if len(GROBID_URLS) == 0: return None
+    try:
+        _grobid_url = random.choice(GROBID_URLS) # random load balancing
+        if _grobid_url.endswith('/'): _grobid_url = _grobid_url.rstrip('/')
+        res = requests.get(_grobid_url+'/api/isalive')
+        if res.text=='true': return _grobid_url
+        else: return None
+    except:
+        return None
+
+@lru_cache(maxsize=32)
+def parse_pdf(pdf_path, grobid_url):
+    import scipdf # pip install scipdf_parser
+    if grobid_url.endswith('/'): grobid_url = grobid_url.rstrip('/')
+    article_dict = scipdf.parse_pdf_to_dict(pdf_path, grobid_url=grobid_url)
+    return article_dict
+
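Together these helpers give the PDF plugin a fail-soft entry point: probe a randomly chosen GROBID mirror, and only parse when one answers. A minimal usage sketch (the PDF path is illustrative; `title` and `sections` are keys scipdf returns, as used by the plugin below):

```python
from crazy_functions.pdf_fns.parse_pdf import get_avail_grobid_url, parse_pdf

grobid_url = get_avail_grobid_url()              # None when every mirror is down
if grobid_url is not None:
    article = parse_pdf('paper.pdf', grobid_url) # repeated calls hit the lru_cache
    print(article['title'], len(article['sections']))
else:
    print('GROBID unavailable; fall back to the legacy PDF reader')
```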
crazy_functions/批量翻译PDF文档_多线程.py CHANGED
@@ -1,15 +1,19 @@
 from toolbox import CatchException, report_execption, write_results_to_file
-from toolbox import update_ui, promote_file_to_downloadzone
+from toolbox import update_ui, promote_file_to_downloadzone, update_ui_lastest_msg, disable_auto_promotion
+from toolbox import write_history_to_file, get_log_folder
 from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
 from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
 from .crazy_utils import read_and_clean_pdf_text
+from .pdf_fns.parse_pdf import parse_pdf, get_avail_grobid_url
 from colorful import *
+import glob
+import os
+import math
 
 @CatchException
-def 批量翻译PDF文档(txt, llm_kwargs, plugin_kwargs, chatbot, history, sys_prompt, web_port):
-    import glob
-    import os
+def 批量翻译PDF文档(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
+
+    disable_auto_promotion(chatbot)
     # Basic information: feature, contributors
     chatbot.append([
         "函数插件功能?",
@@ -30,20 +34,11 @@ def 批量翻译PDF文档(txt, llm_kwargs, plugin_kwargs, chatbot, history, sys_prompt, web_port):
     # Clear history to prevent input overflow
     history = []
 
+    from .crazy_utils import get_files_from_everything
+    success, file_manifest, project_folder = get_files_from_everything(txt, type='.pdf')
     # Check the input arguments; exit directly if none were given
-    if os.path.exists(txt):
-        project_folder = txt
-    else:
-        if txt == "":
-            txt = '空空如也的输入栏'
-        report_execption(chatbot, history,
-                         a=f"解析项目: {txt}", b=f"找不到本地项目或无权访问: {txt}")
-        yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
-        return
-
-    # Build the manifest of files to process
-    file_manifest = [f for f in glob.glob(
-        f'{project_folder}/**/*.pdf', recursive=True)]
+    if not success:
+        if txt == "": txt = '空空如也的输入栏'
 
     # If no files were found
     if len(file_manifest) == 0:
@@ -53,22 +48,130 @@ def 批量翻译PDF文档(txt, llm_kwargs, plugin_kwargs, chatbot, history, sys_prompt, web_port):
         return
 
     # Start the actual task
-    yield from 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, sys_prompt)
+    grobid_url = get_avail_grobid_url()
+    if grobid_url is not None:
+        yield from 解析PDF_基于GROBID(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, grobid_url)
+    else:
+        yield from update_ui_lastest_msg("GROBID服务不可用,请检查config中的GROBID_URL。作为替代,现在将执行效果稍差的旧版代码。", chatbot, history, delay=3)
+        yield from 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt)
 
 
-def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, sys_prompt):
-    import os
+def 解析PDF_基于GROBID(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, grobid_url):
     import copy
     import tiktoken
     TOKEN_LIMIT_PER_FRAGMENT = 1280
     generated_conclusion_files = []
     generated_html_files = []
+    DST_LANG = "中文"
     for index, fp in enumerate(file_manifest):
+        chatbot.append(["当前进度:", f"正在连接GROBID服务,请稍候: {grobid_url}\n如果等待时间过长,请修改config中的GROBID_URL,可修改成本地GROBID服务。"]); yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
+        article_dict = parse_pdf(fp, grobid_url)
+        print(article_dict)
+        prompt = "以下是一篇学术论文的基本信息:\n"
+        # title
+        title = article_dict.get('title', '无法获取 title'); prompt += f'title:{title}\n\n'
+        # authors
+        authors = article_dict.get('authors', '无法获取 authors'); prompt += f'authors:{authors}\n\n'
+        # abstract
+        abstract = article_dict.get('abstract', '无法获取 abstract'); prompt += f'abstract:{abstract}\n\n'
+        # command
+        prompt += f"请将题目和摘要翻译为{DST_LANG}。"
+        meta = [f'# Title:\n\n', title, f'# Abstract:\n\n', abstract ]
+
+        # Single thread: fetch the paper's meta information
+        paper_meta_info = yield from request_gpt_model_in_new_thread_with_ui_alive(
+            inputs=prompt,
+            inputs_show_user=prompt,
+            llm_kwargs=llm_kwargs,
+            chatbot=chatbot, history=[],
+            sys_prompt="You are an academic paper reader。",
+        )
+
+        # Multiple threads: translate
+        inputs_array = []
+        inputs_show_user_array = []
 
+        # get_token_num
+        from request_llm.bridge_all import model_info
+        enc = model_info[llm_kwargs['llm_model']]['tokenizer']
+        def get_token_num(txt): return len(enc.encode(txt, disallowed_special=()))
+        from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
+
+        def break_down(txt):
+            raw_token_num = get_token_num(txt)
+            if raw_token_num <= TOKEN_LIMIT_PER_FRAGMENT:
+                return [txt]
+            else:
+                # raw_token_num > TOKEN_LIMIT_PER_FRAGMENT
+                # find a smooth token limit to achieve even separation
+                count = int(math.ceil(raw_token_num / TOKEN_LIMIT_PER_FRAGMENT))
+                token_limit_smooth = raw_token_num // count + count
+                return breakdown_txt_to_satisfy_token_limit_for_pdf(txt, get_token_fn=get_token_num, limit=token_limit_smooth)
+
+        for section in article_dict.get('sections'):
+            if len(section['text']) == 0: continue
+            section_frags = break_down(section['text'])
+            for i, fragment in enumerate(section_frags):
+                heading = section['heading']
+                if len(section_frags) > 1: heading += f'Part-{i+1}'
+                inputs_array.append(
+                    f"你需要翻译{heading}章节,内容如下: \n\n{fragment}"
+                )
+                inputs_show_user_array.append(
+                    f"# {heading}\n\n{fragment}"
+                )
+
+        gpt_response_collection = yield from request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
+            inputs_array=inputs_array,
+            inputs_show_user_array=inputs_show_user_array,
+            llm_kwargs=llm_kwargs,
+            chatbot=chatbot,
+            history_array=[meta for _ in inputs_array],
+            sys_prompt_array=[
+                "请你作为一个学术翻译,负责把学术论文准确翻译成中文。注意文章中的每一句话都要翻译。" for _ in inputs_array],
+        )
+        res_path = write_history_to_file(meta + ["# Meta Translation" , paper_meta_info] + gpt_response_collection, file_basename=None, file_fullname=None)
+        promote_file_to_downloadzone(res_path, rename_file=os.path.basename(fp)+'.md', chatbot=chatbot)
+        generated_conclusion_files.append(res_path)
+
+        ch = construct_html()
+        orig = ""
+        trans = ""
+        gpt_response_collection_html = copy.deepcopy(gpt_response_collection)
+        for i,k in enumerate(gpt_response_collection_html):
+            if i%2==0:
+                gpt_response_collection_html[i] = inputs_show_user_array[i//2]
+            else:
+                gpt_response_collection_html[i] = gpt_response_collection_html[i]
+
+        final = ["", "", "一、论文概况", "", "Abstract", paper_meta_info, "二、论文翻译", ""]
+        final.extend(gpt_response_collection_html)
+        for i, k in enumerate(final):
+            if i%2==0:
+                orig = k
+            if i%2==1:
+                trans = k
+                ch.add_row(a=orig, b=trans)
+        create_report_file_name = f"{os.path.basename(fp)}.trans.html"
+        html_file = ch.save_file(create_report_file_name)
+        generated_html_files.append(html_file)
+        promote_file_to_downloadzone(html_file, rename_file=os.path.basename(html_file), chatbot=chatbot)
+
+    chatbot.append(("给出输出文件清单", str(generated_conclusion_files + generated_html_files)))
+    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
+
+
+def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt):
+    import copy
+    TOKEN_LIMIT_PER_FRAGMENT = 1280
+    generated_conclusion_files = []
+    generated_html_files = []
+    for index, fp in enumerate(file_manifest):
         # Read the PDF file
         file_content, page_one = read_and_clean_pdf_text(fp)
         file_content = file_content.encode('utf-8', 'ignore').decode()   # avoid reading non-utf8 chars
         page_one = str(page_one).encode('utf-8', 'ignore').decode()  # avoid reading non-utf8 chars
+
         # Recursively split the PDF
         from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
         from request_llm.bridge_all import model_info
@@ -140,8 +243,7 @@ def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt):
                 trans = k
                 ch.add_row(a=orig, b=trans)
             create_report_file_name = f"{os.path.basename(fp)}.trans.html"
-            ch.save_file(create_report_file_name)
-            generated_html_files.append(f'./gpt_log/{create_report_file_name}')
+            generated_html_files.append(ch.save_file(create_report_file_name))
         except:
             from toolbox import trimmed_format_exc
             print('writing html result failed:', trimmed_format_exc())
@@ -202,6 +304,6 @@ class construct_html():
 
 
     def save_file(self, file_name):
-        with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
+        with open(os.path.join(get_log_folder(), file_name), 'w', encoding='utf8') as f:
             f.write(self.html_string.encode('utf-8', 'ignore').decode())
-
+        return os.path.join(get_log_folder(), file_name)
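The `break_down` helper above smooths fragment sizes rather than cutting at a hard 1280-token boundary, so the last fragment is never a tiny remainder. A quick worked check of the arithmetic for a hypothetical 3000-token section:

```python
import math

raw_token_num = 3000                       # hypothetical section length in tokens
TOKEN_LIMIT_PER_FRAGMENT = 1280
count = int(math.ceil(raw_token_num / TOKEN_LIMIT_PER_FRAGMENT))  # -> 3
token_limit_smooth = raw_token_num // count + count               # -> 1003
# three fragments of roughly 1000 tokens each, instead of 1280 + 1280 + 440
```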
crazy_functions/辅助功能.py ADDED
@@ -0,0 +1,43 @@
+# encoding: utf-8
+# @Time    : 2023/4/19
+# @Author  : Spike
+# @Descr   :
+from toolbox import update_ui
+from toolbox import CatchException, report_execption, write_results_to_file, get_log_folder
+from crazy_functions.crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
+
+
+@CatchException
+def 猜你想问(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
+    if txt:
+        show_say = txt
+        prompt = txt+'\n回答完问题后,再列出用户可能提出的三个问题。'
+    else:
+        prompt = history[-1]+"\n分析上述回答,再列出用户可能提出的三个问题。"
+        show_say = '分析上述回答,再列出用户可能提出的三个问题。'
+    gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
+        inputs=prompt,
+        inputs_show_user=show_say,
+        llm_kwargs=llm_kwargs,
+        chatbot=chatbot,
+        history=history,
+        sys_prompt=system_prompt
+    )
+    chatbot[-1] = (show_say, gpt_say)
+    history.extend([show_say, gpt_say])
+    yield from update_ui(chatbot=chatbot, history=history)  # refresh the UI
+
+
+@CatchException
+def 清除缓存(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
+    chatbot.append(['清除本地缓存数据', '执行中. 删除 gpt_log & private_upload'])
+    yield from update_ui(chatbot=chatbot, history=history)  # refresh the UI
+
+    import shutil, os
+    gpt_log_dir = os.path.join(os.path.dirname(__file__), '..', 'gpt_log')
+    private_upload_dir = os.path.join(os.path.dirname(__file__), '..', 'private_upload')
+    shutil.rmtree(gpt_log_dir, ignore_errors=True)
+    shutil.rmtree(private_upload_dir, ignore_errors=True)
+
+    chatbot.append(['清除本地缓存数据', '执行完成'])
+    yield from update_ui(chatbot=chatbot, history=history)  # refresh the UI
docker-compose.yml CHANGED
@@ -16,6 +16,7 @@ services:
       AVAIL_LLM_MODELS: ' ["gpt-3.5-turbo", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "newbing"] '
       WEB_PORT: ' 22303 '
       ADD_WAIFU: ' True '
+      # THEME: ' Chuanhu-Small-and-Beautiful '
       # DEFAULT_WORKER_NUM: ' 10 '
       # AUTHENTICATION: ' [("username", "passwd"), ("username2", "passwd2")] '
 
@@ -28,7 +29,7 @@ services:
 
 
 ### ===================================================
-### [Scheme 2] To run the local ChatGLM model
+### [Scheme 2] To run local models such as ChatGLM + Qwen + MOSS
 ### ===================================================
 version: '3'
 services:
@@ -36,11 +37,11 @@ services:
     image: ghcr.io/binary-husky/gpt_academic_chatglm_moss:master # (Auto Built by Dockerfile: docs/Dockerfile+ChatGLM)
     environment:
       # See `config.py` for all configuration options
-      API_KEY: ' sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,fkxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx '
+      API_KEY: ' sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx '
       USE_PROXY: ' True '
      proxies: ' { "http": "socks5h://localhost:10880", "https": "socks5h://localhost:10880", } '
       LLM_MODEL: ' gpt-3.5-turbo '
-      AVAIL_LLM_MODELS: ' ["chatglm", "moss", "gpt-3.5-turbo", "gpt-4", "newbing"] '
+      AVAIL_LLM_MODELS: ' ["chatglm", "qwen", "moss", "gpt-3.5-turbo", "gpt-4", "newbing"] '
       LOCAL_MODEL_DEVICE: ' cuda '
       DEFAULT_WORKER_NUM: ' 10 '
       WEB_PORT: ' 12303 '
@@ -57,6 +58,10 @@ services:
     command: >
       bash -c "python3 -u main.py"
 
+    # P.S. Extra dependencies can be installed conveniently by tweaking `command`
+    # command: >
+    #   bash -c "pip install -r request_llm/requirements_qwen.txt && python3 -u main.py"
+
 ### ===================================================
 ### [Scheme 3] To run ChatGPT + LLAMA + PanGu (盘古) + RWKV local models
 ### ===================================================
docs/GithubAction+ChatGLM+Moss CHANGED
@@ -18,6 +18,7 @@ WORKDIR /gpt/gpt_academic
 RUN git clone https://github.com/OpenLMLab/MOSS.git request_llm/moss
 RUN python3 -m pip install -r requirements.txt
 RUN python3 -m pip install -r request_llm/requirements_moss.txt
+RUN python3 -m pip install -r request_llm/requirements_qwen.txt
 RUN python3 -m pip install -r request_llm/requirements_chatglm.txt
 RUN python3 -m pip install -r request_llm/requirements_newbing.txt
 
request_llm/bridge_all.py CHANGED
@@ -19,6 +19,12 @@ from .bridge_chatgpt import predict as chatgpt_ui
 from .bridge_chatglm import predict_no_ui_long_connection as chatglm_noui
 from .bridge_chatglm import predict as chatglm_ui
 
+from .bridge_chatglm import predict_no_ui_long_connection as chatglm_noui
+from .bridge_chatglm import predict as chatglm_ui
+
+from .bridge_qianfan import predict_no_ui_long_connection as qianfan_noui
+from .bridge_qianfan import predict as qianfan_ui
+
 colors = ['#FF00FF', '#00FFFF', '#FF0000', '#990099', '#009999', '#990044']
 
 class LazyloadTiktoken(object):
@@ -165,7 +171,14 @@ model_info = {
         "tokenizer": tokenizer_gpt35,
         "token_cnt": get_token_num_gpt35,
     },
-
+    "qianfan": {
+        "fn_with_ui": qianfan_ui,
+        "fn_without_ui": qianfan_noui,
+        "endpoint": None,
+        "max_token": 2000,
+        "tokenizer": tokenizer_gpt35,
+        "token_cnt": get_token_num_gpt35,
+    },
 }
 
 # -=-=-=-=-=-=- The models below are newly added and may carry extra dependencies -=-=-=-=-=-=-
@@ -361,7 +374,7 @@ if "chatgpt_website" in AVAIL_LLM_MODELS: # reverse-engineered interfaces, https://gi...
     "chatgpt_website": {
         "fn_with_ui": chatgpt_website_ui,
         "fn_without_ui": chatgpt_website_noui,
-        "endpoint": None,
+        "endpoint": openai_endpoint,
         "max_token": 4096,
         "tokenizer": tokenizer_gpt35,
         "token_cnt": get_token_num_gpt35,
@@ -385,6 +398,22 @@ if "spark" in AVAIL_LLM_MODELS: # iFlytek Spark cognitive model
         })
     except:
         print(trimmed_format_exc())
+if "llama2" in AVAIL_LLM_MODELS: # llama2
+    try:
+        from .bridge_llama2 import predict_no_ui_long_connection as llama2_noui
+        from .bridge_llama2 import predict as llama2_ui
+        model_info.update({
+            "llama2": {
+                "fn_with_ui": llama2_ui,
+                "fn_without_ui": llama2_noui,
+                "endpoint": None,
+                "max_token": 4096,
+                "tokenizer": tokenizer_gpt35,
+                "token_cnt": get_token_num_gpt35,
+            }
+        })
+    except:
+        print(trimmed_format_exc())
 
 
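Because every backend registers the same record shape, callers can stay model-agnostic. A hedged sketch of dispatching through the registry (`llm_kwargs` normally carries more fields, but the qianfan bridge only reads the ones shown; BAIDU_CLOUD_* must be configured):

```python
from request_llm.bridge_all import model_info

record = model_info["qianfan"]             # same six fields for every backend
print(record["max_token"])                 # -> 2000
n_tokens = record["token_cnt"]("你好")     # token count via the registered tokenizer

# blocking, UI-free call path
reply = record["fn_without_ui"](inputs="你好", llm_kwargs={'llm_model': 'qianfan'},
                                history=[], sys_prompt="")
```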
request_llm/bridge_chatgpt.py CHANGED
@@ -177,14 +177,13 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_prompt=''):
                 yield from update_ui(chatbot=chatbot, history=history, msg="非Openai官方接口返回了错误:" + chunk.decode()) # refresh the UI
                 return
 
-            # print(chunk.decode()[6:])
-            if is_head_of_the_stream and (r'"object":"error"' not in chunk.decode()):
+            chunk_decoded = chunk.decode()
+            if is_head_of_the_stream and (r'"object":"error"' not in chunk_decoded) and (r"choices" not in chunk_decoded):
                 # The first frame of the stream carries no content
                 is_head_of_the_stream = False; continue
 
             if chunk:
                 try:
-                    chunk_decoded = chunk.decode()
                     # The former is API2D's termination condition, the latter is OpenAI's
                     if ('data: [DONE]' in chunk_decoded) or (len(json.loads(chunk_decoded[6:])['choices'][0]["delta"]) == 0):
                         # The stream has ended and gpt_replying_buffer is complete
@@ -192,7 +191,7 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_prompt=''):
                         break
                     # Handle the body of the stream
                     chunkjson = json.loads(chunk_decoded[6:])
-                    status_text = f"finish_reason: {chunkjson['choices'][0]['finish_reason']}"
+                    status_text = f"finish_reason: {chunkjson['choices'][0].get('finish_reason', 'null')}"
                     # If an exception is raised here, the text is usually too long; see get_full_error's output for details
                     gpt_replying_buffer = gpt_replying_buffer + json.loads(chunk_decoded[6:])['choices'][0]["delta"]["content"]
                     history[-1] = gpt_replying_buffer
@@ -216,7 +215,6 @@ def handle_error(inputs, llm_kwargs, chatbot, history, chunk_decoded, error_msg):
         history = clip_history(inputs=inputs, history=history, tokenizer=model_info[llm_kwargs['llm_model']]['tokenizer'],
                                max_token_limit=(model_info[llm_kwargs['llm_model']]['max_token'])) # release at least half of history
         chatbot[-1] = (chatbot[-1][0], "[Local Message] Reduce the length. 本次输入过长, 或历史数据过长. 历史缓存数据已部分释放, 您可以请再次尝试. (若再次失败则更可能是因为输入过长.)")
-        # history = []    # clear history
     elif "does not exist" in error_msg:
         chatbot[-1] = (chatbot[-1][0], f"[Local Message] Model {llm_kwargs['llm_model']} does not exist. 模型不存在, 或者您没有获得体验资格.")
     elif "Incorrect API key" in error_msg:
request_llm/bridge_chatgpt_website.py CHANGED
@@ -118,16 +118,6 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_prompt=''):
     chatbot is the conversation list shown in the WebUI; modify it and then yield, and the dialog UI updates directly
     additional_fn indicates which button was clicked; see functional.py for the buttons
     """
-    if is_any_api_key(inputs):
-        chatbot._cookies['api_key'] = inputs
-        chatbot.append(("输入已识别为openai的api_key", what_keys(inputs)))
-        yield from update_ui(chatbot=chatbot, history=history, msg="api_key已导入") # refresh the UI
-        return
-    elif not is_any_api_key(chatbot._cookies['api_key']):
-        chatbot.append((inputs, "缺少api_key。\n\n1. 临时解决方案:直接在输入区键入api_key,然后回车提交。\n\n2. 长效解决方案:在config.py中配置。"))
-        yield from update_ui(chatbot=chatbot, history=history, msg="缺少api_key") # refresh the UI
-        return
-
     if additional_fn is not None:
         from core_functional import handle_core_functionality
         inputs, history = handle_core_functionality(additional_fn, inputs, history, chatbot)
@@ -245,14 +235,9 @@ def generate_payload(inputs, llm_kwargs, history, system_prompt, stream):
     if not is_any_api_key(llm_kwargs['api_key']):
         raise AssertionError("你提供了错误的API_KEY。\n\n1. 临时解决方案:直接在输入区键入api_key,然后回车提交。\n\n2. 长效解决方案:在config.py中配置。")
 
-    api_key = select_api_key(llm_kwargs['api_key'], llm_kwargs['llm_model'])
-
     headers = {
         "Content-Type": "application/json",
-        "Authorization": f"Bearer {api_key}"
     }
-    if API_ORG.startswith('org-'): headers.update({"OpenAI-Organization": API_ORG})
-    if llm_kwargs['llm_model'].startswith('azure-'): headers.update({"api-key": api_key})
 
     conversation_cnt = len(history) // 2
request_llm/bridge_llama2.py ADDED
@@ -0,0 +1,91 @@
+model_name = "LLaMA"
+cmd_to_install = "`pip install -r request_llm/requirements_chatglm.txt`"
+
+
+from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
+from toolbox import update_ui, get_conf, ProxyNetworkActivate
+from multiprocessing import Process, Pipe
+from .local_llm_class import LocalLLMHandle, get_local_llm_predict_fns, SingletonLocalLLM
+from threading import Thread
+
+
+# ------------------------------------------------------------------------------------------------------------------------
+# 🔌💻 Local Model
+# ------------------------------------------------------------------------------------------------------------------------
+@SingletonLocalLLM
+class GetONNXGLMHandle(LocalLLMHandle):
+
+    def load_model_info(self):
+        # 🏃‍♂️🏃‍♂️🏃‍♂️ runs in the child process
+        self.model_name = model_name
+        self.cmd_to_install = cmd_to_install
+
+    def load_model_and_tokenizer(self):
+        # 🏃‍♂️🏃‍♂️🏃‍♂️ runs in the child process
+        import os, glob
+        import os
+        import platform
+        huggingface_token, device = get_conf('HUGGINGFACE_ACCESS_TOKEN', 'LOCAL_MODEL_DEVICE')
+        assert len(huggingface_token) != 0, "没有填写 HUGGINGFACE_ACCESS_TOKEN"
+        with open(os.path.expanduser('~/.cache/huggingface/token'), 'w') as f:
+            f.write(huggingface_token)
+        model_id = 'meta-llama/Llama-2-7b-chat-hf'
+        with ProxyNetworkActivate():
+            self._tokenizer = AutoTokenizer.from_pretrained(model_id, use_auth_token=huggingface_token)
+            # use fp16
+            model = AutoModelForCausalLM.from_pretrained(model_id, use_auth_token=huggingface_token).eval()
+            if device.startswith('cuda'): model = model.half().to(device)
+            self._model = model
+
+        return self._model, self._tokenizer
+
+    def llm_stream_generator(self, **kwargs):
+        # 🏃‍♂️🏃‍♂️🏃‍♂️ runs in the child process
+        def adaptor(kwargs):
+            query = kwargs['query']
+            max_length = kwargs['max_length']
+            top_p = kwargs['top_p']
+            temperature = kwargs['temperature']
+            history = kwargs['history']
+            console_slience = kwargs.get('console_slience', True)
+            return query, max_length, top_p, temperature, history, console_slience
+
+        def convert_messages_to_prompt(query, history):
+            prompt = ""
+            for a, b in history:
+                prompt += f"\n[INST]{a}[/INST]"
+                prompt += "\n{b}" + b
+            prompt += f"\n[INST]{query}[/INST]"
+            return prompt
+
+        query, max_length, top_p, temperature, history, console_slience = adaptor(kwargs)
+        prompt = convert_messages_to_prompt(query, history)
+        # =-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-
+        # code from transformers.llama
+        streamer = TextIteratorStreamer(self._tokenizer)
+        # Run the generation in a separate thread, so that we can fetch the generated text in a non-blocking way.
+        inputs = self._tokenizer([prompt], return_tensors="pt")
+        prompt_tk_back = self._tokenizer.batch_decode(inputs['input_ids'])[0]
+
+        generation_kwargs = dict(inputs.to(self._model.device), streamer=streamer, max_new_tokens=max_length)
+        thread = Thread(target=self._model.generate, kwargs=generation_kwargs)
+        thread.start()
+        generated_text = ""
+        for new_text in streamer:
+            generated_text += new_text
+            if not console_slience: print(new_text, end='')
+            yield generated_text.lstrip(prompt_tk_back).rstrip("</s>")
+        if not console_slience: print()
+        # =-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-
+
+    def try_to_import_special_deps(self, **kwargs):
+        # import something that will raise error if the user does not install requirement_*.txt
+        # 🏃‍♂️🏃‍♂️🏃‍♂️ runs in the main process
+        import importlib
+        importlib.import_module('transformers')
+
+
+# ------------------------------------------------------------------------------------------------------------------------
+# 🔌💻 GPT-Academic Interface
+# ------------------------------------------------------------------------------------------------------------------------
+predict_no_ui_long_connection, predict = get_local_llm_predict_fns(GetONNXGLMHandle, model_name)
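The streaming section above is the standard transformers recipe: `generate` runs in a worker thread while the caller drains a `TextIteratorStreamer`. A self-contained sketch of the same pattern with a small public model (the model choice is illustrative):

```python
from threading import Thread
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

tok = AutoTokenizer.from_pretrained("gpt2")            # small public model, for illustration
model = AutoModelForCausalLM.from_pretrained("gpt2")
streamer = TextIteratorStreamer(tok)

inputs = tok(["The quick brown fox"], return_tensors="pt")
Thread(target=model.generate,
       kwargs=dict(inputs, streamer=streamer, max_new_tokens=20)).start()
for piece in streamer:                                 # decoded text, as soon as tokens arrive
    print(piece, end="")
```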
request_llm/bridge_qianfan.py ADDED
@@ -0,0 +1,164 @@
+
+import time, requests, json
+from multiprocessing import Process, Pipe
+from functools import wraps
+from datetime import datetime, timedelta
+from toolbox import get_conf, update_ui, is_any_api_key, select_api_key, what_keys, clip_history, trimmed_format_exc, get_conf
+
+model_name = '千帆大模型平台'
+timeout_bot_msg = '[Local Message] Request timeout. Network error.'
+
+def cache_decorator(timeout):
+    cache = {}
+    def decorator(func):
+        @wraps(func)
+        def wrapper(*args, **kwargs):
+            key = (func.__name__, args, frozenset(kwargs.items()))
+            # Check if result is already cached and not expired
+            if key in cache:
+                result, timestamp = cache[key]
+                if datetime.now() - timestamp < timedelta(seconds=timeout):
+                    return result
+
+            # Call the function and cache the result
+            result = func(*args, **kwargs)
+            cache[key] = (result, datetime.now())
+            return result
+        return wrapper
+    return decorator
+
+@cache_decorator(timeout=3600)
+def get_access_token():
+    """
+    Generate an authentication signature (access token) from the AK/SK pair
+    :return: access_token, or None on error
+    """
+    # if (access_token_cache is None) or (time.time() - last_access_token_obtain_time > 3600):
+    BAIDU_CLOUD_API_KEY, BAIDU_CLOUD_SECRET_KEY = get_conf('BAIDU_CLOUD_API_KEY', 'BAIDU_CLOUD_SECRET_KEY')
+
+    if len(BAIDU_CLOUD_SECRET_KEY) == 0: raise RuntimeError("没有配置BAIDU_CLOUD_SECRET_KEY")
+    if len(BAIDU_CLOUD_API_KEY) == 0: raise RuntimeError("没有配置BAIDU_CLOUD_API_KEY")
+
+    url = "https://aip.baidubce.com/oauth/2.0/token"
+    params = {"grant_type": "client_credentials", "client_id": BAIDU_CLOUD_API_KEY, "client_secret": BAIDU_CLOUD_SECRET_KEY}
+    access_token_cache = str(requests.post(url, params=params).json().get("access_token"))
+    return access_token_cache
+    # else:
+    #     return access_token_cache
+
+
+def generate_message_payload(inputs, llm_kwargs, history, system_prompt):
+    conversation_cnt = len(history) // 2
+    messages = [{"role": "user", "content": system_prompt}]
+    messages.append({"role": "assistant", "content": 'Certainly!'})
+    if conversation_cnt:
+        for index in range(0, 2*conversation_cnt, 2):
+            what_i_have_asked = {}
+            what_i_have_asked["role"] = "user"
+            what_i_have_asked["content"] = history[index]
+            what_gpt_answer = {}
+            what_gpt_answer["role"] = "assistant"
+            what_gpt_answer["content"] = history[index+1]
+            if what_i_have_asked["content"] != "":
+                if what_gpt_answer["content"] == "": continue
+                if what_gpt_answer["content"] == timeout_bot_msg: continue
+                messages.append(what_i_have_asked)
+                messages.append(what_gpt_answer)
+            else:
+                messages[-1]['content'] = what_gpt_answer['content']
+    what_i_ask_now = {}
+    what_i_ask_now["role"] = "user"
+    what_i_ask_now["content"] = inputs
+    messages.append(what_i_ask_now)
+    return messages
+
+
+def generate_from_baidu_qianfan(inputs, llm_kwargs, history, system_prompt):
+    BAIDU_CLOUD_QIANFAN_MODEL, = get_conf('BAIDU_CLOUD_QIANFAN_MODEL')
+
+    url_lib = {
+        "ERNIE-Bot":        "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions" ,
+        "ERNIE-Bot-turbo":  "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/eb-instant" ,
+        "BLOOMZ-7B":        "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/bloomz_7b1",
+
+        "Llama-2-70B-Chat": "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/llama_2_70b",
+        "Llama-2-13B-Chat": "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/llama_2_13b",
+        "Llama-2-7B-Chat":  "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/llama_2_7b",
+    }
+
+    url = url_lib[BAIDU_CLOUD_QIANFAN_MODEL]
+
+    url += "?access_token=" + get_access_token()
+
+
+    payload = json.dumps({
+        "messages": generate_message_payload(inputs, llm_kwargs, history, system_prompt),
+        "stream": True
+    })
+    headers = {
+        'Content-Type': 'application/json'
+    }
+    response = requests.request("POST", url, headers=headers, data=payload, stream=True)
+    buffer = ""
+    for line in response.iter_lines():
+        if len(line) == 0: continue
+        try:
+            dec = line.decode().lstrip('data:')
+            dec = json.loads(dec)
+            incoming = dec['result']
+            buffer += incoming
+            yield buffer
+        except:
+            if ('error_code' in dec) and ("max length" in dec['error_msg']):
+                raise ConnectionAbortedError(dec['error_msg'])  # the context is too long, token overflow
+            elif ('error_code' in dec):
+                raise RuntimeError(dec['error_msg'])
+
+
+def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="", observe_window=[], console_slience=False):
+    """
+    ⭐ Multi-threaded method
+    For documentation, see request_llm/bridge_all.py
+    """
+    watch_dog_patience = 5
+    response = ""
+
+    for response in generate_from_baidu_qianfan(inputs, llm_kwargs, history, sys_prompt):
+        if len(observe_window) >= 1:
+            observe_window[0] = response
+        if len(observe_window) >= 2:
+            if (time.time()-observe_window[1]) > watch_dog_patience: raise RuntimeError("程序终止。")
+    return response
+
+def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_prompt='', stream = True, additional_fn=None):
+    """
+    ⭐ Single-threaded method
+    For documentation, see request_llm/bridge_all.py
+    """
+    chatbot.append((inputs, ""))
+
+    if additional_fn is not None:
+        from core_functional import handle_core_functionality
+        inputs, history = handle_core_functionality(additional_fn, inputs, history, chatbot)
+
+    yield from update_ui(chatbot=chatbot, history=history)
+    # Start receiving the reply
+    try:
+        for response in generate_from_baidu_qianfan(inputs, llm_kwargs, history, system_prompt):
+            chatbot[-1] = (inputs, response)
+            yield from update_ui(chatbot=chatbot, history=history)
+    except ConnectionAbortedError as e:
+        from .bridge_all import model_info
+        if len(history) >= 2: history[-1] = ""; history[-2] = "" # drop the overflowing turn: history[-2] is this input, history[-1] is this output
+        history = clip_history(inputs=inputs, history=history, tokenizer=model_info[llm_kwargs['llm_model']]['tokenizer'],
+                               max_token_limit=(model_info[llm_kwargs['llm_model']]['max_token'])) # release at least half of history
+        chatbot[-1] = (chatbot[-1][0], "[Local Message] Reduce the length. 本次输入过长, 或历史数据过长. 历史缓存数据已部分释放, 您可以请再次尝试. (若再次失败则更可能是因为输入过长.)")
+        yield from update_ui(chatbot=chatbot, history=history, msg="异常") # refresh the UI
+        return
+
+    # Summarize the output
+    response = f"[Local Message]: {model_name}响应异常 ..."
+    if response == f"[Local Message]: 等待{model_name}响应中 ...":
+        response = f"[Local Message]: {model_name}响应异常 ..."
+    history.extend([inputs, response])
+    yield from update_ui(chatbot=chatbot, history=history)
@@ -128,7 +128,7 @@ def get_local_llm_predict_fns(LLMSingletonClass, model_name):
128
 
129
  # chatglm 没有 sys_prompt 接口,因此把prompt加入 history
130
  history_feedin = []
131
- history_feedin.append(["What can I do?", sys_prompt])
132
  for i in range(len(history)//2):
133
  history_feedin.append([history[2*i], history[2*i+1]] )
134
 
@@ -161,7 +161,7 @@ def get_local_llm_predict_fns(LLMSingletonClass, model_name):
161
 
162
  # 处理历史信息
163
  history_feedin = []
164
- history_feedin.append(["What can I do?", system_prompt] )
165
  for i in range(len(history)//2):
166
  history_feedin.append([history[2*i], history[2*i+1]] )
167
 
 
128
 
129
  # chatglm 没有 sys_prompt 接口,因此把prompt加入 history
130
  history_feedin = []
131
+ history_feedin.append([sys_prompt, "Certainly!"])
132
  for i in range(len(history)//2):
133
  history_feedin.append([history[2*i], history[2*i+1]] )
134
 
 
161
 
162
  # 处理历史信息
163
  history_feedin = []
164
+ history_feedin.append([system_prompt, "Certainly!"])
165
  for i in range(len(history)//2):
166
  history_feedin.append([history[2*i], history[2*i+1]] )
167
 
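The change flips the injected pair: the system prompt is now the first user turn, acknowledged by a fixed assistant reply, rather than sitting in the assistant slot of a dummy exchange. For a one-turn history, the structure fed to the local model becomes (values illustrative):

```python
history_feedin = [
    ["You are a helpful academic assistant.", "Certainly!"],      # system prompt + stock ack
    ["Translate this abstract.", "Here is the translation ..."],  # prior user/assistant pair
]
```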
requirements.txt CHANGED
@@ -17,5 +17,6 @@ openai
 numpy
 arxiv
 rich
-websocket-client
 pypdf2==2.12.1
+websocket-client
+scipdf_parser==0.3
tests/test_plugins.py CHANGED
@@ -9,9 +9,9 @@ validate_path() # returns the project root
 from tests.test_utils import plugin_test
 
 if __name__ == "__main__":
-    plugin_test(plugin='crazy_functions.命令行助手->命令行助手', main_input='查看当前的docker容器列表')
+    # plugin_test(plugin='crazy_functions.命令行助手->命令行助手', main_input='查看当前的docker容器列表')
 
-    plugin_test(plugin='crazy_functions.解析项目源代码->解析一个Python项目', main_input="crazy_functions/test_project/python/dqn")
+    # plugin_test(plugin='crazy_functions.解析项目源代码->解析一个Python项目', main_input="crazy_functions/test_project/python/dqn")
 
     # plugin_test(plugin='crazy_functions.解析项目源代码->解析一个C项目', main_input="crazy_functions/test_project/cpp/cppipc")
 
@@ -19,7 +19,7 @@ if __name__ == "__main__":
 
     # plugin_test(plugin='crazy_functions.批量Markdown翻译->Markdown中译英', main_input="README.md")
 
-    # plugin_test(plugin='crazy_functions.批量翻译PDF文档_多线程->批量翻译PDF文档', main_input="crazy_functions/test_project/pdf_and_word")
+    plugin_test(plugin='crazy_functions.批量翻译PDF文档_多线程->批量翻译PDF文档', main_input='crazy_functions/test_project/pdf_and_word/aaai.pdf')
 
     # plugin_test(plugin='crazy_functions.谷歌检索小助手->谷歌检索小助手', main_input="https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=auto+reinforcement+learning&btnG=")
 
tests/test_utils.py CHANGED
@@ -22,10 +22,12 @@ def silence_stdout(func):
     def wrapper(*args, **kwargs):
         _original_stdout = sys.stdout
         sys.stdout = open(os.devnull, 'w')
+        sys.stdout.reconfigure(encoding='utf-8')
         for q in func(*args, **kwargs):
             sys.stdout = _original_stdout
             yield q
             sys.stdout = open(os.devnull, 'w')
+            sys.stdout.reconfigure(encoding='utf-8')
         sys.stdout.close()
         sys.stdout = _original_stdout
     return wrapper
@@ -35,6 +37,7 @@ def silence_stdout_fn(func):
     def wrapper(*args, **kwargs):
         _original_stdout = sys.stdout
         sys.stdout = open(os.devnull, 'w')
+        sys.stdout.reconfigure(encoding='utf-8')
         result = func(*args, **kwargs)
         sys.stdout.close()
         sys.stdout = _original_stdout
version CHANGED
@@ -1,5 +1,5 @@
 {
-    "version": 3.48,
+    "version": 3.49,
     "show_feature": true,
-    "new_feature": "接入阿里通义千问、讯飞星火、上海AI-Lab书生 <-> 优化一键升级 <-> 提高arxiv翻译速度和成功率 <-> 支持自定义APIKEY格式 <-> 临时修复theme的文件丢失问题 <-> 新增实时语音对话插件(自动断句,脱手对话) <-> 支持加载自定义的ChatGLM2微调模型 <-> 动态ChatBot窗口高度 <-> 修复Azure接口的BUG <-> 完善多语言模块 <-> 完善本地Latex矫错和翻译功能 <-> 增加gpt-3.5-16k的支持"
+    "new_feature": "支持借助GROBID实现PDF高精度翻译 <-> 接入百度千帆平台和文心一言 <-> 接入阿里通义千问、讯飞星火、上海AI-Lab书生 <-> 优化一键升级 <-> 提高arxiv翻译速度和成功率 <-> 支持自定义APIKEY格式 <-> 临时修复theme的文件丢失问题 <-> 新增实时语音对话插件(自动断句,脱手对话) <-> 支持加载自定义的ChatGLM2微调模型 <-> 动态ChatBot窗口高度 <-> 修复Azure接口的BUG <-> 完善多语言模块"
 }