BirkhoffLee committed 8 days ago

Commit 90423c7 (unverified) · 1 Parent(s): 37e12f1

feat: translation page is mostly done

Files changed (3)

README.md +14 -7
entrypoint.sh +0 -5
src/gateway.py +243 -63

README.md CHANGED

@@ -12,16 +12,18 @@ pinned: false
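 This repository deploys a monolithic FastAPI service that includes:
 1. User login (session cookie)
-2. In-house Web UI (upload PDFs, view jobs, download results)
 3. Job queue (single worker) that calls the `pdf2zh_next` Python API
-4. Internal OpenAI-compatible proxy: `/internal/openai/v1/chat/completions`
 5. Billing per login username (tokens + USD)
 ### Runtime architecture
 1. Users visit `:7860`, log in, and submit translation jobs
-2. The background worker calls `pdf2zh_next` and sends OpenAI requests to the local internal proxy
-3. The internal proxy forwards to the official OpenAI API and records token usage
 4. Billing is aggregated by `username` and shown in the frontend
 ### Required Secrets
@@ -29,7 +31,6 @@ pinned: false
 Set these in the HuggingFace Space:
 - `BASIC_AUTH_USERS` (multi-line text)
-- `OPENAI_API_KEY`
 `BASIC_AUTH_USERS` format:
@@ -48,11 +49,18 @@ bob:your_password_2
 - `SESSION_SECRET`: session signing key
 - `INTERNAL_KEY_SALT`: salt for generating internal keys (defaults to `SESSION_SECRET`)
-- `DEFAULT_OPENAI_MODEL`: default model (default `gpt-4o-mini`)
 - `DEFAULT_LANG_IN`: default source language (default `en`)
 - `DEFAULT_LANG_OUT`: default target language (default `zh`)
 - `TRANSLATION_QPS`: translation QPS (default `4`)
 - `DATA_DIR`: data directory (default `/data`)
 ### Health check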
@@ -64,7 +72,6 @@ bob:your_password_2
 docker build -t pdf2zh-gated .
 docker run --rm -p 7860:7860 \
   -e BASIC_AUTH_USERS=$'alice:pass1\nbob:pass2' \
-  -e OPENAI_API_KEY='sk-your-openai-key' \
   pdf2zh-gated
 ```

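 This repository deploys a monolithic FastAPI service that includes:
 1. User login (session cookie)
+2. Chinese Web UI (upload PDFs, view jobs, download results)
 3. Job queue (single worker) that calls the `pdf2zh_next` Python API
+4. Internal OpenAI-compatible proxy: `/internal/openai/v1/chat/completions` (routed by model)
 5. Billing per login username (tokens + USD)
 ### Runtime architecture
 1. Users visit `:7860`, log in, and submit translation jobs
+2. The background worker calls `pdf2zh_next` and sends `chat/completions` requests to the local internal proxy
+3. The internal proxy picks the upstream based on `model`:
+   - `SiliconFlowFree` -> the author-maintained `chatproxy` endpoint
+   - Other models -> an OpenAI-style upstream (official OpenAI by default)
 4. Billing is aggregated by `username` and shown in the frontend
 ### Required Secrets
 Set these in the HuggingFace Space:
 - `BASIC_AUTH_USERS` (multi-line text)
 `BASIC_AUTH_USERS` format:
 - `SESSION_SECRET`: session signing key
 - `INTERNAL_KEY_SALT`: salt for generating internal keys (defaults to `SESSION_SECRET`)
 - `DEFAULT_LANG_IN`: default source language (default `en`)
 - `DEFAULT_LANG_OUT`: default target language (default `zh`)
 - `TRANSLATION_QPS`: translation QPS (default `4`)
 - `DATA_DIR`: data directory (default `/data`)
+- `OPENAI_API_KEY`: needed only if you want to proxy models other than `SiliconFlowFree`
+- `OPENAI_UPSTREAM_CHAT_URL`: OpenAI-style upstream URL for models other than `SiliconFlowFree`
+### Fixed model and routing table
+- The frontend offers no model selection; the job model is fixed to `SiliconFlowFree`
+- The routing table is `MODEL_ROUTE_TABLE` in `src/gateway.py`
+- Only one entry is maintained for now: `SiliconFlowFree`, with two fallback `chatproxy` URLs
 ### Health check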
 docker build -t pdf2zh-gated .
 docker run --rm -p 7860:7860 \
   -e BASIC_AUTH_USERS=$'alice:pass1\nbob:pass2' \
   pdf2zh-gated
 ```
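
Review note: the routing rule described in the README hunk above can be sketched as a plain dictionary lookup. This is a minimal illustration, not the code in `src/gateway.py`. `ROUTES` mirrors the single `MODEL_ROUTE_TABLE` entry shown in this commit, while `pick_upstream` and `ASSUMED_OPENAI_URL` are hypothetical names introduced for the example, and the default URL is an assumption (the real value comes from `OPENAI_UPSTREAM_CHAT_URL`).

```python
from typing import Any

# Mirrors the single MODEL_ROUTE_TABLE entry shown in this commit.
ROUTES: dict[str, dict[str, Any]] = {
    "SiliconFlowFree": {
        "route_type": "chatproxy",
        "base_urls": [
            "https://api1.pdf2zh-next.com/chatproxy",
            "https://api2.pdf2zh-next.com/chatproxy",
        ],
    }
}

# Assumed default for illustration; the gateway reads OPENAI_UPSTREAM_CHAT_URL.
ASSUMED_OPENAI_URL = "https://api.openai.com/v1/chat/completions"


def pick_upstream(
    model: str, openai_url: str = ASSUMED_OPENAI_URL
) -> tuple[str, list[str]]:
    """Return (route_type, candidate URLs) for a model name."""
    route = ROUTES.get(model)
    if route and route.get("route_type") == "chatproxy":
        # Routed models get every configured URL, tried in order.
        return "chatproxy", list(route["base_urls"])
    # Anything not in the table falls through to the OpenAI-style upstream.
    return "openai", [openai_url]
```

Under these assumptions, `pick_upstream("SiliconFlowFree")` yields the two `chatproxy` URLs, and any other model name yields the single OpenAI-style URL, which is the only path that needs `OPENAI_API_KEY`.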

entrypoint.sh CHANGED

@@ -7,11 +7,6 @@ if [[ -z "${RAW_USERS}" ]]; then
     exit 1
 fi
-if [[ -z "${OPENAI_API_KEY:-}" ]]; then
-    echo "[ERROR] OPENAI_API_KEY is required." >&2
-    exit 1
-fi
 echo "[INFO] Starting gateway on :7860"
 cd / && exec /opt/gateway/bin/uvicorn gateway:app \
     --app-dir /src \

     exit 1
 fi
 echo "[INFO] Starting gateway on :7860"
 cd / && exec /opt/gateway/bin/uvicorn gateway:app \
     --app-dir /src \

src/gateway.py CHANGED

@@ -56,13 +56,25 @@ OPENAI_UPSTREAM_CHAT_URL = os.environ.get(
 )
 OPENAI_REAL_API_KEY = os.environ.get("OPENAI_API_KEY", "").strip()
-DEFAULT_MODEL = os.environ.get("DEFAULT_OPENAI_MODEL", "gpt-4o-mini").strip()
 DEFAULT_LANG_IN = os.environ.get("DEFAULT_LANG_IN", "en").strip()
 DEFAULT_LANG_OUT = os.environ.get("DEFAULT_LANG_OUT", "zh").strip()
 TRANSLATION_QPS = int(os.environ.get("TRANSLATION_QPS", "4"))
 INTERNAL_KEY_SALT = (os.environ.get("INTERNAL_KEY_SALT") or SECRET_KEY).strip()
 # Price unit: USD / 1M tokens
 DEFAULT_INPUT_PRICE_PER_1M = float(
     os.environ.get("OPENAI_DEFAULT_INPUT_PRICE_PER_1M", "0.15")
@@ -502,11 +514,11 @@ def _enqueue_pending_jobs() -> None:
 # ── Page templates ────────────────────────────────────────────────────────────
 _LOGIN_HTML = """\
 <!DOCTYPE html>
-<html lang="en">
 <head>
 <meta charset="UTF-8">
 <meta name="viewport" content="width=device-width, initial-scale=1.0">
-<title>Sign In</title>
 <style>
   *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
   body {
@@ -566,15 +578,15 @@ _LOGIN_HTML = """\
 </head>
 <body>
 <div class="card">
-  <h1>Welcome back</h1>
-  <p class="sub">Sign in to continue</p>
   __ERROR_BLOCK__
   <form method="post" action="/login">
-    <label for="u">Username</label>
     <input id="u" type="text" name="username" autocomplete="username" required autofocus>
-    <label for="p">Password</label>
     <input id="p" type="password" name="password" autocomplete="current-password" required>
-    <button type="submit">Sign in</button>
   </form>
 </div>
 </body>
@@ -589,16 +601,15 @@ def _login_page(error: str = "") -> str:
 def _dashboard_page(username: str) -> str:
     safe_user = html.escape(username)
-    safe_model = html.escape(DEFAULT_MODEL)
     safe_lang_in = html.escape(DEFAULT_LANG_IN)
     safe_lang_out = html.escape(DEFAULT_LANG_OUT)
     return f"""<!DOCTYPE html>
-<html lang="en">
 <head>
   <meta charset="UTF-8" />
   <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-  <title>PDF Translation Console</title>
   <style>
     :root {{
       --bg: #f4f7fb;
@@ -667,53 +678,50 @@ def _dashboard_page(username: str) -> str:
 <div class="wrap">
   <div class="top">
     <div>
-      <h1>PDF Translation Console</h1>
-      <div class="user">Signed in as <strong>{safe_user}</strong></div>
     </div>
-    <div><a href="/logout"><button class="muted">Sign out</button></a></div>
   </div>
   <div class="grid">
     <section class="card">
-      <h2>New Job</h2>
       <form id="jobForm">
-        <label>PDF File</label>
         <input name="file" type="file" accept=".pdf" required />
         <div class="row">
           <div>
-            <label>Source Language</label>
             <input name="lang_in" type="text" value="{safe_lang_in}" required />
           </div>
           <div>
-            <label>Target Language</label>
             <input name="lang_out" type="text" value="{safe_lang_out}" required />
           </div>
         </div>
-        <label>OpenAI Model</label>
-        <input name="model" type="text" value="{safe_model}" required />
         <div style="margin-top: 12px;">
-          <button class="primary" type="submit">Submit Job</button>
         </div>
       </form>
-      <div class="hint">Billing is based on your login user and OpenAI token usage.</div>
       <div id="jobStatus" class="status"></div>
     </section>
     <section class="card">
-      <h2>My Billing</h2>
-      <div id="billingSummary" class="mono">Loading...</div>
       <table>
         <thead>
           <tr>
-            <th>Time (UTC)</th>
-            <th>Model</th>
-            <th>Prompt</th>
-            <th>Completion</th>
-            <th>Total</th>
-            <th>Cost (USD)</th>
           </tr>
         </thead>
         <tbody id="billingBody"></tbody>
@@ -722,24 +730,24 @@ def _dashboard_page(username: str) -> str:
   </div>
   <section class="card" style="margin-top: 14px;">
-    <h2>My Jobs</h2>
     <table>
       <thead>
         <tr>
           <th>ID</th>
-          <th>File</th>
-          <th>Status</th>
-          <th>Progress</th>
-          <th>Model</th>
-          <th>Updated (UTC)</th>
-          <th>Actions</th>
         </tr>
       </thead>
       <tbody id="jobsBody"></tbody>
     </table>
   </section>
-  <div class="foot">Internal OpenAI endpoint is localhost-only and not exposed to end users.</div>
 </div>
 <script>
@@ -767,7 +775,7 @@ async function refreshBilling() {{
   const rows = await apiJson('/api/billing/me/records?limit=20');
   document.getElementById('billingSummary').textContent =
-    `total_tokens=${{summary.total_tokens}} | total_cost_usd=${{Number(summary.total_cost_usd).toFixed(6)}}`;
   const body = document.getElementById('billingBody');
   body.innerHTML = '';
@@ -788,20 +796,31 @@ async function refreshBilling() {{
 function actionButtons(job) {{
   const actions = [];
   if (job.status === 'queued' || job.status === 'running') {{
-    actions.push(`<button class="danger" onclick="cancelJob('${{job.id}}')">Cancel</button>`);
   }}
   if (job.artifact_urls?.mono) {{
-    actions.push(`<a href="${{job.artifact_urls.mono}}"><button class="muted">Mono</button></a>`);
   }}
   if (job.artifact_urls?.dual) {{
-    actions.push(`<a href="${{job.artifact_urls.dual}}"><button class="muted">Dual</button></a>`);
   }}
   if (job.artifact_urls?.glossary) {{
-    actions.push(`<a href="${{job.artifact_urls.glossary}}"><button class="muted">Glossary</button></a>`);
   }}
   return actions.join(' ');
 }}
 async function refreshJobs() {{
   const data = await apiJson('/api/jobs?limit=50');
   const body = document.getElementById('jobsBody');
@@ -812,7 +831,7 @@ async function refreshJobs() {{
     tr.innerHTML = `
       <td class="mono">${{esc(job.id)}}</td>
       <td>${{esc(job.filename)}}</td>
-      <td>${{esc(job.status)}}${{job.error ? ' / ' + esc(job.error) : ''}}</td>
       <td>${{Number(job.progress).toFixed(1)}}%</td>
       <td class="mono">${{esc(job.model)}}</td>
       <td class="mono">${{esc(job.updated_at)}}</td>
@@ -827,23 +846,23 @@ async function cancelJob(jobId) {{
     await apiJson(`/api/jobs/${{jobId}}/cancel`, {{ method: 'POST' }});
     await refreshJobs();
   }} catch (err) {{
-    alert(`Cancel failed: ${{err.message}}`);
   }}
 }}
 document.getElementById('jobForm').addEventListener('submit', async (event) => {{
   event.preventDefault();
   const status = document.getElementById('jobStatus');
-  status.textContent = 'Submitting...';
   const formData = new FormData(event.target);
   try {{
     const created = await apiJson('/api/jobs', {{ method: 'POST', body: formData }});
-    status.textContent = `Job queued: ${{created.job.id}}`;
     event.target.reset();
     await refreshJobs();
   }} catch (err) {{
-    status.textContent = `Submit failed: ${{err.message}}`;
   }}
 }});
@@ -875,7 +894,9 @@ async def _startup() -> None:
     _worker_task = asyncio.create_task(_job_worker(), name="job-worker")
     if not OPENAI_REAL_API_KEY:
-        logger.warning("OPENAI_API_KEY is empty, translation jobs will fail")
     logger.info("Gateway started. Data dir: %s", DATA_DIR)
@@ -935,7 +956,7 @@ async def login(
         return resp
     logger.warning("Login failed: %s", username)
-    return HTMLResponse(_login_page("Invalid username or password."), status_code=401)
 @app.get("/logout")
@@ -993,15 +1014,11 @@ async def api_create_job(
     file: UploadFile = File(...),
     lang_in: str = Form(DEFAULT_LANG_IN),
     lang_out: str = Form(DEFAULT_LANG_OUT),
-    model: str = Form(DEFAULT_MODEL),
     username: str = Depends(_require_user),
 ) -> dict[str, Any]:
     filename = file.filename or "input.pdf"
     if not filename.lower().endswith(".pdf"):
-        raise HTTPException(status_code=400, detail="Only PDF file is allowed")
-    if not model.strip():
-        raise HTTPException(status_code=400, detail="Model is required")
     job_id = uuid.uuid4().hex
     safe_filename = Path(filename).name
@@ -1036,7 +1053,7 @@ async def api_create_job(
             0.0,
             "Queued",
             None,
-            model.strip(),
             lang_in.strip() or DEFAULT_LANG_IN,
             lang_out.strip() or DEFAULT_LANG_OUT,
             0,
@@ -1193,6 +1210,154 @@ def _require_localhost(request: Request) -> None:
         raise HTTPException(status_code=403, detail="Internal endpoint only")
 @app.post("/internal/openai/v1/chat/completions")
 async def internal_openai_chat_completions(request: Request) -> Response:
     _require_localhost(request)
@@ -1202,9 +1367,6 @@ async def internal_openai_chat_completions(request: Request) -> Response:
     if not username:
         raise HTTPException(status_code=401, detail="Invalid internal API key")
-    if not OPENAI_REAL_API_KEY:
-        raise HTTPException(status_code=500, detail="OPENAI_API_KEY is not configured")
     try:
         payload = await request.json()
     except json.JSONDecodeError as exc:
@@ -1216,6 +1378,26 @@ async def internal_openai_chat_completions(request: Request) -> Response:
     if _http_client is None:
         raise HTTPException(status_code=500, detail="HTTP client is not ready")
     headers = {
         "Authorization": f"Bearer {OPENAI_REAL_API_KEY}",
         "Content-Type": "application/json",
@@ -1231,9 +1413,8 @@ async def internal_openai_chat_completions(request: Request) -> Response:
         logger.error("Upstream OpenAI call failed: %s", exc)
         raise HTTPException(status_code=502, detail="Upstream OpenAI request failed") from exc
-    content_type = upstream.headers.get("content-type", "")
     response_json: dict[str, Any] | None = None
     if "application/json" in content_type.lower():
         try:
             response_json = upstream.json()
@@ -1246,7 +1427,6 @@ async def internal_openai_chat_completions(request: Request) -> Response:
         completion_tokens = int(usage.get("completion_tokens") or 0)
         total_tokens = int(usage.get("total_tokens") or (prompt_tokens + completion_tokens))
-        model = str(response_json.get("model") or payload.get("model") or "unknown")
         job_id = _active_job_by_user.get(username)
         _record_usage(

 )
 OPENAI_REAL_API_KEY = os.environ.get("OPENAI_API_KEY", "").strip()
+FIXED_TRANSLATION_MODEL = "SiliconFlowFree"
 DEFAULT_LANG_IN = os.environ.get("DEFAULT_LANG_IN", "en").strip()
 DEFAULT_LANG_OUT = os.environ.get("DEFAULT_LANG_OUT", "zh").strip()
 TRANSLATION_QPS = int(os.environ.get("TRANSLATION_QPS", "4"))
 INTERNAL_KEY_SALT = (os.environ.get("INTERNAL_KEY_SALT") or SECRET_KEY).strip()
+# Model routing table: model name -> upstream config
+MODEL_ROUTE_TABLE: dict[str, dict[str, Any]] = {
+    "SiliconFlowFree": {
+        "route_type": "chatproxy",
+        "base_urls": [
+            "https://api1.pdf2zh-next.com/chatproxy",
+            "https://api2.pdf2zh-next.com/chatproxy",
+        ],
+        "api_key": "",
+    }
+}
 # Price unit: USD / 1M tokens
 DEFAULT_INPUT_PRICE_PER_1M = float(
     os.environ.get("OPENAI_DEFAULT_INPUT_PRICE_PER_1M", "0.15")
 # ── Page templates ────────────────────────────────────────────────────────────
 _LOGIN_HTML = """\
 <!DOCTYPE html>
+<html lang="zh-CN">
 <head>
 <meta charset="UTF-8">
 <meta name="viewport" content="width=device-width, initial-scale=1.0">
+<title>登录</title>
 <style>
   *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
   body {
 </head>
 <body>
 <div class="card">
+  <h1>欢迎回来</h1>
+  <p class="sub">请先登录后继续</p>
   __ERROR_BLOCK__
   <form method="post" action="/login">
+    <label for="u">用户名</label>
     <input id="u" type="text" name="username" autocomplete="username" required autofocus>
+    <label for="p">密码</label>
     <input id="p" type="password" name="password" autocomplete="current-password" required>
+    <button type="submit">登录</button>
   </form>
 </div>
 </body>
 def _dashboard_page(username: str) -> str:
     safe_user = html.escape(username)
     safe_lang_in = html.escape(DEFAULT_LANG_IN)
     safe_lang_out = html.escape(DEFAULT_LANG_OUT)
     return f"""<!DOCTYPE html>
+<html lang="zh-CN">
 <head>
   <meta charset="UTF-8" />
   <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+  <title>PDF 翻译控制台</title>
   <style>
     :root {{
       --bg: #f4f7fb;
 <div class="wrap">
   <div class="top">
     <div>
+      <h1>PDF 翻译控制台</h1>
+      <div class="user">当前用户：<strong>{safe_user}</strong></div>
     </div>
+    <div><a href="/logout"><button class="muted">退出登录</button></a></div>
   </div>
   <div class="grid">
     <section class="card">
+      <h2>新建任务</h2>
       <form id="jobForm">
+        <label>PDF 文件</label>
         <input name="file" type="file" accept=".pdf" required />
         <div class="row">
           <div>
+            <label>源语言</label>
             <input name="lang_in" type="text" value="{safe_lang_in}" required />
           </div>
           <div>
+            <label>目标语言</label>
             <input name="lang_out" type="text" value="{safe_lang_out}" required />
           </div>
         </div>
         <div style="margin-top: 12px;">
+          <button class="primary" type="submit">提交任务</button>
         </div>
       </form>
+      <div class="hint">模型由后台固定为 SiliconFlowFree，用户无需选择。</div>
       <div id="jobStatus" class="status"></div>
     </section>
     <section class="card">
+      <h2>我的账单</h2>
+      <div id="billingSummary" class="mono">加载中...</div>
       <table>
         <thead>
           <tr>
+            <th>时间 (UTC)</th>
+            <th>模型</th>
+            <th>输入</th>
+            <th>输出</th>
+            <th>总计</th>
+            <th>费用 (USD)</th>
           </tr>
         </thead>
         <tbody id="billingBody"></tbody>
   </div>
   <section class="card" style="margin-top: 14px;">
+    <h2>我的任务</h2>
     <table>
       <thead>
         <tr>
           <th>ID</th>
+          <th>文件</th>
+          <th>状态</th>
+          <th>进度</th>
+          <th>模型</th>
+          <th>更新时间 (UTC)</th>
+          <th>操作</th>
         </tr>
       </thead>
       <tbody id="jobsBody"></tbody>
     </table>
   </section>
+  <div class="foot">内部 OpenAI 接口仅允许 localhost 访问，不会直接暴露给终端用户。</div>
 </div>
 <script>
   const rows = await apiJson('/api/billing/me/records?limit=20');
   document.getElementById('billingSummary').textContent =
+    `总 tokens=${{summary.total_tokens}} | 总费用(USD)=${{Number(summary.total_cost_usd).toFixed(6)}}`;
   const body = document.getElementById('billingBody');
   body.innerHTML = '';
 function actionButtons(job) {{
   const actions = [];
   if (job.status === 'queued' || job.status === 'running') {{
+    actions.push(`<button class="danger" onclick="cancelJob('${{job.id}}')">取消</button>`);
   }}
   if (job.artifact_urls?.mono) {{
+    actions.push(`<a href="${{job.artifact_urls.mono}}"><button class="muted">单语版</button></a>`);
   }}
   if (job.artifact_urls?.dual) {{
+    actions.push(`<a href="${{job.artifact_urls.dual}}"><button class="muted">双语版</button></a>`);
   }}
   if (job.artifact_urls?.glossary) {{
+    actions.push(`<a href="${{job.artifact_urls.glossary}}"><button class="muted">术语表</button></a>`);
   }}
   return actions.join(' ');
 }}
+function statusText(status) {{
+  const statusMap = {{
+    queued: '排队中',
+    running: '进行中',
+    succeeded: '成功',
+    failed: '失败',
+    cancelled: '已取消'
+  }};
+  return statusMap[status] || status;
+}}
 async function refreshJobs() {{
   const data = await apiJson('/api/jobs?limit=50');
   const body = document.getElementById('jobsBody');
     tr.innerHTML = `
       <td class="mono">${{esc(job.id)}}</td>
       <td>${{esc(job.filename)}}</td>
+      <td>${{esc(statusText(job.status))}}${{job.error ? ' / ' + esc(job.error) : ''}}</td>
       <td>${{Number(job.progress).toFixed(1)}}%</td>
       <td class="mono">${{esc(job.model)}}</td>
       <td class="mono">${{esc(job.updated_at)}}</td>
     await apiJson(`/api/jobs/${{jobId}}/cancel`, {{ method: 'POST' }});
     await refreshJobs();
   }} catch (err) {{
+    alert(`取消失败: ${{err.message}}`);
   }}
 }}
 document.getElementById('jobForm').addEventListener('submit', async (event) => {{
   event.preventDefault();
   const status = document.getElementById('jobStatus');
+  status.textContent = '提交中...';
   const formData = new FormData(event.target);
   try {{
     const created = await apiJson('/api/jobs', {{ method: 'POST', body: formData }});
+    status.textContent = `任务已入队: ${{created.job.id}}`;
     event.target.reset();
     await refreshJobs();
   }} catch (err) {{
+    status.textContent = `提交失败: ${{err.message}}`;
   }}
 }});
     _worker_task = asyncio.create_task(_job_worker(), name="job-worker")
     if not OPENAI_REAL_API_KEY:
+        logger.info(
+            "OPENAI_API_KEY is empty, non-routed OpenAI models will fail"
+        )
     logger.info("Gateway started. Data dir: %s", DATA_DIR)
         return resp
     logger.warning("Login failed: %s", username)
+    return HTMLResponse(_login_page("用户名或密码错误。"), status_code=401)
 @app.get("/logout")
     file: UploadFile = File(...),
     lang_in: str = Form(DEFAULT_LANG_IN),
     lang_out: str = Form(DEFAULT_LANG_OUT),
     username: str = Depends(_require_user),
 ) -> dict[str, Any]:
     filename = file.filename or "input.pdf"
     if not filename.lower().endswith(".pdf"):
+        raise HTTPException(status_code=400, detail="仅支持 PDF 文件")
     job_id = uuid.uuid4().hex
     safe_filename = Path(filename).name
             0.0,
             "Queued",
             None,
+            FIXED_TRANSLATION_MODEL,
             lang_in.strip() or DEFAULT_LANG_IN,
             lang_out.strip() or DEFAULT_LANG_OUT,
             0,
         raise HTTPException(status_code=403, detail="Internal endpoint only")
+def _extract_text_from_message_content(content: Any) -> str:
+    if isinstance(content, str):
+        return content
+    if not isinstance(content, list):
+        return ""
+    parts: list[str] = []
+    for item in content:
+        if not isinstance(item, dict):
+            continue
+        if item.get("type") != "text":
+            continue
+        text = item.get("text")
+        if isinstance(text, str):
+            parts.append(text)
+    return "".join(parts)
+def _extract_text_from_messages(messages: Any) -> str:
+    if not isinstance(messages, list):
+        raise HTTPException(status_code=400, detail="messages must be a list")
+    for message in reversed(messages):
+        if not isinstance(message, dict):
+            continue
+        if message.get("role") != "user":
+            continue
+        text = _extract_text_from_message_content(message.get("content"))
+        if text:
+            return text
+    for message in reversed(messages):
+        if not isinstance(message, dict):
+            continue
+        text = _extract_text_from_message_content(message.get("content"))
+        if text:
+            return text
+    raise HTTPException(status_code=400, detail="messages does not contain text content")
+def _should_request_json_mode(payload: dict[str, Any]) -> bool:
+    response_format = payload.get("response_format")
+    if not isinstance(response_format, dict):
+        return False
+    return response_format.get("type") == "json_object"
+def _build_openai_compatible_response(model: str, content: str) -> dict[str, Any]:
+    return {
+        "id": f"chatcmpl-{uuid.uuid4().hex}",
+        "object": "chat.completion",
+        "created": int(datetime.now(timezone.utc).timestamp()),
+        "model": model,
+        "choices": [
+            {
+                "index": 0,
+                "message": {
+                    "role": "assistant",
+                    "content": content,
+                },
+                "finish_reason": "stop",
+            }
+        ],
+        "usage": {
+            "prompt_tokens": 0,
+            "completion_tokens": 0,
+            "total_tokens": 0,
+        },
+    }
+async def _forward_to_chatproxy(
+    payload: dict[str, Any],
+    model: str,
+    route: dict[str, Any],
+) -> dict[str, Any]:
+    if _http_client is None:
+        raise HTTPException(status_code=500, detail="HTTP client is not ready")
+    base_urls = route.get("base_urls", [])
+    if not isinstance(base_urls, list) or not base_urls:
+        raise HTTPException(status_code=500, detail=f"No upstream configured for model {model}")
+    request_json = {
+        "text": _extract_text_from_messages(payload.get("messages")),
+    }
+    if _should_request_json_mode(payload):
+        request_json["requestJsonMode"] = True
+    api_key = str(route.get("api_key") or "").strip()
+    headers = {"Content-Type": "application/json"}
+    if api_key:
+        headers["Authorization"] = f"Bearer {api_key}"
+    last_error = "No available upstream"
+    for base_url in base_urls:
+        try:
+            upstream = await _http_client.post(
+                str(base_url),
+                headers=headers,
+                json=request_json,
+            )
+        except httpx.HTTPError as exc:
+            last_error = str(exc)
+            logger.warning("chatproxy call failed: model=%s url=%s error=%s", model, base_url, exc)
+            continue
+        if upstream.status_code >= 400:
+            last_error = f"status={upstream.status_code}"
+            logger.warning(
+                "chatproxy upstream returned error: model=%s url=%s status=%s",
+                model,
+                base_url,
+                upstream.status_code,
+            )
+            continue
+        try:
+            body = upstream.json()
+        except Exception as exc:  # noqa: BLE001
+            last_error = f"invalid json response: {exc}"
+            logger.warning(
+                "chatproxy upstream returned invalid json: model=%s url=%s",
+                model,
+                base_url,
+            )
+            continue
+        content = body.get("content")
+        if not isinstance(content, str):
+            last_error = "missing content field"
+            logger.warning(
+                "chatproxy upstream missing content: model=%s url=%s body=%s",
+                model,
+                base_url,
+                body,
+            )
+            continue
+        return _build_openai_compatible_response(model=model, content=content)
+    raise HTTPException(
+        status_code=502,
+        detail=f"All chatproxy upstreams failed for model {model}: {last_error}",
+    )
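
Review note: to make the request shaping in `_forward_to_chatproxy` concrete, the diff sends a single `text` string upstream, taken from the most recent `user` message and falling back to the most recent message of any role that has text. The following is a condensed sketch of that selection, not the author's code. `last_text` and `_text_of` are hypothetical names, and the sketch raises `ValueError` where the gateway raises `HTTPException`.

```python
from typing import Any


def _text_of(content: Any) -> str:
    """Flatten a chat message's content (str or list of parts) to plain text."""
    if isinstance(content, str):
        return content
    if not isinstance(content, list):
        return ""
    # Keep only parts shaped like {"type": "text", "text": "..."}.
    return "".join(
        part["text"]
        for part in content
        if isinstance(part, dict)
        and part.get("type") == "text"
        and isinstance(part.get("text"), str)
    )


def last_text(messages: list[Any]) -> str:
    """Pick the newest user text; if there is none, the newest text of any role."""
    for user_only in (True, False):
        for message in reversed(messages):
            if not isinstance(message, dict):
                continue
            if user_only and message.get("role") != "user":
                continue
            text = _text_of(message.get("content"))
            if text:
                return text
    raise ValueError("messages does not contain text content")
```

One consequence worth checking during review: whenever a `user` message with text exists, a `system` prompt in the same request is not forwarded on this route, because only the selected string goes into the `text` field.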
 @app.post("/internal/openai/v1/chat/completions")
 async def internal_openai_chat_completions(request: Request) -> Response:
     _require_localhost(request)
     if not username:
         raise HTTPException(status_code=401, detail="Invalid internal API key")
     try:
         payload = await request.json()
     except json.JSONDecodeError as exc:
     if _http_client is None:
         raise HTTPException(status_code=500, detail="HTTP client is not ready")
+    model = str(payload.get("model") or "").strip()
+    if not model:
+        raise HTTPException(status_code=400, detail="model is required")
+    route = MODEL_ROUTE_TABLE.get(model)
+    if route and route.get("route_type") == "chatproxy":
+        response_json = await _forward_to_chatproxy(payload=payload, model=model, route=route)
+        _record_usage(
+            username=username,
+            job_id=_active_job_by_user.get(username),
+            model=model,
+            prompt_tokens=0,
+            completion_tokens=0,
+            total_tokens=0,
+        )
+        return JSONResponse(response_json, status_code=200)
+    if not OPENAI_REAL_API_KEY:
+        raise HTTPException(status_code=500, detail="OPENAI_API_KEY is not configured")
     headers = {
         "Authorization": f"Bearer {OPENAI_REAL_API_KEY}",
         "Content-Type": "application/json",
         logger.error("Upstream OpenAI call failed: %s", exc)
         raise HTTPException(status_code=502, detail="Upstream OpenAI request failed") from exc
     response_json: dict[str, Any] | None = None
+    content_type = upstream.headers.get("content-type", "")
     if "application/json" in content_type.lower():
         try:
             response_json = upstream.json()
         completion_tokens = int(usage.get("completion_tokens") or 0)
         total_tokens = int(usage.get("total_tokens") or (prompt_tokens + completion_tokens))
         job_id = _active_job_by_user.get(username)
         _record_usage(