
Lily LLM API - Hugging Face Spaces

🤖 Introduction

Lily LLM API is a high-performance AI API server with multi-model support and a RAG (Retrieval Augmented Generation) system.

✨ Key Features

  • 🧠 Multimodal AI: text and image processing with the Kanana-1.5-v-3b-instruct model
  • 📚 RAG system: document-based question answering and context retrieval
  • 🔍 Vector search: fast FAISS-based similarity search
  • 📄 Document processing: supports PDF, DOCX, TXT, and other document formats
  • 🖼️ Image OCR: math formula recognition with LaTeX-OCR
  • ⚡ Asynchronous processing: Celery-based background tasks
  • 🌐 RESTful API: high-performance web API built on FastAPI

🚀 Usage

1. Text generation

import requests

response = requests.post(
    "https://your-space-url/generate",
    data={"prompt": "Hello! How is the weather today?"}
)
print(response.json())

2. Query with an image

import requests

with open("image.jpg", "rb") as f:
    response = requests.post(
        "https://your-space-url/generate",
        data={"prompt": "What can you see in the image?"},
        files={"image1": f}
    )
print(response.json())

3. RAG-based question answering

import requests

# Upload a document
with open("document.pdf", "rb") as f:
    upload_response = requests.post(
        "https://your-space-url/upload-document",
        files={"file": f},
        data={"user_id": "your_user_id"}
    )

document_id = upload_response.json()["document_id"]

# RAG query
response = requests.post(
    "https://your-space-url/rag-query",
    json={
        "query": "What is the main content of the document?",
        "user_id": "your_user_id",
        "document_id": document_id
    }
)
print(response.json())

📋 API Endpoints

Basic endpoints

  • GET /health - Check server status
  • GET /models - List available models
  • POST /load-model - Load a model
  • POST /generate - Generate a response from text and optional image input

RAG system

  • POST /upload-document - Upload a document
  • POST /rag-query - RAG-based query
  • GET /documents/{user_id} - List a user's documents
  • DELETE /document/{document_id} - Delete a document
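The two document endpoints above take their identifiers as path parameters, so values containing spaces or slashes need URL-encoding before they are placed in the path. A minimal sketch, assuming the placeholder base URL from the usage examples; `documents_url` and `document_url` are hypothetical helper names, not part of the API:

```python
from urllib.parse import quote

# Placeholder base URL, as used in the usage examples of this README.
BASE_URL = "https://your-space-url"


def documents_url(user_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /documents/{user_id} with the ID percent-encoded."""
    return f"{base_url.rstrip('/')}/documents/{quote(str(user_id), safe='')}"


def document_url(document_id, base_url: str = BASE_URL) -> str:
    """Build the URL for DELETE /document/{document_id} with the ID percent-encoded."""
    return f"{base_url.rstrip('/')}/document/{quote(str(document_id), safe='')}"
```

With `requests`, `requests.get(documents_url("your_user_id"), timeout=30)` would then list a user's documents and `requests.delete(document_url(document_id), timeout=30)` would delete one. The response formats are not documented here, so inspect `response.json()` before relying on specific fields.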

Advanced features

  • POST /batch-process - Batch document processing
  • GET /task-status/{task_id} - Check task status
  • POST /cancel-task/{task_id} - Cancel a task
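Background tasks are usually consumed by polling the status endpoint until the task reaches a final state. The sketch below shows one way to do that. It assumes the status response is a JSON object with a `status` field holding Celery-style state names (`SUCCESS`, `FAILURE`, `REVOKED`); this README does not document the response format, so adjust the field and state names to what the server actually returns. `wait_for_task` is a hypothetical helper that takes the HTTP call as a function, which keeps it independent of any particular client library.

```python
import time
from typing import Callable

# Assumed final states, borrowed from Celery's state names; the actual
# values returned by /task-status/{task_id} are not documented here.
FINAL_STATES = ("SUCCESS", "FAILURE", "REVOKED")


def wait_for_task(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 300.0,
) -> dict:
    """Call fetch_status() repeatedly until the task reports a final state.

    Returns the last status payload, or raises TimeoutError if the task
    is still running after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status()
        if payload.get("status") in FINAL_STATES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish within the timeout")
        time.sleep(interval)
```

With `requests`, the status call could be passed in as `lambda: requests.get(f"https://your-space-url/task-status/{task_id}", timeout=10).json()`, where `task_id` is whatever identifier the batch endpoint returned.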

🛠️ Tech Stack

  • Backend: FastAPI, Python 3.11
  • AI Models: Transformers, PyTorch
  • Vector DB: FAISS, ChromaDB
  • Task Queue: Celery, Redis
  • OCR: LaTeX-OCR, EasyOCR
  • Document Processing: LangChain

📊 Model Information

Kanana-1.5-v-3b-instruct

  • Size: 3.6B parameters
  • Language: specialized for Korean
  • Capabilities: text generation, image understanding
  • Context: up to 4096 tokens
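The context limit interacts with the generation length: in decoder-only models the prompt and the generated tokens typically share one context window. Assuming that holds for this server (this README does not say so, and image inputs would consume tokens as well), the room left for the prompt can be estimated as below. `prompt_budget` is a hypothetical helper, and 256 is the `MAX_NEW_TOKENS` value listed in this README's settings.

```python
CONTEXT_LIMIT = 4096   # maximum context listed for Kanana-1.5-v-3b-instruct
MAX_NEW_TOKENS = 256   # MAX_NEW_TOKENS value from this README's settings


def prompt_budget(
    context_limit: int = CONTEXT_LIMIT,
    max_new_tokens: int = MAX_NEW_TOKENS,
) -> int:
    """Tokens left for the prompt after reserving room for generated output."""
    if max_new_tokens >= context_limit:
        raise ValueError("max_new_tokens must be smaller than the context limit")
    return context_limit - max_new_tokens


print(prompt_budget())  # 3840
```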

🔧 Configuration

The following settings can be adjusted through environment variables:

# Server settings
HOST=0.0.0.0
PORT=7860

# Model settings
DEFAULT_MODEL=kanana-1.5-v-3b-instruct
MAX_NEW_TOKENS=256
TEMPERATURE=0.7

# Cache settings
TRANSFORMERS_CACHE=/app/cache/transformers
HF_HOME=/app/cache/huggingface
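How the server reads these variables is not shown in this README. As an illustration only, the sketch below loads the server and model settings with the values above as fallbacks and converts them to the expected types. `Settings` and `load_settings` are hypothetical names. The two cache variables are left out because they are read by the Hugging Face libraries themselves.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    default_model: str
    max_new_tokens: int
    temperature: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from `env` (os.environ by default).

    Missing variables fall back to the values listed in this README.
    """
    env = os.environ if env is None else env
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "7860")),
        default_model=env.get("DEFAULT_MODEL", "kanana-1.5-v-3b-instruct"),
        max_new_tokens=int(env.get("MAX_NEW_TOKENS", "256")),
        temperature=float(env.get("TEMPERATURE", "0.7")),
    )
```

Calling `load_settings()` with no argument reads the process environment; passing a plain dict, as in `load_settings({"PORT": "8000"})`, makes the function easy to exercise in tests.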

📝 License

This project is distributed under the MIT License.

🤝 Contributing

Bug reports, feature suggestions, and pull requests are welcome!

📞 Support

If you have any questions, please reach out through GitHub Issues.