---
license: apache-2.0
language:
  - ko
  - en
library_name: transformers
tags:
  - finance
  - korean
  - stock-analysis
  - reasoning
  - dpo
  - gguf
  - llama-cpp
  - mlx
base_model: Qwen/Qwen2.5-7B-Instruct
pipeline_tag: text-generation
---

# VELA (Vector-Encoded Learning Agent)


## Version History

| Version | Date | Changes |
|---------|------|---------|
| v1.2 | 2026-02-16 | SFT v3 (58K): 12 Gap Fill categories, Markdown RT, benchmarks added |
| v1.1 | 2026-02-12 | GGUF quantized models added (Q4_K_M, Q8_0) |
| v1.0 | 2026-01-28 | DPO merged; Chinese/English leaks fixed |
| v0.9 | 2026-01-15 | SFT base model released |

ํ•œ๊ตญ ์ฃผ์‹์‹œ์žฅ ์ „๋ฌธ AI ์• ๋„๋ฆฌ์ŠคํŠธ

VELA๋Š” ํ•œ๊ตญ ์ฃผ์‹์‹œ์žฅ ๋‰ด์Šค ๋ถ„์„ ๋ฐ ํˆฌ์ž ๋ฆฌ์„œ์น˜๋ฅผ ์œ„ํ•ด ํŠนํ™”๋œ 7B ํŒŒ๋ผ๋ฏธํ„ฐ ์–ธ์–ด ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. KOSPI/KOSDAQ 2,135๊ฐœ ์ข…๋ชฉ์— ๋Œ€ํ•œ ๋‰ด์Šค ์˜ํ–ฅ ๋ถ„์„, ์ฆ๊ถŒ์‚ฌ ๋ฆฌํฌํŠธ ํ•ด์„, Reasoning Trace ๊ธฐ๋ฐ˜ ๊ตฌ์กฐํ™”๋œ ํˆฌ์ž ๋ถ„์„์„ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

์ง€์†์ ์ธ ๊ฐœ๋ฐœ๊ณผ ํ•™์Šต์œผ๋กœ ๊ฐœ์„ ํ•ด๋‚˜๊ฐ€๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. ์‚ฌ์šฉ์ƒ์˜ ์ด

58K+ SFT ์ƒ˜ํ”Œ๊ณผ 26K+ DPO ํŽ˜์–ด๋กœ ํ•™์Šตํ•˜์—ฌ, ํ•œ๊ตญ์–ด ๊ธˆ์œต ๋„๋ฉ”์ธ์—์„œ ์ •ํ™•ํ•˜๊ณ  ๊ตฌ์กฐํ™”๋œ ๋ถ„์„์„ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค.

Agent Framework: github.com/Intrect-io/vela-framework (`pip install vela-framework`)


## Model Details

| Item | Detail |
|------|--------|
| Base Model | Qwen/Qwen2.5-7B-Instruct |
| Training | SFT (58,206 samples) + DPO (26,421 pairs) |
| Parameters | 7.6B |
| Context Length | 8,192 tokens |
| RT Format | Markdown Reasoning Trace |
| Stock Coverage | 2,135 tickers (KOSPI + KOSDAQ) |
| License | Apache 2.0 |

## Available Formats

| Format | File | Size | Use Case |
|--------|------|------|----------|
| BF16 (safetensors) | model.safetensors | 15 GB | Full precision, GPU inference |
| GGUF Q8_0 | vela-q8_0.gguf | 7.6 GB | High-quality quantized, GPU/CPU |
| GGUF Q4_K_M | vela-q4_k_m.gguf | 4.4 GB | Fast & lightweight, GPU/CPU |

An MLX 4-bit quantized model (optimized for Apple Silicon) will also be released separately.

## Recommended Inference Settings

VELA's generation_config.json uses the same sampling parameters as the llama-cpp-python server. To guarantee consistent output quality across all backends, the following settings are recommended:

| Parameter | Value | Note |
|-----------|-------|------|
| temperature | 0.7 | Balance between creativity and consistency |
| top_k | 40 | Top 40 candidate tokens (llama.cpp default) |
| top_p | 0.95 | Nucleus sampling within 95% cumulative probability |
| repetition_penalty | 1.0 | Disabled (repetition handled in post-processing) |
| max_tokens | 1024-2048 | Adjust to analysis complexity |

์ฃผ์˜์‚ฌํ•ญ:

  • repetition_penalty โ‰ฅ 1.2๋Š” ์‚ฌ์šฉ ๊ธˆ์ง€ โ€” Qwen 7B ๊ธฐ๋ฐ˜ ๋ชจ๋ธ์—์„œ ์ค‘๊ตญ์–ด text leak ๋ฐ ํ™˜๊ฐ(hallucination)์„ ์œ ๋ฐœํ•ฉ๋‹ˆ๋‹ค
  • top_k < 20 ๋˜๋Š” top_p < 0.8์€ ์ถœ๋ ฅ ๋‹ค์–‘์„ฑ์„ ๊ณผ๋„ํ•˜๊ฒŒ ์ œํ•œํ•˜์—ฌ confidence ๊ณ ์ •(50%) ํ˜„์ƒ์„ ๋ฐœ์ƒ์‹œํ‚ต๋‹ˆ๋‹ค
  • ๋ฐ˜๋ณต ์ œ์–ด๋Š” repetition_penalty ๋Œ€์‹  ํ›„์ฒ˜๋ฆฌ ํŒŒ์ดํ”„๋ผ์ธ์œผ๋กœ ์ฒ˜๋ฆฌํ•˜๋Š” ๊ฒƒ์ด ์•ˆ์ •์ ์ž…๋‹ˆ๋‹ค
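Post-processing repetition control can be as simple as an n-gram check on the generated text, used to decide whether to truncate or regenerate. A minimal sketch; the n-gram size and repeat threshold are illustrative assumptions, not values from VELA's actual pipeline:

```python
def has_ngram_repetition(text: str, n: int = 8, max_repeats: int = 3) -> bool:
    """Return True if any n-gram of whitespace tokens occurs more than max_repeats times."""
    tokens = text.split()
    counts: dict[tuple, int] = {}
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        counts[gram] = counts.get(gram, 0) + 1
        if counts[gram] > max_repeats:
            return True  # degenerate loop detected; caller may regenerate
    return False
```

A caller would run this on each completion and retry (or cut the output at the first repeated span) when it returns True.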

### Backend-Specific Settings

llama-cpp-python / Ollama: no extra configuration needed (the server defaults match the recommended values):

```python
response = model.create_chat_completion(
    messages=messages,
    max_tokens=1024,
    temperature=0.7,
    # top_k, top_p, repeat_penalty: use server defaults
)
```

HuggingFace Transformers: generation_config.json is loaded automatically, so keep explicit parameters minimal:

```python
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    do_sample=True,
    # top_k, top_p, repetition_penalty are loaded from generation_config.json
)
```

vLLM: explicit settings recommended:

```python
params = SamplingParams(
    temperature=0.7,
    top_k=40,
    top_p=0.95,
    repetition_penalty=1.0,
    max_tokens=1024,
)
```

MLX: use the server defaults:

```python
generate(model, tokenizer, prompt=prompt, max_tokens=1024, temp=0.7)
```
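Since the same recommended values recur under different keyword names per backend, they can be centralized in one helper. This function is an illustrative convenience, not something shipped with VELA; the keyword names follow each backend's API:

```python
RECOMMENDED = {"temperature": 0.7, "top_k": 40, "top_p": 0.95,
               "repetition_penalty": 1.0, "max_tokens": 1024}

def sampling_kwargs(backend: str) -> dict:
    """Map the recommended settings onto a backend's keyword argument names."""
    if backend == "llama-cpp":
        # llama-cpp-python spells the penalty "repeat_penalty"
        return {"temperature": 0.7, "top_k": 40, "top_p": 0.95,
                "repeat_penalty": 1.0, "max_tokens": 1024}
    if backend == "transformers":
        # transformers uses max_new_tokens and needs do_sample=True
        return {"temperature": 0.7, "top_k": 40, "top_p": 0.95,
                "repetition_penalty": 1.0, "max_new_tokens": 1024,
                "do_sample": True}
    if backend == "vllm":
        # vllm's SamplingParams accepts the recommended names directly
        return dict(RECOMMENDED)
    raise ValueError(f"unknown backend: {backend}")
```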

## What Can VELA Do?

### 1. News Impact Analysis

Reasons step by step about how stock-related news affects share prices.

### 2. Reasoning Trace (Step-by-Step Thinking)

Generates a Markdown Reasoning Trace that makes the analysis process transparent:

```
**Step 1**:
**Thought**: I need to understand the technical significance of the news that Samsung Electronics' 3nm mass production succeeded.
**Action**: search
**Query**: Samsung Electronics 3nm foundry yield competitiveness TSMC
**Confidence**: 35%

**Step 2**:
**Thought**: The yield gap versus TSMC is the key variable; mass-production success proves the technology, but yield stabilization will take time.
**Action**: analyze
**Confidence**: 65%

**Step 3**:
**Thought**: Successful 3nm mass production is a mid-to-long-term positive signal, but short-term yield risk remains.
**Action**: conclude
**Confidence**: 80%
```

3. ์ฆ๊ถŒ์‚ฌ ๋ฆฌํฌํŠธ ํ•ด์„

์• ๋„๋ฆฌ์ŠคํŠธ ๋ฆฌํฌํŠธ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•ต์‹ฌ ํฌ์ธํŠธ์™€ ํˆฌ์ž ์‹œ์‚ฌ์ ์„ ๋„์ถœํ•ฉ๋‹ˆ๋‹ค.

4. ํˆฌ์ž ๋ฆฌ์„œ์น˜ ๋ฆฌํฌํŠธ

7๊ฐœ ์„น์…˜์œผ๋กœ ๊ตฌ์กฐํ™”๋œ ํˆฌ์ž ๋ถ„์„ ๋ณด๊ณ ์„œ๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค: Executive Summary / Key Metrics / ์‹œ์žฅ ๋™ํ–ฅ / ์ˆ˜๊ธ‰ ๋ถ„์„ / ๋‰ด์Šค ์˜ํ–ฅ / ๋ฆฌ์Šคํฌ / ํˆฌ์ž ์˜๊ฒฌ

5. ๋„๊ตฌ ํ˜ธ์ถœ (Tool Calling)

Search, Price, Investor ๋“ฑ ์™ธ๋ถ€ ๋„๊ตฌ์™€ ์—ฐ๋™ํ•˜๋Š” ๋ถ„์„์„ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.


## Training Pipeline

```
Qwen/Qwen2.5-7B-Instruct
        |
        v
   SFT (58,206 samples)
   ├── News classification analysis    10,830  (18.6%)
   ├── Extreme-signal analysis          9,603  (16.5%)
   ├── Securities reports (GPT-4o)      5,117   (8.8%)
   ├── News impact analysis             4,839   (8.3%)
   ├── Multi-turn dialogue              8,000  (13.8%)
   ├── Gap Fill (12 categories)        12,635  (21.7%)
   │   ├── Valuation analysis           2,000
   │   ├── Supply-demand/risk/EOD       3,000
   │   ├── Refusal/withhold responses   1,000
   │   ├── Concise analysis             1,000
   │   ├── Deep reasoning (5+ steps)    1,000
   │   └── Other (macro, sector, etc.)  4,635
   └── Other                            7,182  (12.3%)
        |
        v
   DPO (26,421 pairs)
   ├── Deduplicated base pairs         12,000  (45.4%)
   ├── Multilingual-leak reinforcement  5,997  (22.7%)
   ├── VELA ChatML alignment            5,000  (18.9%)
   ├── Insufficient-analysis fixes      1,642   (6.2%)
   ├── Chinese-leak fixes v2            1,216   (4.6%)
   └── Reasoning Trace alignment          566   (2.1%)
        |
        v
      VELA v1.2
```

## Training Data Details

### SFT v3 (58,206 samples)

| Source | Samples | Ratio | Description |
|--------|---------|-------|-------------|
| classified_news | 10,830 | 18.6% | GPT-4o-classified news Reasoning Traces |
| extreme_signals | 9,603 | 16.5% | Analysis of sharp rise/fall signal news |
| securities_report_gpt4o | 5,117 | 8.8% | Securities reports restructured with GPT-4o |
| analysis_news | 4,839 | 8.3% | General news impact analysis |
| multi_turn_2t | 4,000 | 6.9% | Single-turn analysis across varied tickers |
| multi_turn_4t | 4,000 | 6.9% | 2-turn follow-up dialogue |
| valuation | 2,000 | 3.4% | Valuation analysis (v3 Gap Fill) |
| tool_calling | 1,965 | 3.4% | Search/Price/Investor tool calls |
| supply_demand_ext | 1,000 | 1.7% | Extended supply-demand analysis (v3) |
| risk | 1,000 | 1.7% | Risk analysis (v3) |
| eod_report | 1,000 | 1.7% | End-of-day market reports (v3) |
| refusal | 1,000 | 1.7% | Refusal/withholding responses (v3) |
| short_analysis | 1,000 | 1.7% | Concise analysis under 500 chars (v3) |
| deep_reasoning | 1,000 | 1.7% | Deep reasoning, 5+ steps (v3) |
| low_confidence | 1,000 | 1.7% | Low-confidence analysis (v3) |
| macro_impact_ext | 1,000 | 1.7% | Extended macroeconomic analysis (v3) |
| sector_theme | 1,000 | 1.7% | Sector/theme analysis (v3) |
| multi_stock_comparison | 981 | 1.7% | Multi-ticker comparative analysis |
| earnings_impact | 971 | 1.7% | Earnings-release impact analysis |
| risk_alert | 948 | 1.6% | Risk warning analysis |
| null_impact | 900 | 1.5% | No-price-impact responses (v3) |
| Other | 2,050 | 3.5% | batch5 fallback; legacy supply-demand/sector/macro |

v3 added 12,635 samples across 12 categories to close data gaps. Generation cost: Perplexity Sonar 2K ($4.60) + OpenAI gpt-4o-mini Batch API 10.6K ($2.76) = ~$7.36.

### DPO v2 (26,421 pairs)

| Source | Pairs | Ratio | Rejection Type |
|--------|-------|-------|----------------|
| dpo_dedup | 12,000 | 45.4% | Short/low-quality responses vs. detailed analysis |
| multilingual_aug | 5,997 | 22.7% | Chinese/English leaks, short responses, low confidence |
| vela_chatml | 5,000 | 18.9% | Format errors, short responses |
| batch5_insuf_dpo | 1,642 | 6.2% | Fixes for insufficient analysis quality |
| chinese_leak_v2 | 1,216 | 4.6% | Focused fixes for Chinese character leaks |
| reasoning_trace_2k | 566 | 2.1% | English leaks, RT format errors |

### Data Version History

| Version | SFT Samples | DPO Pairs | Changes |
|---------|-------------|-----------|---------|
| v1.0 | 36,713 | 24,779 | Initial training data (JSON RT) |
| v1.1 | 36,713 | 24,779 | RT converted from JSON to Markdown |
| v2.0 | 45,571 | 26,421 | +8K multi-turn, +batch5 insufficient-analysis DPO |
| v3.0 | 58,206 | 26,421 | +Gap Fill, 12 categories, 12,635 samples |

## Benchmarks

### Quantization Benchmark (GGUF)

RTX 3060 12GB, llama-cpp-python, n_gpu_layers=-1, n_ctx=4096

| Format | Speed | Chinese Leak | Quality |
|--------|-------|--------------|---------|
| Q4_K_M | 36 tok/s | 0/5 CLEAN | RT + Report OK |
| Q8_0 | 25 tok/s | 0/5 CLEAN | RT + Report OK |

Stress test, 5 runs alternating Synthesis and 3K Reasoning Trace prompts: zero Chinese leaks in both formats.

### MLX Benchmark (Apple Silicon)

M1 Max 32GB, MLX 4-bit quantization

| Config | Quantization | Load Time | Speed | Memory |
|--------|--------------|-----------|-------|--------|
| MLX 4-bit | 4-bit (4.5 bpw) | 0.59s | 15.93 tok/s | 4.4 GB |
| PyTorch (CPU) | BF16 | 0.10s | 4.93 tok/s | 0.3 GB |
| PyTorch + LoRA (CPU) | BF16 | 1.64s | 4.22 tok/s | 14.1 GB |

MLX 4-bit vs PyTorch CPU:

  • 3.2x faster inference (15.93 vs 4.93 tok/s)
  • 73% smaller model size (4 GB vs 15 GB)
  • 68% less memory than PyTorch + LoRA (4.4 vs 14.1 GB)

### DPO Quality Improvements

| Metric | Before DPO | After DPO |
|--------|------------|-----------|
| Chinese leak | Frequent | 0/10 CLEAN |
| English leak | Occasional | Minimal |
| RT format compliance | ~80% | ~98% |
| Korean fluency | Good | Excellent |

## Usage

### llama-cpp-python (Recommended for GGUF)

```python
from llama_cpp import Llama

model = Llama(
    model_path="vela-q4_k_m.gguf",  # or vela-q8_0.gguf
    n_ctx=4096,
    n_gpu_layers=-1,
    chat_format="chatml",
)

response = model.create_chat_completion(
    messages=[
        # "You are a professional Korean stock analyst."
        {"role": "system", "content": "당신은 한국 주식 전문 애널리스트입니다."},
        # "Analyze the outlook for Samsung Electronics' HBM business."
        {"role": "user", "content": "삼성전자 HBM 사업 전망을 분석해주세요."},
    ],
    max_tokens=1024,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```

### Transformers (BF16)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "intrect/VELA",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("intrect/VELA")

messages = [
    # "You are a professional Korean stock analyst."
    {"role": "system", "content": "당신은 한국 주식 전문 애널리스트입니다."},
    # "Analyze the outlook for Samsung Electronics' HBM business."
    {"role": "user", "content": "삼성전자 HBM 사업 전망을 분석해주세요."},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### vLLM

```python
from vllm import LLM, SamplingParams

llm = LLM(model="intrect/VELA", dtype="bfloat16")
params = SamplingParams(temperature=0.7, max_tokens=1024)

outputs = llm.generate(
    # "Analyze the outlook for the Samsung Electronics HBM market."
    ["삼성전자 HBM 시장 전망을 분석해주세요."],
    params,
)
print(outputs[0].outputs[0].text)
```

### MLX (Apple Silicon)

```python
from mlx_lm import load, generate

model, tokenizer = load("intrect/VELA")  # or local MLX 4-bit path

response = generate(
    model,
    tokenizer,
    # "Analyze the Samsung Electronics 3nm mass-production news"
    prompt="삼성전자 3나노 양산 뉴스 분석",
    max_tokens=1024,
)
print(response)
```

### Ollama

```
# Modelfile
FROM ./vela-q4_k_m.gguf
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
```

## Output Format

VELA supports two output modes.

### 1. Reasoning Trace (Analysis Process)

Shows the step-by-step thinking process transparently in Markdown:

```
**Step 1**:
**Thought**: I check the news about Samsung Electronics' 12-layer HBM3E mass production. I need to establish order status and market share.
**Action**: search
**Query**: Samsung Electronics HBM3E 12-layer orders market share
**Confidence**: 45%

**Step 2**:
**Thought**: I analyze the trend of Samsung Electronics expanding HBM market share relative to SK hynix.
**Action**: analyze
**Confidence**: 70%

**Step 3**:
**Thought**: Successful HBM3E mass production is positive, but NVIDIA qualification-delay risk remains.
**Action**: conclude
**Confidence**: 82%
```
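Because the trace follows a fixed Markdown pattern (Step / Thought / Action / optional Query / Confidence), downstream code can recover each step with a regular expression. A minimal parser sketch; the function is illustrative, not part of the model or its tooling:

```python
import re

STEP_RE = re.compile(
    r"\*\*Step (\d+)\*\*:\s*\n"
    r"\*\*Thought\*\*: (?P<thought>.*?)\n"
    r"\*\*Action\*\*: (?P<action>\w+)\n"
    r"(?:\*\*Query\*\*: (?P<query>.*?)\n)?"   # Query only appears for search steps
    r"\*\*Confidence\*\*: (?P<conf>\d+)%",
    re.DOTALL,
)

def parse_trace(trace: str) -> list[dict]:
    """Extract step number, action, optional query, and confidence per step."""
    return [
        {"step": int(m.group(1)), "action": m.group("action"),
         "query": m.group("query"), "confidence": int(m.group("conf")) / 100}
        for m in STEP_RE.finditer(trace)
    ]
```

An agent wrapper could use the parsed `action`/`query` fields to decide which external tool to invoke next.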

### 2. Synthesis Report (Final Report)

An investment analysis report structured into 7 sections:

```
# Analysis Report: Samsung Electronics (005930.KS)

## Executive Summary
[2-3 sentence key summary]

## Key Metrics
| Metric | Value |
|--------|-------|
| Current price | 181,200 KRW |
| 12M FWD PER | 15.4x |
| PBR | 3.13x |

## Market Trend Analysis
## Supply-Demand Analysis
## News Impact Analysis
## Risk Factors
## Investment Opinion
```

## Example Output

Below is a sample response actually produced by the VELA model (from the SFT training data).

Query: "Analyze the impact of the LGU+ Open RAN commercial-network news on the share price of CS (065770)."

### Reasoning Trace

```
**Step 1**:
**Thought**: CS is a telecom-equipment manufacturer, so the LGU+ Open RAN commercial-network news could benefit it directly. I need to establish the cooperation between 'Samji Electronics and Nokia' mentioned in the news and what the technical demonstration means, and confirm CS's Open RAN positioning.
**Action**: search
**Query**: CS 065770 Open RAN 5G telecom equipment business
**Confidence**: 40%

**Step 2**:
**Thought**: The article states that 'Samji Electronics and Nokia' built the equipment. It is essential to distinguish whether CS participated directly in this project or is only an indirect beneficiary. The scale of the LGU+ Open RAN demonstration project and its commercialization timeline also matter.
**Action**: search
**Query**: LGU+ Open RAN demonstration site Samji Electronics CS participation
**Confidence**: 50%

**Step 3**:
**Thought**: The news does not mention CS directly, so its position in the Open RAN value chain is unclear. Still, as a telecom-equipment maker, Open RAN market expansion is a long-term growth opportunity. The actual 1-day/3-day returns of +0.73% and +0.87% show only a mild rise, so this reads as positive sector sentiment rather than a direct news effect.
**Action**: analyze
**Confidence**: 65%
```
### Quick Assessment

```json
{
  "category": "product_innovation",
  "sentiment": "bullish",
  "impact": "medium",
  "confidence": 0.68,
  "reasoning": "The LGU+ Open RAN commercial network signals technological advancement in the domestic telecom-equipment industry, but CS's direct participation is not stated in the news. Assessed as a mild rise driven by indirect-beneficiary potential and positive sector sentiment."
}
```
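Since the quick assessment is plain JSON, an application can consume it directly with a standard parser. A small sketch, with the reasoning text abbreviated; the `is_actionable` rule and its 0.6 threshold are illustrative assumptions, not part of VELA:

```python
import json

# Abbreviated quick-assessment payload in the format shown above
raw = """{
  "category": "product_innovation",
  "sentiment": "bullish",
  "impact": "medium",
  "confidence": 0.68,
  "reasoning": "Indirect beneficiary; positive sector sentiment."
}"""

assessment = json.loads(raw)
# Example downstream rule: act only on bullish calls with confidence >= 0.6
is_actionable = assessment["sentiment"] == "bullish" and assessment["confidence"] >= 0.6
```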
### Analysis Report

1. Executive Summary

LG Uplus building a commercial Open RAN network is a positive signal of 5G-to-6G technological advancement in domestic telecom infrastructure. The news names Samji Electronics and Nokia as the equipment builders but does not disclose whether CS participated directly. Still, as a telecom-equipment manufacturer, CS could be a long-term beneficiary of domestic Open RAN market expansion. The actual price reaction (+0.73% over 1 day, +0.87% over 3 days) was limited, suggesting the news was priced as positive sentiment for the sector rather than for the individual stock. Overall confidence: 0.68 (MEDIUM)

2. Key Metrics

| Metric | Value | Assessment |
|--------|-------|------------|
| 1-day return | +0.73% | Mild rise |
| 3-day return | +0.87% | Sustained mild rise |
| News impact | Medium | Sector impact > individual-stock impact |
| Direct mention | None | CS not named |
| Technology category | Open RAN | Core 5G-6G technology |

3. Market Trend Analysis

Open RAN is a structural shift from closed, proprietary telecom-equipment systems to open standards. LGU+'s deployment on the Kumoh National Institute of Technology campus is a national demonstration project led by the National Information Society Agency (NIA) and the Ministry of Science and ICT, showing the government's commitment to building an Open RAN ecosystem.

The explicit claim of "service on par with the existing 5G network" means domestic Open RAN technology has reached global standards. This is a positive signal for the domestic telecom-equipment industry as a whole and underscores the need for technological upgrades at related companies such as CS.

4. Risk Factors

  • Uncertainty over CS's direct participation in Open RAN projects
  • Unconfirmed technology positioning versus competitors such as Samji Electronics and Nokia
  • Lack of visibility on Open RAN commercialization timelines and order flow

5. Investment Opinion

Open RAN market expansion is a mid-to-long-term positive, but a wait-and-see position is appropriate until CS's direct beneficiary path is confirmed. Reassess when CS discloses Open RAN-related orders or announces technology partnerships.


## Architecture

VELA works as a standalone LLM, but combining it with an agent system enables analysis grounded in real-time data:

```
[User query]
      |
      v
[Multi-source search] ─── DuckDuckGo (latest news)
      |          ├── Korea Investment & Securities KIS (price, PER, PBR, EPS, supply-demand)
      |          ├── FnGuide (business overview, financials)
      |          └── FAISS 317K (similar past news + price reactions)
      v
[Context injection] → System Prompt + search results
      |
      v
[VELA LLM] → Reasoning Trace (step-by-step)
      |
      v
[Research report]
```
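The flow above can be sketched as a thin retrieval-then-generate loop. Everything here is a placeholder: `search_news`, `fetch_quote`, and `run_vela` are hypothetical stubs standing in for the real DuckDuckGo/KIS/FnGuide/FAISS layers and the LLM call, not APIs from the vela-framework package:

```python
def search_news(query: str) -> list[str]:
    """Stub for the multi-source news search layer."""
    return [f"[news] headline about {query}"]

def fetch_quote(ticker: str) -> dict:
    """Stub for the KIS price/fundamentals lookup."""
    return {"ticker": ticker, "price": 0, "per": 0.0}

def run_vela(system: str, context: str, query: str) -> str:
    """Stub for the VELA LLM call (any backend from the Usage section)."""
    return f"**Step 1**:\n**Thought**: Using {len(context)} chars of context."

def analyze(query: str, ticker: str) -> str:
    # 1. multi-source retrieval
    context = "\n".join(search_news(query)) + "\n" + str(fetch_quote(ticker))
    # 2. context injection + 3. reasoning-trace generation
    return run_vela("You are a Korean equity analyst.", context, query)
```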

## Limitations

  • Real-time quotes: the model itself cannot access real-time data (an agent system is required)
  • Numeric hallucination: specific figures (prices, PER, etc.) need external verification
  • Context: the 8K token limit constrains long-document processing
  • Not investment advice: outputs are informational; investment decisions are your own judgment and responsibility

## Citation

```bibtex
@misc{vela2026,
  title={VELA: Vector-Encoded Learning Agent for Korean Stock Analysis},
  author={intrect},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/intrect/VELA}
}
```

Disclaimer: This model's output is not investment advice. All investment decisions must be made at your own judgment and responsibility.