Instructions to use pathcosmos/frankenstallm with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use pathcosmos/frankenstallm with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="pathcosmos/frankenstallm")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("pathcosmos/frankenstallm")
model = AutoModelForCausalLM.from_pretrained("pathcos mos/frankenstallm".replace(" ", ""))
```
- llama-cpp-python
How to use pathcosmos/frankenstallm with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="pathcosmos/frankenstallm",
    filename="gguf/frankenstallm-3b-Q4_K_M.gguf",
)

output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True,
)
print(output)
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use pathcosmos/frankenstallm with llama.cpp:
Install from brew
```shell
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf pathcosmos/frankenstallm:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf pathcosmos/frankenstallm:Q4_K_M
```
Install from WinGet (Windows)
```shell
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf pathcosmos/frankenstallm:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf pathcosmos/frankenstallm:Q4_K_M
```
Use pre-built binary
```shell
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf pathcosmos/frankenstallm:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf pathcosmos/frankenstallm:Q4_K_M
```
Build from source code
```shell
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf pathcosmos/frankenstallm:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf pathcosmos/frankenstallm:Q4_K_M
```
Use Docker
```shell
docker model run hf.co/pathcosmos/frankenstallm:Q4_K_M
```
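Once llama-server is running, any OpenAI-compatible client can talk to it. A minimal sketch using only the Python standard library, assuming the server's default port 8080 and its OpenAI-style /v1/chat/completions endpoint (adjust SERVER_URL if you start the server with a different --port):

```python
import json
import urllib.request

# Assumption: llama-server listens on port 8080 by default (override with --port).
SERVER_URL = "http://localhost:8080/v1/chat/completions"

def build_request(prompt: str, max_tokens: int = 512, temperature: float = 0.5) -> dict:
    """Build an OpenAI-style chat-completion payload for the local server."""
    return {
        "model": "pathcosmos/frankenstallm",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def chat(prompt: str) -> str:
    """POST the payload and return the assistant's reply text."""
    body = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        SERVER_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Once upon a time,"))
```

The same client works unchanged against the vLLM and SGLang servers below, since all three expose the OpenAI-compatible API (only the port differs).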
- LM Studio
- Jan
- vLLM
How to use pathcosmos/frankenstallm with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "pathcosmos/frankenstallm"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "pathcosmos/frankenstallm",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/pathcosmos/frankenstallm:Q4_K_M
```
- SGLang
How to use pathcosmos/frankenstallm with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "pathcosmos/frankenstallm" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "pathcosmos/frankenstallm",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "pathcosmos/frankenstallm" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "pathcosmos/frankenstallm",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Ollama
How to use pathcosmos/frankenstallm with Ollama:
```shell
ollama run hf.co/pathcosmos/frankenstallm:Q4_K_M
```
- Unsloth Studio
How to use pathcosmos/frankenstallm with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```shell
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# and search for pathcosmos/frankenstallm to start chatting.
```
Install Unsloth Studio (Windows)
```shell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# and search for pathcosmos/frankenstallm to start chatting.
```
Using HuggingFace Spaces for Unsloth
```shell
# No setup required.
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# and search for pathcosmos/frankenstallm to start chatting.
```
- Docker Model Runner
How to use pathcosmos/frankenstallm with Docker Model Runner:
```shell
docker model run hf.co/pathcosmos/frankenstallm:Q4_K_M
```
- Lemonade
How to use pathcosmos/frankenstallm with Lemonade:
Pull the model
```shell
# Download Lemonade from https://lemonade-server.ai/
lemonade pull pathcosmos/frankenstallm:Q4_K_M
```
Run and chat with the model
```shell
lemonade run user.frankenstallm-Q4_K_M
```
List all available models
```shell
lemonade list
```
Data Gap Analysis Report
Date: 2026-02-27 | Model: 3B parameter LLM
1. Current Data Inventory
1.1 Pretraining data (tokenized .bin)
| File | Size | Token count (uint16) |
|---|---|---|
| korean_train.bin | 17GB | 8.9B |
| korean_c4_train.bin | 15GB | 7.56B |
| korean_namuwiki_train.bin | 2.1GB | 1.08B |
| korean_wiki_train.bin | 500MB | 0.26B |
| train.bin (English) | 1.2GB | 0.60B |
| Total (tokenized) | ~35.8GB | ~18.4B tokens |
⚠️ korean_train.bin is very likely a merge of c4 + namuwiki + wiki → actual unique tokens are closer to ~9B
1.2 Untokenized raw data (korean_extra/)
| Source | Disk size | Est. tokens | Quality grade |
|---|---|---|---|
| CulturaX ko | 60GB | ~15B | B+ |
| HPLT ko | 23GB | ~5B | B |
| cc100 ko | 14GB | ~3.5B | C+ |
| OSCAR ko | 9.2GB | ~2.3B | B |
| korean_textbooks | 6.4GB | ~1.5B | A |
| korean_webtext | 4.2GB | ~1B | B+ |
| finepdfs_edu_ko | 2.9GB | ~0.7B | A- |
| namuwiki_extracted | 2.2GB | ~0.5B | A- |
| wikipedia_korean | 1.7GB | ~0.4B | A |
| kovast | 449MB | ~0.1B | B |
| Subtotal | ~124GB | ~30B | |
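The estimated-token column is consistent with a back-of-envelope ratio of roughly 0.24B tokens per GB of raw text (~124GB → ~30B). A minimal sketch of that conversion; the ratio is inferred from the table rather than stated in the report, and real values vary by source and tokenizer:

```python
# Heuristic implied by the table: ~30B tokens from ~124GB of raw Korean text.
# This ratio is an inference from the report's own numbers, not a measured value.
TOKENS_PER_GB = 30 / 124  # billions of tokens per GB

def estimate_tokens_b(disk_gb: float) -> float:
    """Estimate token count (in billions) from raw disk size in GB."""
    return disk_gb * TOKENS_PER_GB

# Disk sizes from the table above (GB).
sources = {
    "CulturaX ko": 60, "HPLT ko": 23, "cc100 ko": 14, "OSCAR ko": 9.2,
    "korean_textbooks": 6.4, "korean_webtext": 4.2, "finepdfs_edu_ko": 2.9,
    "namuwiki_extracted": 2.2, "wikipedia_korean": 1.7, "kovast": 0.449,
}
total_gb = sum(sources.values())
print(f"total: {total_gb:.1f}GB ≈ {estimate_tokens_b(total_gb):.1f}B tokens")
```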
1.3 SFT data
- train.jsonl: 161,848 samples (276MB)
- val.jsonl: 8,518 samples (15MB)
- Sources: evol_instruct_ko, korean_safe_conv, etc.
1.4 Preference data
- Currently held: 0 ❌
Grand total
| Stage | Amount held |
|---|---|
| Pretrain (tokenized) | ~9B tokens |
| Pretrain (unprocessed) | ~30B tokens |
| Pretrain total | ~39B tokens |
| SFT | 170K samples |
| Preference | 0 |
2. 3B Model Training Requirements vs. Current Holdings
2.1 Pretrain
| Criterion | Tokens required | Current | Gap | Status |
|---|---|---|---|---|
| Chinchilla optimal (×70) | 210B | 39B | -171B | 🔴 severely short |
| Chinchilla minimum (×20) | 60B | 39B | -21B | 🟡 short |
| LLaMA-style (×33) | 100B | 39B | -61B | 🔴 short |
| Practical target | 60~80B | 39B | -21~41B | 🟡 |
Conclusion: even the minimum bar (60B) leaves us 21B tokens short. Realistically, a 60~80B target requires an additional 21~41B.
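The gaps in the table follow directly from a tokens-per-parameter multiplier; a quick check (the table rounds the ×33 figure for 3B up to 100B):

```python
# Required tokens = parameters (B) × tokens-per-parameter multiplier.
PARAMS_B = 3.0
CURRENT_B = 39.0  # ~9B tokenized + ~30B untokenized

criteria = {
    "Chinchilla optimal": 70,  # ×70 tokens per parameter
    "Chinchilla minimum": 20,  # ×20
    "LLaMA-style": 33,         # ×33 (3 × 33 = 99B, rounded to 100B in the table)
}

for name, multiplier in criteria.items():
    required = PARAMS_B * multiplier
    gap = required - CURRENT_B
    print(f"{name:18s}: need {required:.0f}B, have {CURRENT_B:.0f}B, gap {gap:.0f}B")
```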
2.2 SFT
| Criterion | Amount required | Current | Gap | Status |
|---|---|---|---|---|
| Minimum high-quality | 50K | 170K | sufficient | 🟢 |
| Industry standard | 100~200K | 170K | sufficient | 🟢 |
| Domain diversity | diverse tasks | limited | needs supplementing | 🟡 |
Conclusion: quantitatively sufficient, but domain coverage (math, code, reasoning) needs reinforcement.
2.3 Preference (ORPO/DPO)
| Criterion | Amount required | Current | Gap | Status |
|---|---|---|---|---|
| Minimum | 5K pairs | 0 | -5K | 🔴 |
| Adequate | 20~60K pairs | 0 | -60K | 🔴 |
Conclusion: critical gap; ORPO/DPO training is outright impossible.
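Preference training needs (prompt, chosen, rejected) triples. A minimal sketch of the JSONL schema to collect toward; the key names follow the convention used by common DPO/ORPO trainers (e.g. TRL's DPOTrainer) and are an assumption about the eventual pipeline, not something the report specifies:

```python
import json

# One preference pair per JSONL line; key names follow the common TRL-style
# convention (an assumption — adjust to whatever trainer is actually used).
pair = {
    "prompt": "한국의 수도는 어디인가요?",   # "Where is the capital of Korea?"
    "chosen": "한국의 수도는 서울입니다.",   # preferred answer
    "rejected": "잘 모르겠습니다.",          # dispreferred answer
}

# Serialize as a JSONL line; ensure_ascii=False keeps Korean readable on disk.
line = json.dumps(pair, ensure_ascii=False)
print(line)
```

Collecting 20K+ lines in this shape would close the "Adequate" row's gap in the table above.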
3. Positioning vs. Competing Models
| Model | Parameters | Pretrain tokens | vs. ours |
|---|---|---|---|
| Polyglot-Ko 12.8B | 12.8B | 1.2T | 30× |
| EXAONE 3.0 | 7.8B | 8T | 200× |
| HyperCLOVA X | undisclosed | hundreds of B to several T | 10~100× |
| Phi-3 mini 3.8B | 3.8B | 3.3T | 85× |
| StableLM 3B | 3B | 4T | 100× |
| Ours (target) | 3B | 60~80B | baseline |
Analysis:
- Our 60~80B sits at roughly the Chinchilla-minimum line for this model size
- The large models used 10~100× more data, but they are also 2~40× bigger
- 60B tokens for a 3B model is a reasonable floor → in the literature, 3B-class models get good results in the 50~100B range
- Quality filtering + curriculum learning can partly compensate for the efficiency gap
4. Data Quality Analysis
Current quality distribution (by estimated tokens)
Grade A (high quality): ~3.0B (8%) - wiki, textbooks, finepdfs_edu
Grade B (good): ~24B (61%) - CulturaX, OSCAR, HPLT, webtext
Grade C (noisy): ~12B (31%) - cc100, other web crawls
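The grade shares are simply each bucket over the ~39B estimated total; a quick sanity check (the report rounds to 8/61/31):

```python
# Quality buckets from the distribution above, in billions of tokens.
buckets = {"A": 3.0, "B": 24.0, "C": 12.0}
total = sum(buckets.values())  # 39.0B

# Share of each grade as a percentage of the total.
shares = {grade: 100 * tokens / total for grade, tokens in buckets.items()}
for grade, share in shares.items():
    print(f"Grade {grade}: {share:.1f}%")
```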
Problems:
- High-quality (grade A) share is very low at 8%
- No code/math/science data at all
- English share is extremely small (0.6B) → weak multilingual ability
5. Key Conclusions
Is the current data enough to train a 3B model?
No → insufficient for the following reasons:
- Pretrain tokens short (39B vs. 60B minimum, a 21B gap)
- No preference data (ORPO training impossible)
- No code/math data (degrades general-purpose capability)
- Low high-quality share (8%)
- Too little English data (limits cross-lingual transfer)
Summary of missing data types
| Type | Severity | Action required |
|---|---|---|
| Pretrain tokens | 🟡 medium | secure +21~41B tokens |
| Code data | 🔴 severe | add a code corpus (5~10B) |
| Math/science | 🔴 severe | add specialized corpora (2~5B) |
| English data | 🟡 medium | add 10~20B high-quality English |
| Preference | 🔴 severe | secure 20K+ pairs |
| SFT diversity | 🟡 medium | add code/math/reasoning SFT |