Instructions for using Marcoson320/codeparrot-gpt2-mi50 with libraries and local apps.

Libraries

How to use Marcoson320/codeparrot-gpt2-mi50 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Marcoson320/codeparrot-gpt2-mi50")

# Load model directly (GPT-2 is a causal language model)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Marcoson320/codeparrot-gpt2-mi50")
model = AutoModelForCausalLM.from_pretrained("Marcoson320/codeparrot-gpt2-mi50")

Local Apps

vLLM

How to use Marcoson320/codeparrot-gpt2-mi50 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Marcoson320/codeparrot-gpt2-mi50"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Marcoson320/codeparrot-gpt2-mi50",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/Marcoson320/codeparrot-gpt2-mi50

SGLang

How to use Marcoson320/codeparrot-gpt2-mi50 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Marcoson320/codeparrot-gpt2-mi50" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Marcoson320/codeparrot-gpt2-mi50",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Marcoson320/codeparrot-gpt2-mi50" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Marcoson320/codeparrot-gpt2-mi50",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use Marcoson320/codeparrot-gpt2-mi50 with Docker Model Runner:
```
docker model run hf.co/Marcoson320/codeparrot-gpt2-mi50
```

codeparrot-gpt2-mi50

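A GPT-2 small (124M parameters) causal language model, trained following the method described in Chapter 7.6 of the HuggingFace LLM Course. The weights start from random initialization and are trained for 1 epoch on the codeparrot-ds dataset, as a reproducible record of a from-scratch training pipeline on the AMD MI50 platform.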

Training configuration

Item	Value
Model architecture	GPT-2 (n_layer=12, n_head=12, n_embd=768)
Parameters	124,242,432
Tokenizer	huggingface-course/code-search-net-tokenizer (BPE, vocab=50,000)
Context length	128 tokens
Training set	huggingface-course/codeparrot-ds-train (16,702,061 length-128 chunks)
Validation set	huggingface-course/codeparrot-ds-valid
Optimizer	AdamW (β₁=0.9, β₂=0.999, weight_decay=0.1)
Learning rate	5×10⁻⁴, cosine schedule, warmup 1,000 steps
Effective batch size	256 (per_device_bs=64 × grad_accum=2 × world_size=2)
Precision	fp16
Parallelism	DistributedDataParallel (DDP), NCCL/RCCL backend
Total steps	65,243 (1 epoch)
Eval / save interval	every 5,000 steps
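
The batch size, step count, and parameter count in the table can be cross-checked with a little arithmetic. The sketch below assumes that the step count is the chunk count divided by the effective batch size, rounded up, and that the input and output embeddings are tied. It also assumes the position-embedding table keeps GPT-2's default of 1,024 positions: the table does not state n_positions, and this value is an assumption that happens to reproduce the listed parameter count.

```python
import math

# Values copied from the training configuration table.
PER_DEVICE_BS = 64
GRAD_ACCUM = 2
WORLD_SIZE = 2
NUM_CHUNKS = 16_702_061
VOCAB = 50_000
N_LAYER = 12
N_EMBD = 768

# Assumption: not stated in the table; GPT-2's default position table size.
N_POSITIONS = 1024


def effective_batch_size():
    """Samples consumed per optimizer step across all devices."""
    return PER_DEVICE_BS * GRAD_ACCUM * WORLD_SIZE


def steps_per_epoch():
    """Optimizer steps needed to see every chunk once (rounded up)."""
    return math.ceil(NUM_CHUNKS / effective_batch_size())


def gpt2_param_count(vocab, n_positions, n_layer, d):
    """Parameter count of a GPT-2 style decoder with tied embeddings."""
    embeddings = vocab * d + n_positions * d       # token + position tables
    attention = (d * 3 * d + 3 * d) + (d * d + d)  # QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)    # up and down projections
    layer_norms = 2 * (2 * d)                      # two LayerNorms per block
    block = attention + mlp + layer_norms
    final_ln = 2 * d
    return embeddings + n_layer * block + final_ln  # LM head reuses the token table


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    print("steps per epoch:", steps_per_epoch())
    print("parameters:", gpt2_param_count(VOCAB, N_POSITIONS, N_LAYER, N_EMBD))
```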

Hardware environment

GPU: 2 × AMD Radeon Instinct MI50 (32 GB HBM2 each, gfx906)
Platform: PyTorch + ROCm, containerized deployment
Training time: about 19 hours
Average throughput: 159.8 samples/sec, ~1.41 sec/step

Loss and training dynamics

Training metrics and eval metrics are recorded once every 5,000 steps.

step	epoch	learning_rate	train_loss	grad_norm	eval_loss
5,000	0.077	4.952×10⁻⁴	2.677	0.180	1.752
10,000	0.153	4.762×10⁻⁴	1.685	0.152	1.520
15,000	0.230	4.437×10⁻⁴	1.529	0.153	1.415
20,000	0.307	3.996×10⁻⁴	1.447	0.145	1.347
25,000	0.383	3.467×10⁻⁴	1.386	0.154	1.295
30,000	0.460	2.880×10⁻⁴	1.334	0.160	1.247
35,000	0.537	2.271×10⁻⁴	1.288	0.160	1.204
40,000	0.613	1.675×10⁻⁴	1.241	0.170	1.160
45,000	0.690	1.128×10⁻⁴	1.200	0.175	1.123
50,000	0.766	6.631×10⁻⁵	1.162	0.174	1.090
55,000	0.843	3.072×10⁻⁵	1.135	0.180	1.066
60,000	0.920	8.175×10⁻⁶	1.113	0.191	1.054
65,000	0.996	1.78×10⁻⁸	1.106	0.180	1.051
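
Assuming eval_loss is the mean per-token cross-entropy in nats (the usual convention for causal language model training loops; the unit is not stated here), it can be converted to perplexity by exponentiation. A minimal sketch using three rows of the table:

```python
import math

# (step -> eval_loss) pairs copied from the table above.
EVAL_LOSS = {5_000: 1.752, 30_000: 1.247, 65_000: 1.051}


def perplexity(loss_nats):
    """Perplexity for a mean per-token cross-entropy given in nats."""
    return math.exp(loss_nats)


if __name__ == "__main__":
    for step, loss in EVAL_LOSS.items():
        print(f"step {step:>6}: eval_loss={loss:.3f}  perplexity={perplexity(loss):.2f}")
```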

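No additional epochs or hyperparameter search were run. In the second half of training the cosine decay drives the lr toward zero, while the gradient norm stays within the 0.15-0.19 range, with no sign of divergence or instability.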
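
The learning_rate column is consistent with a linear warmup followed by a cosine decay to zero. The sketch below is a plain-Python approximation of such a schedule, assuming the peak of 5×10⁻⁴, 1,000 warmup steps, and 65,243 total steps from the configuration. The function name is illustrative, and the exact step indexing used by the training framework's scheduler may differ by one step.

```python
import math

PEAK_LR = 5e-4         # from the training configuration
WARMUP_STEPS = 1_000   # from the training configuration
TOTAL_STEPS = 65_243   # from the training configuration


def lr_at(step):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (5_000, 10_000, 30_000, 60_000):
        print(f"step {step:>6}: lr={lr_at(step):.4e}")
```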

Usage

from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Marcoson320/codeparrot-gpt2-mi50",
    device=0,
)

print(pipe("# scatter plot of x, y\n", max_new_tokens=64)[0]["generated_text"])

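Limitations

The context is capped at 128 tokens, so longer code passages cannot be handled.
The training data is weighted toward GitHub Python code that uses pandas / sklearn / matplotlib / seaborn; coverage of code from other domains is limited.
The model capacity is small, so continuations are prone to repetition; setting repetition_penalty>1.0 or no_repeat_ngram_size at inference time can mitigate this.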
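
To make the no_repeat_ngram_size mitigation concrete, here is a conceptual pure-Python sketch of the idea: before sampling the next token, ban every token that would complete an n-gram already present in the sequence. This is an illustration only, not the transformers implementation, and the function name is hypothetical.

```python
def banned_next_tokens(token_ids, ngram_size):
    """Return the set of tokens that would repeat an existing n-gram if emitted next."""
    if ngram_size <= 0:
        return set()
    n = ngram_size
    # The last n-1 tokens are the prefix the next token would extend.
    prefix = tuple(token_ids[len(token_ids) - (n - 1):])
    banned = set()
    # Scan every n-gram already in the sequence; if it starts with the
    # current prefix, its final token must not be generated again.
    for i in range(len(token_ids) - n + 1):
        ngram = tuple(token_ids[i:i + n])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


if __name__ == "__main__":
    history = [1, 2, 3, 1, 2]
    print(banned_next_tokens(history, 3))
```

With the transformers pipeline, both settings are passed as generation arguments, for example pipe(prompt, max_new_tokens=64, repetition_penalty=1.2, no_repeat_ngram_size=3). The values 1.2 and 3 are illustrative assumptions, not settings recommended by the model author.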

Safetensors: model size 0.1B params, tensor type F32
