Libraries

How to use Marcoson320/codeparrot-gpt2-mi50-eos-ft with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Marcoson320/codeparrot-gpt2-mi50-eos-ft")

# Load model directly (GPT-2 is a causal language model)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Marcoson320/codeparrot-gpt2-mi50-eos-ft")
model = AutoModelForCausalLM.from_pretrained("Marcoson320/codeparrot-gpt2-mi50-eos-ft")

vLLM

How to use Marcoson320/codeparrot-gpt2-mi50-eos-ft with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Marcoson320/codeparrot-gpt2-mi50-eos-ft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Marcoson320/codeparrot-gpt2-mi50-eos-ft",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/Marcoson320/codeparrot-gpt2-mi50-eos-ft

SGLang

How to use Marcoson320/codeparrot-gpt2-mi50-eos-ft with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Marcoson320/codeparrot-gpt2-mi50-eos-ft" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Marcoson320/codeparrot-gpt2-mi50-eos-ft",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Marcoson320/codeparrot-gpt2-mi50-eos-ft" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Marcoson320/codeparrot-gpt2-mi50-eos-ft",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

codeparrot-gpt2-mi50-eos-ft

A fine-tune of Marcoson320/codeparrot-gpt2-mi50 experimenting with adding end-of-sequence (<|endoftext|>) emission behavior to a code-completion model trained without document-boundary EOS tokens.

This is a feasibility study with partial success: on a small probe set, the fine-tuned model emits EOS for 3 of 5 prompts. It is published for transparency about the method and its limitations.

Motivation

The base model was trained following HuggingFace LLM Course Chapter 7.6, whose tokenize function does not insert <|endoftext|> at document boundaries:

def tokenize(element):
    outputs = tokenizer(
        element["content"], truncation=True,
        max_length=128, return_overflowing_tokens=True, ...
    )
    # no EOS inserted between documents

The GPT-2 paper and karpathy/nanoGPT both insert an end-of-text token between documents. The course's simplification omits it, so the resulting model does not stop generating at natural boundaries.
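For contrast, here is a minimal sketch of the document-boundary convention described above: every tokenized document is followed by an end-of-text id before the stream is cut into fixed-length blocks. The helper name `pack_with_eos`, the toy token ids, and the block size are illustrative assumptions; this is not the course's code or the training script for this model.

```python
from typing import Iterable, List


def pack_with_eos(docs: Iterable[List[int]], eos_id: int, block_size: int) -> List[List[int]]:
    """Concatenate tokenized documents, marking each boundary with eos_id,
    then cut the stream into full blocks of block_size tokens."""
    stream: List[int] = []
    for doc in docs:
        stream.extend(doc)
        stream.append(eos_id)  # explicit document-boundary marker
    # Keep only full blocks; any trailing remainder is dropped.
    n_full = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_full)]


# Toy example: three already-tokenized "documents" (hypothetical ids).
docs = [[11, 12, 13], [21, 22], [31, 32, 33, 34]]
blocks = pack_with_eos(docs, eos_id=0, block_size=4)
print(blocks)
```

With this layout the model sees the EOS id at every document end during pre-training, which is the signal the base model never received.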

This fine-tune attempts to retrofit the EOS signal post-hoc.

Fine-tune Configuration

| Item | Value |
|---|---|
| Base | `Marcoson320/codeparrot-gpt2-mi50` (final checkpoint) |
| Optimizer | AdamW (β₁=0.9, β₂=0.999, weight_decay=0.1) |
| Learning rate | 5×10⁻⁵, cosine schedule, 100 warmup steps |
| Effective batch size | 256 (per_device_bs=32 × grad_accum=4 × world_size=2) |
| Steps | 4,000 |
| Precision | fp16 |
| Parallelism | DistributedDataParallel on 2 × MI50 |
| Wall clock | ~1h 44m |
| Final train_loss | 1.162 |
| Final eval_loss | 1.569 |
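
The batch-size and schedule rows can be checked with a few lines of arithmetic. The sketch below assumes the common shape of linear warmup followed by cosine decay to zero; the card does not state the decay floor or the scheduler implementation, so `lr_at` is a hypothetical helper rather than the training script's code.

```python
import math

PEAK_LR = 5e-5
WARMUP_STEPS = 100
TOTAL_STEPS = 4_000

# Effective batch size from the table: per-device batch x grad accumulation x GPUs.
effective_batch = 32 * 4 * 2


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step (assumed: linear warmup, cosine decay to 0)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print("effective batch size:", effective_batch)
for step in (0, 50, 100, 2_050, 4_000):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the rate peaks at step 100 and is back to half the peak at step 2,050, the midpoint of the decay phase.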

Data preparation

Each training sample is a variable-length token slice (32–120 tokens) of a Python file, with <|endoftext|> (id 0) appended explicitly, then padded to 128 with a label mask of -100 (so padding does not contribute to loss).

This raises EOS signal density from < 0.04% (base training) to ~3%.
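
A minimal sketch of how one such sample could be built from an already-tokenized file. The helper `build_sample`, the random start offset, and the choice to pad `input_ids` with the EOS id under a zeroed attention mask are assumptions for illustration; only the slice length range, the appended EOS (id 0), the padded length of 128, and the -100 label mask come from the description above.

```python
import random
from typing import Dict, List

EOS_ID = 0           # <|endoftext|> id stated above
MAX_LEN = 128        # padded sequence length
IGNORE_INDEX = -100  # label value excluded from the loss


def build_sample(file_ids: List[int], rng: random.Random,
                 min_len: int = 32, max_len: int = 120) -> Dict[str, List[int]]:
    """Cut a random-length slice from one tokenized file, append EOS, pad to MAX_LEN."""
    if len(file_ids) < min_len:
        raise ValueError("file is shorter than the minimum slice length")
    n = rng.randint(min_len, min(max_len, len(file_ids)))
    start = rng.randint(0, len(file_ids) - n)  # assumption: random start offset
    ids = file_ids[start:start + n] + [EOS_ID]
    pad = MAX_LEN - len(ids)
    return {
        # Assumption: pad with EOS_ID; the attention mask hides these positions.
        "input_ids": ids + [EOS_ID] * pad,
        "attention_mask": [1] * len(ids) + [0] * pad,
        # The appended EOS keeps a real label; padding is masked out of the loss.
        "labels": ids + [IGNORE_INDEX] * pad,
    }


rng = random.Random(0)
sample = build_sample(list(range(1000, 1300)), rng)  # toy ids standing in for a file
print(len(sample["input_ids"]), sample["labels"].count(IGNORE_INDEX))
```

The key detail is that the EOS position carries a real label while every padded position is set to -100, so the only EOS the loss ever sees is the one appended to the slice.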

Results

The base and fine-tuned models were compared on five self-contained prompts. The table records whether each model emits EOS within 80 generated tokens under greedy decoding with repetition_penalty=1.12 and no_repeat_ngram_size=4:

| Prompt | Base | EOS-FT |
|---|---|---|
| `def add(a, b):\n return a + b\n` | ✗ | ✓ (pos 13) |
| `def square(x):\n return x * x\n\n` | ✗ | ✗ |
| `def greet(name):\n print(f'Hello {name}')\n\n` | ✗ | ✓ (pos 47) |
| `import os\nprint(os.getcwd())\n` | ✗ | ✗ |
| `x = 1\ny = 2\nz = x + y\n` | ✗ | ✓ (pos 61) |
| Total emit rate | 0/5 | 3/5 |
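
The scoring behind such a table can be sketched as follows. This is not the published test_eos.py; `first_eos_position` and `emit_rate` are hypothetical helpers, the token ids are toy values, and treating "pos" as a 0-based index into the generated tokens is an assumption, since the card does not define it.

```python
from typing import Dict, List, Optional


def first_eos_position(generated_ids: List[int], eos_id: int, budget: int = 80) -> Optional[int]:
    """Return the 0-based index of the first EOS within the first `budget`
    generated tokens, or None if EOS is not emitted in that window."""
    for pos, tok in enumerate(generated_ids[:budget]):
        if tok == eos_id:
            return pos
    return None


def emit_rate(runs: Dict[str, List[int]], eos_id: int, budget: int = 80) -> str:
    """Summarize how many prompts produced EOS within the budget, as 'hits/total'."""
    hits = sum(first_eos_position(ids, eos_id, budget) is not None for ids in runs.values())
    return f"{hits}/{len(runs)}"


# Toy continuations (hypothetical ids, EOS id 0); keys are prompt labels.
runs = {
    "add": [5, 7, 9, 0, 4],
    "square": [3] * 80,
    "late": [3] * 85 + [0],  # EOS arrives after the 80-token budget
}
for name, ids in runs.items():
    print(name, first_eos_position(ids, eos_id=0))
print(emit_rate(runs, eos_id=0))
```

With Transformers, `generated_ids` would be the continuation only: `model.generate` returns the prompt tokens followed by the new tokens for decoder-only models, so the output needs to be sliced past the prompt length before scoring.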

Known limitations

Partial success: 2 of 5 prompts still do not stop within 80 tokens.
EOS position quality is imperfect: the model sometimes emits EOS mid-expression rather than at a clean function or statement boundary. This is attributable to the data preparation, which uses random-length chunks rather than AST-based semantic units. A more rigorous approach would slice each file into complete FunctionDef / ClassDef blocks via ast.parse so the model only sees EOS at structural endpoints.
Code quality of pre-EOS content inherits the base model's small-scale artifacts (occasional Jupyter notebook markers, partial idioms).
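
The AST-based alternative mentioned above could look roughly like this. It is a sketch of the idea, not something this fine-tune used: `top_level_blocks` is a hypothetical helper, it also accepts `AsyncFunctionDef`, and it only considers top-level definitions.

```python
import ast
from typing import List


def top_level_blocks(source: str) -> List[str]:
    """Return the source text of each top-level function or class definition.

    Raises SyntaxError if the file does not parse. Decorator lines are not
    part of the returned segment, because the node starts at `def`/`class`.
    """
    tree = ast.parse(source)
    blocks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                blocks.append(segment)
    return blocks


example = '''import os

def add(a, b):
    return a + b

x = 1

class Greeter:
    def greet(self, name):
        print(f"Hello {name}")
'''

# Each block would be tokenized and followed by EOS, so EOS only
# ever appears at a structural endpoint.
for block in top_level_blocks(example):
    print(block + "<|endoftext|>")
    print("---")
```

Files that fail to parse would need to be skipped or handled separately, and module-level statements such as `import os` and `x = 1` are dropped by this slicing.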

Usage

from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Marcoson320/codeparrot-gpt2-mi50-eos-ft",
    device=0,  # first GPU; use device=-1 to run on CPU
)

out = pipe(
    "def add(a, b):\n    return a + b\n",
    max_new_tokens=80,
    do_sample=False,
    repetition_penalty=1.12,
    no_repeat_ngram_size=4,
)
print(out[0]["generated_text"])

Reproduction

The training script and full method documentation are bundled on the project HTTP server (LAN only). The source files published alongside this model are train_eos_v2.py and test_eos.py.

Base model: Marcoson320/codeparrot-gpt2-mi50
Course reference: HuggingFace LLM Course Chapter 7.6

Model size: 0.1B params (Safetensors, tensor type F32)
