LLM-GO
A Go-specialized large language model built with TensorFlow 2 and Python 3.12. Trained on all Golang versions (1.0–1.24), the Fiber and Cobra ecosystems, real-world project patterns, and Go best practices. Published to Hugging Face as an open-source model under the Apache 2.0 license.
Table of Contents
- Overview
- Architecture
- Model Sizes
- Training Data
- Project Structure
- Requirements
- Quick Start
- Pipeline
- Go Layout Rule
- Supported Frameworks
- Configuration
- Evaluation
- Deploying to Hugging Face
- Development
- License
Overview
llm-go is a decoder-only transformer model designed exclusively for Go code generation, completion, and explanation. It understands Go idioms, project layout conventions, the standard library across all major versions, and the most widely used frameworks in the Go ecosystem.
Key goals:
- Complete coverage of Go 1.0 through 1.24
- Deep knowledge of Fiber, Cobra, GORM, Gin, Echo, gRPC, and more
- Enforces canonical Go project layout (cmd/ always at the repo root)
- Trained on real-world patterns extracted from production Go projects
- Fully open-source and deployable via the Hugging Face Hub
Architecture
GoLLM is a GPT-style decoder-only transformer with modern improvements from LLaMA/Mistral:
| Component | Implementation |
|---|---|
| Attention | Multi-head causal self-attention |
| Positional encoding | RoPE (Rotary Position Embedding) |
| Normalization | RMSNorm (pre-norm, before each sub-layer) |
| Feed-forward | SwiGLU activation (silu(gate(x)) * up(x)) |
| Embeddings | Tied input/output embeddings |
| Tokenizer | BPE via HuggingFace tokenizers (Rust-backed) |
| Training precision | bfloat16 mixed precision |
| Multi-GPU | TensorFlow MirroredStrategy |
| Optimizer | AdamW + cosine LR schedule with warmup |
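The normalization and feed-forward rows above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's TensorFlow code: the function names, the `eps` value, and the list-based matrices are assumptions made for readability.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def rms_norm(x: list[float], weight: list[float], eps: float = 1e-6) -> list[float]:
    """Divide x by its root mean square, then apply a learned per-feature gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)  # eps is an assumed default
    return [v * inv_rms * w for v, w in zip(x, weight)]


def matvec(matrix: list[list[float]], vec: list[float]) -> list[float]:
    """Plain matrix-vector product (one output per matrix row)."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down(silu(gate(x)) * up(x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


def pre_norm_ffn_block(x, norm_weight, w_gate, w_up, w_down):
    """Pre-norm residual sub-layer: x + FFN(RMSNorm(x))."""
    y = swiglu_ffn(rms_norm(x, norm_weight), w_gate, w_up, w_down)
    return [a + b for a, b in zip(x, y)]
```

Pre-norm means the normalization is applied to the sub-layer input while the residual path carries the unnormalized activations, which is what `pre_norm_ffn_block` does.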
Special Tokens
The tokenizer uses structural tags so the model understands Go file anatomy:
<go_file> <go_func> <go_type> <go_pkg> <go_version>
<go_test> <go_comment>
<task:generate> <task:complete> <task:fix> <task:explain> <task:optimize>
Model Sizes
| Variant | Parameters | d_model | Layers | Heads | Context | Use case |
|---|---|---|---|---|---|---|
| small | ~125 M | 768 | 12 | 12 | 2 048 | CPU / fast iteration |
| medium | ~350 M | 1 024 | 24 | 16 | 2 048 | Single GPU (default) |
| large | ~760 M | 1 280 | 36 | 20 | 4 096 | Multi-GPU |
| xl | ~1.5 B | 1 600 | 48 | 25 | 4 096 | Near state-of-the-art |
The default training target is medium. Override with MODEL_SIZE=large make train.
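As a rough sanity check on the table, the parameter counts can be approximated from d_model and the layer count. The sketch below assumes the common rule of thumb of about 12·d_model² weights per layer (4·d² for attention plus 8·d² for the feed-forward block) and the 32 000-token vocabulary mentioned in the tokenizer stage; the README does not state the feed-forward hidden size, so treat the results as estimates only.

```python
# (d_model, layers) per variant, taken from the table above.
VARIANTS = {
    "small": (768, 12),
    "medium": (1024, 24),
    "large": (1280, 36),
    "xl": (1600, 48),
}
VOCAB_SIZE = 32_000  # BPE vocabulary size quoted in the tokenizer stage


def estimate_params(d_model: int, n_layers: int, vocab_size: int = VOCAB_SIZE) -> int:
    """Approximate parameter count, ignoring norms and biases."""
    per_layer = 12 * d_model * d_model  # assumption: 4*d^2 attention + 8*d^2 FFN
    embeddings = vocab_size * d_model   # counted once because embeddings are tied
    return n_layers * per_layer + embeddings


for name, (d_model, n_layers) in VARIANTS.items():
    print(f"{name:>6}: ~{estimate_params(d_model, n_layers) / 1e6:.0f} M")
```

Under these assumptions the estimates come out at roughly 110 M, 335 M, 749 M, and 1 526 M, which is in the same range as the figures in the table.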
Training Data
Real-world corpus
- Up to 50 000 Go repositories from GitHub (≥10 stars)
- Go standard library source across all versions (1.0–1.24)
- Official documentation and release notes
Synthetic patterns (oversampled)
Patterns extracted from real production Go projects and rendered across multiple Go versions, business domains, and application types:
| Category | Examples (count) | Patterns covered |
|---|---|---|
| Fiber controllers | ~36 | Struct-based handlers, constructor injection, Swagger |
| GORM repositories | ~52 | UUID PKs, soft delete, repo interface pattern |
| Service layer | ~32 | errgroup, DI container, RabbitMQ consumer |
| JWT / Auth | ~16 | HS256, bcrypt, Bearer middleware, CPF/CNPJ validators |
| Tests | ~20 | go-sqlmock, testify, fiber.App.Test(), table-driven |
| Docker / CI | ~40 | Multi-stage Dockerfile, docker-compose, Jenkinsfile |
| Total | ~196 | |
Layout examples are oversampled 5× and pattern examples 3× to reinforce correct conventions.
Deduplication
MinHash LSH with 128 permutations, 32 bands, and a 0.80 Jaccard similarity threshold removes near-duplicate files before tokenization.
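With 128 permutations split into 32 bands, each band covers 4 signature rows. The following standard-library sketch shows how such a scheme could work; it is not the project's preprocessor. The shingle size, the salted BLAKE2 hashes standing in for permutations, and the use of 0.80 as a verification threshold on candidate pairs are all assumptions.

```python
import hashlib

NUM_PERM = 128
BANDS = 32
ROWS = NUM_PERM // BANDS  # 4 rows per band
THRESHOLD = 0.80


def shingles(text: str, k: int = 5) -> set[str]:
    """Overlapping k-token shingles (k=5 is an assumed choice)."""
    tokens = text.split()
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def minhash(items: set[str]) -> list[int]:
    """One minimum per salted hash function; salts stand in for permutations."""
    signature = []
    for seed in range(NUM_PERM):
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode(), digest_size=8, salt=salt).digest(), "big")
            for s in items))
    return signature


def band_keys(signature: list[int]) -> list[tuple]:
    """Split the signature into BANDS buckets keys of ROWS values each."""
    return [(b, tuple(signature[b * ROWS:(b + 1) * ROWS])) for b in range(BANDS)]


def estimated_jaccard(a: list[int], b: list[int]) -> float:
    """Fraction of matching signature positions approximates Jaccard similarity."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def candidate_probability(similarity: float) -> float:
    """Chance that two documents share at least one band: 1 - (1 - s^r)^b."""
    return 1.0 - (1.0 - similarity ** ROWS) ** BANDS


def deduplicate(docs: dict[str, str]) -> list[str]:
    """Keep the first document of every near-duplicate group."""
    buckets: dict[tuple, list[str]] = {}
    signatures: dict[str, list[int]] = {}
    kept = []
    for name, text in docs.items():
        sig = minhash(shingles(text))
        candidates = {o for key in band_keys(sig) for o in buckets.get(key, ())}
        if any(estimated_jaccard(sig, signatures[o]) >= THRESHOLD for o in candidates):
            continue  # near-duplicate of a document that was already kept
        kept.append(name)
        signatures[name] = sig
        for key in band_keys(sig):
            buckets.setdefault(key, []).append(name)
    return kept
```

With 4 rows per band, two files at 0.80 similarity almost always share a band, so banding mainly serves to find candidates cheaply and the threshold check does the actual filtering.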
Dataset format
Preprocessed data is stored as sharded TFRecord files in data/processed/{train,val,test}/.
Project Structure
llm-go/
├── cmd/                          # (Go convention: always at root)
├── configs/
│   ├── small.yaml
│   ├── medium.yaml
│   └── large.yaml
├── data/
│   ├── raw/                      # downloaded Go source files
│   ├── processed/                # TFRecord shards
│   └── tokenizer/                # trained BPE tokenizer
├── scripts/
│   ├── setup_env.sh
│   ├── collect_data.sh
│   ├── build_tokenizer.sh
│   ├── preprocess.sh
│   ├── train.sh
│   ├── evaluate.sh
│   ├── generate.sh
│   └── deploy_huggingface.sh
├── src/llm_go/
│   ├── config.py                 # ModelConfig, TrainingConfig, DataConfig
│   ├── model/
│   │   ├── attention.py          # RoPE + MultiHeadAttention
│   │   └── transformer.py        # RMSNorm, SwiGLU, TransformerBlock, GoLLM
│   ├── tokenizer/
│   │   └── go_tokenizer.py       # BPE + structural tag injection
│   ├── data/
│   │   ├── collector.py          # GitHub + stdlib scraper
│   │   ├── preprocessor.py       # filter → dedup → tokenize → TFRecord
│   │   ├── go_best_practices.py  # GoProjectTemplates + GoLayoutValidator
│   │   ├── templates/
│   │   │   ├── loader.py
│   │   │   └── go_project/       # canonical cmd/ layout examples
│   │   └── patterns/
│   │       ├── fiber_patterns.py
│   │       ├── gorm_patterns.py
│   │       ├── service_patterns.py
│   │       ├── auth_patterns.py
│   │       ├── test_patterns.py
│   │       ├── docker_patterns.py
│   │       └── registry.py       # PatternRegistry (~196 examples)
│   ├── training/
│   │   ├── trainer.py            # gradient accumulation, MirroredStrategy
│   │   └── lr_schedule.py        # CosineWithWarmup
│   ├── evaluation/
│   │   └── metrics.py            # perplexity, pass@k, gofmt rate, BLEU, ROUGE-L
│   ├── deployment/
│   │   └── hf_uploader.py        # safetensors + model card → HF Hub
│   └── scripts/                  # CLI entry points
│       ├── collect.py
│       ├── tokenize.py
│       ├── train.py
│       ├── evaluate.py
│       ├── generate.py
│       └── deploy.py
├── tests/
│   ├── conftest.py               # shared pytest fixtures
│   ├── test_model.py
│   ├── test_tokenizer.py
│   └── test_best_practices.py
├── checkpoints/                  # saved during training
├── logs/                         # TensorBoard event files
├── Makefile
├── pyproject.toml
├── requirements.txt
└── requirements-gpu.txt
Requirements
- Python 3.12
- TensorFlow 2.17.1 (CPU) or tensorflow[and-cuda] for GPU
- CUDA 12.x + cuDNN 8.x (optional, GPU only)
Python 3.12 compatibility notes
| Package | Version | Note |
|---|---|---|
| tensorflow | 2.17.1 | cp312 wheel confirmed (manylinux) |
| keras | 3.5.0 | compatible with TF 2.17.x |
| numpy | 1.26.4 | TF 2.17.x requires numpy < 2 |
| tensorboard | 2.17.1 | must match TF version |
| tensorflow-text | n/a | skipped 2.17.x release; not used (tokenization via HF tokenizers) |
| tree-sitter | optional | core pipeline uses regex tagging; see requirements.txt comments |
Quick Start
1. Clone and install
git clone https://github.com/your-org/llm-go.git
cd llm-go
# CPU
bash scripts/setup_env.sh
# GPU (NVIDIA CUDA 12)
bash scripts/setup_env.sh --gpu
Or manually:
python3.12 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
pip install -e ".[dev]"
2. Generate code (from a pre-trained checkpoint)
# using the Makefile
make generate
# or directly
llm-go-generate \
--model-dir checkpoints/final \
--tok-dir data/tokenizer \
--prompt "package main\n\nimport \"github.com/gofiber/fiber/v2\"\n\nfunc main() {"
3. Generate with a Python script
from llm_go.model.transformer import GoLLM
from llm_go.tokenizer.go_tokenizer import GoTokenizer
tok = GoTokenizer.load("data/tokenizer")
model = GoLLM.from_pretrained("checkpoints/final")
prompt = """<go_version>1.24</go_version>
<go_file>cmd/server/main.go</go_file>
package main
import "github.com/gofiber/fiber/v2"
func main() {"""
ids = tok.encode(prompt)
output = model.generate(ids, max_new_tokens=256, temperature=0.8, top_p=0.95)
print(tok.decode(output))
Pipeline
Run each stage individually or all at once with make pipeline.
Stage 1: Collect data
export GITHUB_TOKEN=ghp_...
make collect
# or
bash scripts/collect_data.sh
Downloads Go repositories (≥10 stars, configurable) and the standard library into data/raw/.
Stage 2: Build tokenizer
make tokenize
# or
bash scripts/build_tokenizer.sh
Trains a 32 000-token BPE vocabulary on the raw corpus with Go keywords, builtins, and packages seeded as the initial alphabet.
Stage 3: Preprocess
make preprocess
# or
bash scripts/preprocess.sh
Applies quality filtering → MinHash LSH deduplication → PII scrubbing → tokenization → sequence packing → TFRecord sharding.
Synthetic layout and pattern examples are prepended and oversampled before the real data.
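The sequence-packing step can be pictured as concatenating tokenized files, separated by an end-of-sequence token, and cutting the stream into fixed-length training sequences. The sketch below is an assumption about how that might look; `pack_sequences`, the `eos_id` value, and the choice to drop the trailing partial sequence are illustrative and not taken from the project's preprocessor.

```python
from typing import Iterable, Iterator


def pack_sequences(
    docs: Iterable[list[int]], seq_len: int, eos_id: int
) -> Iterator[list[int]]:
    """Concatenate token lists with EOS separators and emit seq_len-sized chunks."""
    buffer: list[int] = []
    for ids in docs:
        buffer.extend(ids)
        buffer.append(eos_id)  # mark the file boundary inside the packed stream
        while len(buffer) >= seq_len:
            yield buffer[:seq_len]
            buffer = buffer[seq_len:]
    # Assumption: a trailing partial sequence is dropped rather than padded.


if __name__ == "__main__":
    files = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
    for seq in pack_sequences(files, seq_len=4, eos_id=0):
        print(seq)
```

Packing avoids spending compute on padding tokens, at the cost of sequences that may span more than one source file.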
Stage 4: Train
# Default: medium model, bfloat16, all available GPUs
make train
# Choose size
make train-small
make train-large
MODEL_SIZE=xl make train
# Custom
MODEL_SIZE=medium BATCH_SIZE=64 MAX_STEPS=200000 bash scripts/train.sh
Training uses XLA JIT compilation, gradient accumulation (default 4 steps), and TensorFlow MirroredStrategy for multi-GPU.
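Gradient accumulation sums the gradients of several micro-batches before applying a single optimizer step, so a larger effective batch fits in limited memory. The toy example below illustrates the idea on a one-parameter model with plain SGD; it is a sketch of the technique, not the project's trainer, and all names in it are hypothetical.

```python
def grad_mse(w: float, batch: list[tuple[float, float]]) -> float:
    """Gradient of mean squared error for the 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_step(w: float, micro_batches: list[list[tuple[float, float]]],
                     lr: float) -> float:
    """Average gradients over all micro-batches, then apply one SGD update."""
    accum = 0.0
    for micro_batch in micro_batches:
        # Dividing by the number of micro-batches keeps the scale of a full batch.
        accum += grad_mse(w, micro_batch) / len(micro_batches)
    return w - lr * accum


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
    micro = [[pair] for pair in data]  # 4 accumulation steps, as in the default
    print(accumulated_step(0.0, micro, lr=0.01))
    print(0.0 - 0.01 * grad_mse(0.0, data))  # same update from one full batch
```

With equally sized micro-batches the accumulated update matches the update computed on the combined batch.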
Monitor with TensorBoard:
make tb
# opens http://localhost:6006
Stage 5: Evaluate
make evaluate
# or
bash scripts/evaluate.sh
Reports perplexity, pass@k (unbiased estimator), gofmt syntax pass rate, BLEU, and ROUGE-L.
Stage 6: Deploy to Hugging Face
export HF_TOKEN=hf_...
export HF_REPO_ID=your-org/llm-go-350m
make deploy
# or
bash scripts/deploy_huggingface.sh
Converts Keras weights to SafeTensors format, uploads the tokenizer as PreTrainedTokenizerFast, and generates a model card automatically.
Go Layout Rule
One of the core conventions this model learns and enforces:
cmd/ is always at the project root. Each binary lives in its own subdirectory with a main.go.
my-project/                  ← project root
├── cmd/
│   ├── server/
│   │   └── main.go          ← binary: server
│   ├── worker/
│   │   └── main.go          ← binary: background worker
│   └── cli/
│       └── main.go          ← binary: CLI tool
├── internal/
│   ├── config/
│   ├── handler/
│   └── service/
├── go.mod
└── go.sum
main.go only wires dependencies. All business logic lives in internal/. The cmd/ directory is never nested inside internal/, pkg/, or any other subdirectory.
The GoLayoutValidator class enforces this during data collection: files from repositories with a nested or missing cmd/ receive a lower training weight.
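A check of this kind could look like the sketch below. It is not the actual GoLayoutValidator: the function names, the three status labels, and the weight values are hypothetical, since the README only says that non-conforming repositories receive a lower weight.

```python
from pathlib import PurePosixPath

# Hypothetical weights; the README states only that the weight is "lower".
LAYOUT_WEIGHTS = {"root": 1.0, "nested": 0.5, "missing": 0.5}


def cmd_layout_status(paths: list[str]) -> str:
    """Classify a repository by where its cmd/ directory sits.

    Returns "root" when cmd/ is at the top level only, "nested" when a cmd
    directory appears below another directory, and "missing" otherwise.
    """
    has_root = False
    has_nested = False
    for path in paths:
        parts = PurePosixPath(path).parts
        if len(parts) > 1 and parts[0] == "cmd":
            has_root = True
        if "cmd" in parts[1:-1]:  # a cmd directory below the top level
            has_nested = True
    if has_nested:
        return "nested"
    return "root" if has_root else "missing"


def training_weight(paths: list[str]) -> float:
    """Map the layout status of a repository to a sampling weight."""
    return LAYOUT_WEIGHTS[cmd_layout_status(paths)]
```

For example, a repository containing `cmd/server/main.go` is classified as "root", while one containing `internal/cmd/server/main.go` is classified as "nested".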
Supported Frameworks
GoLLM is trained on idiomatic usage of the following libraries:
| Framework | Purpose |
|---|---|
| github.com/gofiber/fiber/v2 | HTTP server (primary) |
| github.com/spf13/cobra | CLI applications |
| github.com/spf13/viper | Configuration |
| gorm.io/gorm | ORM + PostgreSQL |
| github.com/gin-gonic/gin | HTTP server (alternative) |
| github.com/labstack/echo | HTTP server (alternative) |
| github.com/go-chi/chi | Lightweight HTTP router |
| google.golang.org/grpc | gRPC services |
| github.com/stretchr/testify | Testing assertions |
| go.uber.org/zap | Structured logging |
| github.com/golang-jwt/jwt | JWT authentication |
| golang.org/x/crypto/bcrypt | Password hashing |
| github.com/rabbitmq/amqp091-go | RabbitMQ messaging |
| github.com/redis/go-redis/v9 | Redis client |
| github.com/prometheus/client_golang | Metrics |
| github.com/DATA-DOG/go-sqlmock | SQL mocking in tests |
Configuration
Training parameters can be set via environment variables, YAML configs, or Makefile overrides.
# Environment variables (all optional; defaults shown)
MODEL_SIZE=medium # small | medium | large | xl
BATCH_SIZE=32
MAX_STEPS=100000
WARMUP_STEPS=2000
GRAD_ACCUM=4
PRECISION=bfloat16 # float32 | float16 | bfloat16
GPUS=-1 # -1 = all GPUs, 0 = GPU 0 only
CKPT_DIR=checkpoints
LOG_DIR=logs
YAML configs for each size are in configs/:
# train from a YAML config
llm-go-train --config configs/large.yaml
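The WARMUP_STEPS and MAX_STEPS settings feed the cosine learning-rate schedule with warmup. A minimal sketch of such a schedule follows, assuming linear warmup from zero and decay to a minimum of zero; the peak learning rate is a hypothetical value because the README does not state one, and the function is not the project's CosineWithWarmup class.

```python
import math

WARMUP_STEPS = 2_000  # default from the configuration above
MAX_STEPS = 100_000   # default from the configuration above
PEAK_LR = 3e-4        # hypothetical: no learning rate is given in the README


def cosine_with_warmup(step: int, peak_lr: float = PEAK_LR,
                       warmup_steps: int = WARMUP_STEPS,
                       max_steps: int = MAX_STEPS, min_lr: float = 0.0) -> float:
    """Linear warmup to peak_lr, then cosine decay to min_lr at max_steps."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = min(1.0, (step - warmup_steps) / max(1, max_steps - warmup_steps))
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 1_000, 2_000, 51_000, 100_000):
        print(step, cosine_with_warmup(step))
```

The rate rises linearly for the first 2 000 steps, peaks at step 2 000, passes half of the peak at the midpoint of the decay (step 51 000), and reaches the minimum at step 100 000.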
Evaluation
Metrics computed by GoCodeEvaluator:
| Metric | Description |
|---|---|
| Perplexity | Cross-entropy exponentiated on the validation split |
| pass@k | Unbiased estimator of functional correctness (k=1,10,100) |
| gofmt pass rate | % of generated files that parse and format without error |
| BLEU | n-gram overlap vs. reference completions |
| ROUGE-L | Longest-common-subsequence F1 vs. references |
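Two of these metrics have compact closed forms. The sketch below shows the standard unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k) for n samples of which c pass, and perplexity as the exponential of the mean per-token negative log-likelihood. The function names are illustrative and are not claimed to match GoCodeEvaluator.

```python
import math


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k drawn samples passes.

    n: samples generated per problem, c: samples that pass the tests.
    """
    if k > n:
        raise ValueError("k must not exceed the number of samples n")
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def perplexity(token_nlls: list[float]) -> float:
    """Exponentiated mean negative log-likelihood (natural log) per token."""
    return math.exp(sum(token_nlls) / len(token_nlls))


if __name__ == "__main__":
    print(pass_at_k(n=10, c=3, k=1))
    print(perplexity([math.log(4)] * 3))
```

Reporting k=100 requires at least 100 samples per problem, since the estimator is only defined for k ≤ n.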
Deploying to Hugging Face
The uploader (HuggingFaceUploader) handles everything:
- Converts Keras weights → SafeTensors
- Writes config.json in GPT-2-compatible format
- Uploads PreTrainedTokenizerFast (usable with transformers)
- Generates a model card with usage examples
- Optionally creates a Gradio demo space
export HF_TOKEN=hf_...
export HF_REPO_ID=your-org/llm-go-350m
llm-go-deploy \
--ckpt-dir checkpoints/final \
--tok-dir data/tokenizer \
--repo-id "$HF_REPO_ID" \
--token "$HF_TOKEN" \
--public
Once uploaded, use the model from any Python environment:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("your-org/llm-go-350m")
model = AutoModelForCausalLM.from_pretrained("your-org/llm-go-350m")
inputs = tokenizer("package main\n\nfunc main() {", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0]))
Development
Run tests
make test
# or
pytest tests/ -v --cov=llm_go --cov-report=term-missing
Lint and format
make lint # ruff + mypy
make fmt # black + ruff --fix
Pre-commit hooks
pre-commit install
GPU setup (NVIDIA)
pip install -r requirements-gpu.txt
# verify
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
License
Apache 2.0; see LICENSE.
Patterns derived from real-world Go projects are used for educational and model-training purposes only. All generated code is original output of the model.