How to use sscollab2/gemma3_checkpoint_step100 with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("google/gemma-3-4b-it")
model = PeftModel.from_pretrained(base_model, "sscollab2/gemma3_checkpoint_step100")
Gemma 3 Checkpoint Step 100
This repository contains a LoRA adapter checkpoint trained for google/gemma-3-4b-it.
Files
- adapter_model.safetensors: LoRA adapter weights
- adapter_config.json: PEFT adapter configuration
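Before serving, it can help to confirm that the adapter configuration matches the serving settings. PEFT stores the LoRA rank under the `r` key of adapter_config.json and the base model under `base_model_name_or_path`, and vLLM rejects an adapter whose rank exceeds `--max-lora-rank`. The following is a minimal sketch, assuming adapter_config.json has been downloaded locally; the helper names `check_adapter_config` and `load_adapter_config` are hypothetical, and the actual values in this repository's config are not stated here.

```python
import json
from pathlib import Path


def check_adapter_config(config: dict, expected_base: str, max_lora_rank: int) -> list[str]:
    """Return human-readable problems; an empty list means the config looks compatible."""
    problems = []

    # "r" is the LoRA rank field written by PEFT's LoraConfig.
    rank = config.get("r")
    if rank is None:
        problems.append("config has no 'r' (LoRA rank) field")
    elif rank > max_lora_rank:
        problems.append(f"adapter rank {rank} exceeds --max-lora-rank {max_lora_rank}")

    # The recorded base model may be a local path rather than a Hub ID,
    # so a mismatch here is a hint to look closer, not proof of an error.
    base = config.get("base_model_name_or_path")
    if base != expected_base:
        problems.append(f"adapter was trained for {base!r}, not {expected_base!r}")

    return problems


def load_adapter_config(path: str) -> dict:
    """Read adapter_config.json from a local path."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

For example, `check_adapter_config(load_adapter_config("adapter_config.json"), "google/gemma-3-4b-it", 16)` returns an empty list when the rank is at most 16 and the recorded base model matches.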
Serving with vLLM
This adapter can be served with vLLM by loading the Gemma 3 base model and enabling the LoRA module from this repository.
# Port and GPU index used by the server
PORT=8071
GPU=0
# Base model the adapter was trained for
MODEL_ID=google/gemma-3-4b-it
# Name under which the LoRA adapter is exposed (the base model is served as gemma3_base)
SERVED_MODEL_NAME=gemma3_with_reasoning
ADAPTER_REPO=sscollab2/gemma3_checkpoint_step100
# --max-lora-rank must be at least the adapter's LoRA rank
CUDA_VISIBLE_DEVICES="$GPU" vllm serve "$MODEL_ID" \
  --host 0.0.0.0 \
  --port "$PORT" \
  --tensor-parallel-size 1 \
  --gpu-memory-utilization 0.90 \
  --max-model-len 32768 \
  --served-model-name gemma3_base \
  --enable-lora \
  --lora-modules "${SERVED_MODEL_NAME}=${ADAPTER_REPO}" \
  --max-lora-rank 16 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes \
  --limit-mm-per-prompt '{"image":10,"audio":0}'
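Loading the weights can take a while, so a client may want to wait until the adapter is actually listed before sending requests. The sketch below polls the OpenAI-compatible `/v1/models` endpoint and checks for the adapter's served name. It assumes that vLLM lists registered LoRA modules alongside the base model in that response; the helper names `served_model_ids` and `wait_for_model` and the timeout defaults are hypothetical.

```python
import json
import time
import urllib.request


def served_model_ids(models_response: dict) -> list[str]:
    """Extract model IDs from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in models_response.get("data", [])]


def wait_for_model(base_url: str, model_id: str,
                   timeout_s: float = 600.0, poll_s: float = 5.0) -> bool:
    """Poll /v1/models until model_id is listed or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/v1/models", timeout=10) as resp:
                if model_id in served_model_ids(json.load(resp)):
                    return True
        except (OSError, ValueError):
            # OSError covers refused connections and HTTP errors while the
            # server starts; ValueError covers a body that is not valid JSON.
            pass
        time.sleep(poll_s)
    return False
```

With the settings above, `wait_for_model("http://127.0.0.1:8071", "gemma3_with_reasoning")` would return True once the adapter is available and False if the timeout passes first.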
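Once the server is ready, send requests to the adapter's served name (gemma3_with_reasoning); requests to gemma3_base go to the base model without the adapter: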
curl http://127.0.0.1:8071/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gemma3_with_reasoning",
"messages": [
{"role": "user", "content": "Hello!"}
]
}'
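The same request can be sent from Python without extra dependencies. This is a minimal sketch using only the standard library, assuming the server from the command above is reachable at 127.0.0.1:8071 and returns the usual OpenAI-style chat completion body; the helper names `build_chat_request` and `chat` are hypothetical.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_text: str) -> urllib.request.Request:
    """Build a POST request for the /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url: str, model: str, user_text: str) -> str:
    """Send one user message and return the assistant's reply text."""
    request = build_chat_request(base_url, model, user_text)
    with urllib.request.urlopen(request, timeout=120) as resp:
        body = json.load(resp)
    # OpenAI-style responses put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

Calling `chat("http://127.0.0.1:8071", "gemma3_with_reasoning", "Hello!")` mirrors the curl command; passing "gemma3_base" instead would query the base model without the adapter.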
For the local serving script this was based on, see:
/local3/elaine1wan/SS_inference/SS_inference_0507/gemma3_scripts/run_serve_gemma3_checkpoint.sh