Instructions for using felipecmarins/gemma3-4b-algebra4-merged with libraries and local apps.

Libraries

How to use felipecmarins/gemma3-4b-algebra4-merged with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

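# The messages below include an image, so the multimodal "image-text-to-text"
# task is used; it matches AutoModelForImageTextToText in the direct-load example.
pipe = pipeline("image-text-to-text", model="felipecmarins/gemma3-4b-algebra4-merged")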
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("felipecmarins/gemma3-4b-algebra4-merged")
model = AutoModelForImageTextToText.from_pretrained("felipecmarins/gemma3-4b-algebra4-merged")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use felipecmarins/gemma3-4b-algebra4-merged with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "felipecmarins/gemma3-4b-algebra4-merged"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "felipecmarins/gemma3-4b-algebra4-merged",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
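
The same OpenAI-compatible endpoint can also be called from Python. The sketch below uses only the standard library and mirrors the curl request above; the helper names `build_chat_request` and `ask`, and the `max_tokens` default, are illustrative assumptions rather than part of vLLM or of this repository.

```python
import json
import urllib.request

MODEL_ID = "felipecmarins/gemma3-4b-algebra4-merged"


def build_chat_request(prompt, base_url="http://localhost:8000", max_tokens=400):
    """Build (without sending) a POST request for /v1/chat/completions."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # assumed default, adjust as needed
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request to a running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (needs a running server):
#   print(ask("Find the eigenvalues of A = [[2,1],[1,2]]."))
```

For a server listening on another port (for example the SGLang server on port 30000), pass `base_url="http://localhost:30000"`.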


SGLang

How to use felipecmarins/gemma3-4b-algebra4-merged with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "felipecmarins/gemma3-4b-algebra4-merged" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "felipecmarins/gemma3-4b-algebra4-merged",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "felipecmarins/gemma3-4b-algebra4-merged" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "felipecmarins/gemma3-4b-algebra4-merged",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use felipecmarins/gemma3-4b-algebra4-merged with Docker Model Runner:
```
docker model run hf.co/felipecmarins/gemma3-4b-algebra4-merged
```

gemma3-4b-algebra4-merged: Gemma 3 4B fine-tuned on Linear Algebra

Merged model (base + LoRA) in BF16. Intended for use with transformers, further fine-tuning, or conversion to other formats. For direct use on a phone, prefer gemma3-4b-algebra4-gguf.

What it is

The result of merging the LoRA adapter trained on algebra4-mix into google/gemma-3-4b-it with peft's PeftModel.merge_and_unload(). The weights are in BF16, ready for inference with transformers or for requantization into another format (AWQ, GPTQ, EXL2, GGUF, LiteRT).
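
A minimal sketch of such a merge is shown below. It is not the author's actual script: the function name `merge_lora_into_base`, the choice of `AutoModelForCausalLM` as the loading class, and the output directory are assumptions, and the adapter must have been trained against the same model class for the weights to line up. The imports are placed inside the function so the block can be read and loaded without the heavy dependencies installed.

```python
BASE_ID = "google/gemma-3-4b-it"
ADAPTER_ID = "felipecmarins/gemma3-4b-algebra4-lora"


def merge_lora_into_base(base_id=BASE_ID, adapter_id=ADAPTER_ID, out_dir="merged-bf16"):
    """Fold a LoRA adapter into its base model and save BF16 safetensors.

    Assumption: `out_dir` and the Auto class used here are illustrative.
    """
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base weights on CPU in BF16 to avoid needing a GPU for the merge.
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.bfloat16,
        device_map="cpu",
        low_cpu_mem_usage=True,
    )
    # Attach the adapter, then fold its low-rank updates into the base weights.
    merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()

    # Save sharded safetensors plus the tokenizer so the folder is self-contained.
    merged.save_pretrained(out_dir, safe_serialization=True, max_shard_size="5GB")
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
    return out_dir
```

Calling `merge_lora_into_base()` downloads both repositories and needs enough CPU RAM to hold the full BF16 model.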

Usage with transformers

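from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

mid = "felipecmarins/gemma3-4b-algebra4-merged"
tok = AutoTokenizer.from_pretrained(mid)
model = AutoModelForCausalLM.from_pretrained(mid, torch_dtype=torch.bfloat16, device_map="auto")

# Render the chat template to a string; the prompt is Portuguese for
# "Find the eigenvalues of A = [[2,1],[1,2]]."
prompt = tok.apply_chat_template(
    [{"role": "user", "content": "Encontre os autovalores de A = [[2,1],[1,2]]."}],
    tokenize=False, add_generation_prompt=True,
)
# The rendered template already starts with <bos>, so skip special tokens here
# to avoid a duplicated BOS token.
ids = tok(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
# Greedy decoding; print only the newly generated tokens.
out = model.generate(**ids, max_new_tokens=400, do_sample=False)
print(tok.decode(out[0][ids.input_ids.shape[1]:], skip_special_tokens=True))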
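
To sanity-check the model's answer to the example prompt, the eigenvalues of A = [[2,1],[1,2]] can be computed directly. The sketch below is an independent check, not part of the model or its training data; the helper name `eigenvalues_2x2_symmetric` is illustrative. It solves the characteristic polynomial λ² − tr(A)·λ + det(A) = 0 for a symmetric 2×2 matrix.

```python
import math


def eigenvalues_2x2_symmetric(a, b, d):
    """Eigenvalues of [[a, b], [b, d]], returned in ascending order."""
    trace = a + d
    det = a * d - b * b
    # The discriminant is non-negative for any real symmetric matrix.
    disc = math.sqrt(trace * trace - 4 * det)
    return ((trace - disc) / 2, (trace + disc) / 2)


print(eigenvalues_2x2_symmetric(2, 1, 2))  # (1.0, 3.0)
```

A correct model response should therefore report the eigenvalues 1 and 3.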

How it was produced

Step	Command · result
1. QLoRA training	`transformers 5.9 + peft 0.19 + trl 1.4 + bnb 0.49` on an L4 24 GB; NF4 + LoRA r=16
2. LoRA merge	`PeftModel.merge_and_unload()` on CPU (`device_map="cpu"`, `low_cpu_mem_usage=True`)
3. Save	`save_pretrained(..., safe_serialization=True, max_shard_size="5GB")` → 2 shards

Final size on disk: 8.1 GB (model-00001-of-00002 = 4.7 GB · model-00002-of-00002 = 3.4 GB).
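
As a rough consistency check on these sizes, BF16 stores each parameter in 2 bytes, so the size on disk gives an estimate of the parameter count. The sketch assumes that "GB" means 10^9 bytes and ignores safetensors metadata, so the result is approximate.

```python
BYTES_PER_BF16 = 2  # bfloat16 uses 16 bits per parameter

shards_gb = [4.7, 3.4]  # shard sizes quoted above
total_gb = sum(shards_gb)

# Assumption: 1 GB = 1e9 bytes, file metadata overhead ignored.
approx_params = total_gb * 1e9 / BYTES_PER_BF16

print(f"total = {total_gb:.1f} GB, ~{approx_params / 1e9:.2f}B parameters")
# total = 8.1 GB, ~4.05B parameters
```

The estimate is in line with a model of roughly 4B parameters.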

Training summary

Mix: 50 k samples (out of 495 k available) from algebra4-mix
Hyperparameters: lr=1e-4 with cosine schedule · 3 % warmup · batch size 1 × gradient accumulation 32 · max_seq 1024 · 1 epoch
Hardware: NVIDIA L4 24 GB in Mumbai (asia-south1-b), the only zone with L4 stock at the time because of global demand
Duration: ~14 h · final train_loss = 0.74
Cost: ~US$ 12 (free GCP credits)
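
The hyperparameters above imply an effective batch size and an optimizer step count, which can be worked out as follows. This is a back-of-the-envelope sketch that assumes a single GPU, no sequence packing, and a final partial batch that still counts as a step; the actual trainer may round differently.

```python
import math

samples = 50_000
per_device_batch = 1
grad_accum = 32
epochs = 1
warmup_ratio = 0.03
duration_h = 14  # approximate wall-clock time quoted above

# One optimizer step consumes batch_size * grad_accum samples.
effective_batch = per_device_batch * grad_accum
optimizer_steps = math.ceil(samples / effective_batch) * epochs
warmup_steps = math.ceil(optimizer_steps * warmup_ratio)
seconds_per_step = duration_h * 3600 / optimizer_steps

print(effective_batch, optimizer_steps, warmup_steps)  # 32 1563 47
print(f"about {seconds_per_step:.0f} s per optimizer step")
```

Under these assumptions each optimizer step takes roughly half a minute on the L4.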

Related components

Repo	Contents
`felipecmarins/gemma3-4b-algebra4-merged`	This repo: merged BF16 weights
`felipecmarins/gemma3-4b-algebra4-lora`	Adapters only (65 MB), for those who prefer to apply them dynamically at runtime
`felipecmarins/gemma3-4b-algebra4-gguf`	Q4_0 (2.3 GB), for mobile phones
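
The merged and adapter-only repos are two packagings of the same computation. For a single linear layer, applying LoRA at runtime computes W·x + s·B·(A·x), while merging precomputes W' = W + s·B·A once. The toy example below illustrates this equivalence in plain Python; the matrices, the rank of 1, and the scale `s` are made-up values for illustration (the actual adapter uses r=16 and its scaling factor is not stated here).

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply matrix a by matrix b (both lists of rows)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


W = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight (2x2), hypothetical values
A = [[1.0, 2.0]]              # LoRA down-projection (rank 1 x 2)
B = [[0.5], [-1.0]]           # LoRA up-projection (2 x rank 1)
s = 2.0                       # hypothetical LoRA scale (alpha / r)
x = [3.0, 4.0]

# Runtime application: keep W frozen and add the low-rank path per call.
runtime_y = [w + s * d for w, d in zip(matvec(W, x), matvec(B, matvec(A, x)))]

# Merged application: fold the update into the weight once, then use it alone.
delta = matmul(B, A)
W_merged = [[w + s * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]
merged_y = matvec(W_merged, x)

print(runtime_y, merged_y)  # [14.0, -18.0] [14.0, -18.0]
```

The merged form avoids the extra matrix products at inference time, while the adapter form keeps the download small and leaves the base weights untouched.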

License

Gemma Terms of Use.

Downloads last month: 24

Safetensors: model size 4B params · tensor type BF16

Model tree for felipecmarins/gemma3-4b-algebra4-merged: google/gemma-3-4b-pt (base model) → google/gemma-3-4b-it (finetuned) → this model (finetuned)