🧠 AlterEgo-373M - GGUF

GGUF builds of a 373M language model designed, trained, and served entirely from scratch.

GGUF quantizations of jbomdev/AlterEgo, a 373M-parameter decoder-only model built from the ground up: architecture, training, tokenizer, and inference all written from scratch. For the full story, including architecture, training curves, hyperparameters, and benchmarks, see the main model card.

Run it with Ollama (one command)

ollama run hf.co/jbomdev/AlterEgo-GGUF:Q8_0

Swap the tag for any quant in the table (:Q4_K_M, :F16). The ChatML template, stop tokens, and sampling defaults are applied automatically from the GGUF metadata and the params file in this repo.

Run it with llama.cpp

llama-cli -hf jbomdev/AlterEgo-GGUF:Q8_0 -p "Tell me about the ocean."

Quantizations

| File | Quant | Size | Notes |
|---|---|---|---|
| alterego-Q8_0.gguf | Q8_0 | ~0.4 GB | Recommended. Near-lossless, still tiny. |
| alterego-Q4_K_M.gguf | Q4_K_M | ~0.25 GB | Smallest. Some quality loss, more noticeable on a model this small. |
| alterego-F16.gguf | F16 | ~0.75 GB | Unquantized 16-bit floats, highest quality of the three. |

AlterEgo is small enough that Q8_0 (or even F16) runs comfortably on any laptop, and at this scale those preserve quality better than aggressive 4-bit quantization. Reach for Q4_K_M only if you want the smallest possible download.
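
As a rough sanity check on the sizes in the table, file size scales with parameter count times bits per weight. The sketch below assumes 8.5 bits per weight for Q8_0 (32 int8 values plus one 16-bit scale per block) and 16 for F16; the 4.85 figure for Q4_K_M is only an approximation of its mixed-precision layout, and the helper name `estimate_size_gb` is hypothetical. It ignores metadata and any tensors kept at higher precision, so real files can come out somewhat larger.

```python
# Approximate bits stored per weight for each quant type.
BITS_PER_WEIGHT = {
    "F16": 16.0,     # plain 16-bit floats
    "Q8_0": 8.5,     # 32 int8 weights + one fp16 scale per block
    "Q4_K_M": 4.85,  # assumption: rough average for a mixed 4/6-bit layout
}


def estimate_size_gb(n_params: float, quant: str) -> float:
    """Return an approximate file size in GB (10^9 bytes) for the weights alone."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


# 373M parameters, as stated for AlterEgo.
for quant in ("Q8_0", "Q4_K_M", "F16"):
    print(f"{quant}: ~{estimate_size_gb(373e6, quant):.2f} GB")
```

The gap between the Q4_K_M and Q8_0 estimates is well under 0.2 GB, which is why the smaller download rarely justifies the quality loss at this scale.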

Recommended generation settings

These are the defaults AlterEgo was tuned and served with in LLME:

| Parameter | Value |
|---|---|
| temperature | 0.7 |
| top_k | 50 |
| top_p | 1.0 |
| repeat_penalty | 1.1 |
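
If you call the model through Ollama's HTTP API rather than the CLI, the same settings can be passed explicitly. This is a minimal sketch, assuming a local Ollama server on its default port (11434) and the Q8_0 tag pulled as shown earlier; `build_payload` and `generate` are hypothetical helper names, not part of this repo.

```python
import json
import urllib.request

# Sampling defaults from the table above.
RECOMMENDED_OPTIONS = {
    "temperature": 0.7,
    "top_k": 50,
    "top_p": 1.0,
    "repeat_penalty": 1.1,
}


def build_payload(prompt: str,
                  model: str = "hf.co/jbomdev/AlterEgo-GGUF:Q8_0",
                  **overrides) -> dict:
    """Build a request body for Ollama's /api/generate endpoint."""
    # Copy the defaults so per-call overrides never mutate them.
    options = {**RECOMMENDED_OPTIONS, **overrides}
    return {"model": model, "prompt": prompt, "stream": False, "options": options}


def generate(prompt: str, host: str = "http://localhost:11434", **overrides) -> str:
    """Send the prompt to a running Ollama server and return the generated text.

    Assumption: the server is reachable at `host` and the model is available.
    """
    body = json.dumps(build_payload(prompt, **overrides)).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=body,  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the server running, `generate("Tell me about the ocean.")` uses the defaults, and `generate("Tell me about the ocean.", temperature=0.5)` overrides a single value.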

Chat format

AlterEgo uses ChatML, and stops on <|im_end|> or <|endoftext|>:

<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
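
Ollama and llama.cpp apply this template for you, but if you drive the model through a raw completion interface you need to build the prompt yourself. The sketch below follows the template above; the helper names are hypothetical, and dropping the system block when no system prompt is given is an assumption rather than something this card specifies.

```python
# Stop sequences listed above.
STOP_TOKENS = ("<|im_end|>", "<|endoftext|>")


def build_prompt(user_message: str, system_prompt: str = "") -> str:
    """Format a single-turn ChatML prompt that ends at the assistant turn."""
    parts = []
    if system_prompt:  # assumption: omit the system block when it is empty
        parts.append(f"<|im_start|>system\n{system_prompt}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user_message}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


def truncate_at_stop(text: str) -> str:
    """Cut generated text at the earliest stop token, if one appears."""
    cut = len(text)
    for token in STOP_TOKENS:
        index = text.find(token)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]
```

A raw-completion client would send `build_prompt(...)` as the prompt and pass the generated text through `truncate_at_stop` if the runtime does not already stop on these tokens.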

Limitations

A 373M model on a modest token budget behaves like one: it can be factually wrong, repeat itself, and lose coherence on long prompts. English only. Not safety- or preference-tuned. See the main model card for details.

License

Apache 2.0, same as the base model.
