.ent SmolLM Entity v2

.ent SmolLM Entity v2 is an experimental entity-conditioned language model built on top of HuggingFaceTB/SmolLM2-1.7B-Instruct.

This repository does not contain a full standalone Hugging Face model checkpoint. It contains the pieces needed by the .ent architecture:

- a LoRA adapter for the SmolLM2 base model
- an `entity_proj.safetensors` projection layer
- an `entity_decoder.safetensors` entity embedding/structure module

The intended load path is the .ent wrapper in this project, which reconstructs the full model by combining:

- HuggingFaceTB/SmolLM2-1.7B-Instruct
- the LoRA adapter in `lora/`
- the entity decoder
- the entity projection layer
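
Before combining these pieces, it can help to confirm that the adapter was trained against the base model named above. The following is a minimal sketch, assuming `lora/adapter_config.json` uses the usual PEFT LoRA field names (`base_model_name_or_path`, `peft_type`, `r`, `lora_alpha`, `target_modules`). The helper names `summarize_adapter_config` and `targets_expected_base` are hypothetical and not part of the .ent project.

```python
import json
from pathlib import Path

# Base model named in this model card.
EXPECTED_BASE = "HuggingFaceTB/SmolLM2-1.7B-Instruct"


def summarize_adapter_config(config_path):
    """Return the main LoRA settings from a PEFT adapter_config.json.

    Missing keys come back as None instead of raising.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return {
        "base_model": config.get("base_model_name_or_path"),
        "peft_type": config.get("peft_type"),
        "rank": config.get("r"),
        "alpha": config.get("lora_alpha"),
        "target_modules": config.get("target_modules"),
    }


def targets_expected_base(summary, expected=EXPECTED_BASE):
    """True when the adapter declares the base model this card names."""
    return summary["base_model"] == expected
```

A typical call would be `summarize_adapter_config("lora/adapter_config.json")` from a local copy of the repository. If the adapter was trained from a local path rather than a Hub ID, the recorded base model may differ in spelling even though the weights match, so a `False` result is a prompt to look closer rather than proof of a mismatch.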

What This Model Is

The model augments a standard causal LM with a parallel entity stream:

- tokens are mapped to hashed entity IDs
- entity IDs are decoded into learned entity embeddings
- entity embeddings are projected into the LM hidden space
- projected entity features are added to token embeddings before generation
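
The four steps above can be sketched in plain Python on toy sizes. This is an illustration only, not the .ent implementation: the hashing scheme (SHA-256 of the token string, reduced modulo a bucket count), the dimensions, the random stand-in weights, and every function name are assumptions made for the example.

```python
import hashlib
import random

NUM_ENTITIES = 1024  # hypothetical size of the hashed entity vocabulary
ENTITY_DIM = 4       # hypothetical entity embedding width
HIDDEN_DIM = 6       # hypothetical LM hidden width (toy value)


def entity_id(token, num_entities=NUM_ENTITIES):
    """Map a token string to a stable bucket index (assumed hashing scheme)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_entities


def random_matrix(rows, cols, rng):
    """Stand-in for learned weights: small random values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def project(vector, weight):
    """Multiply a vector of length in_dim by a matrix of shape (in_dim, out_dim)."""
    return [
        sum(v * row[j] for v, row in zip(vector, weight))
        for j in range(len(weight[0]))
    ]


def add_entity_features(tokens, token_embeddings, entity_table, proj_weight):
    """Add projected entity embeddings to token embeddings, position by position."""
    combined = []
    for token, tok_emb in zip(tokens, token_embeddings):
        ent_emb = entity_table[entity_id(token)]    # entity ID -> entity embedding
        ent_hidden = project(ent_emb, proj_weight)  # entity space -> hidden space
        combined.append([t + e for t, e in zip(tok_emb, ent_hidden)])
    return combined


rng = random.Random(0)
entity_table = random_matrix(NUM_ENTITIES, ENTITY_DIM, rng)
proj_weight = random_matrix(ENTITY_DIM, HIDDEN_DIM, rng)
tokens = ["Paris", "is", "in", "France"]
token_embeddings = random_matrix(len(tokens), HIDDEN_DIM, rng)
inputs_embeds = add_entity_features(tokens, token_embeddings, entity_table, proj_weight)
```

In this sketch `inputs_embeds` has one `HIDDEN_DIM`-wide vector per token, which is the shape a causal LM would consume in place of its plain token embeddings. Because the IDs come from a hash, distinct tokens can share an entity embedding when they collide.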

In the current .ent system, this model is used as one component inside a broader inference engine that includes:

- abstraction layers
- graph-based reasoning
- working memory
- durable semantic/procedural memory
- simple program-like execution for arithmetic/logic tasks

Files

- `lora/adapter_model.safetensors`: LoRA weights for SmolLM2
- `lora/adapter_config.json`: PEFT adapter configuration
- `entity_proj.safetensors`: learned entity-to-hidden projection
- `entity_decoder.safetensors`: entity embedding/decoder module
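
Since loading depends on all four files being present, a quick check of a local copy can save a failed load. The sketch below uses only the standard library and the relative paths from the file list above. The helper name `missing_ent_files` is hypothetical.

```python
from pathlib import Path

# Relative paths taken from the file list above.
EXPECTED_FILES = (
    "lora/adapter_model.safetensors",
    "lora/adapter_config.json",
    "entity_proj.safetensors",
    "entity_decoder.safetensors",
)


def missing_ent_files(root):
    """Return the expected files that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

For example, `missing_ent_files("architecture/smolm-entity-v2")` returns an empty list when the directory holds the complete checkpoint, and otherwise lists the relative paths that still need to be downloaded.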

Intended Use

This checkpoint is intended for:

- experiments with entity-conditioned generation
- .ent inference and evaluation
- research on structured inference over hashed entities

It is not a drop-in replacement for a standard text-generation model; it works as intended only when loaded through the .ent loading code.

Loading

Project-native loading

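from transformers import AutoTokenizer
# EntitySmolWrapper comes from the .ent project, not from Transformers.
from ent.training.train import EntitySmolWrapper

base_model = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model)
# Reuse the end-of-sequence token for padding.
tokenizer.pad_token = tokenizer.eos_token

# Rebuild the full model from the base model, the LoRA adapter, and the entity modules.
model = EntitySmolWrapper.from_pretrained(
    path="architecture/smolm-entity-v2",  # directory holding this checkpoint's files
    base_model_name=base_model,
    device="cpu",
    tokenizer=tokenizer,
)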

lm_eval / `.ent` evaluation

This repository is also used through the .ent evaluation wrapper:

modal run ent/training/eval.py --model-path /data/output/smolm-entity-v2/final --tag ent-v2

Training Summary

This checkpoint was produced by fine-tuning SmolLM2 with:

- frozen entity decoder
- learned entity projection layer
- LoRA adapters on the language model
- entity-conditioned generation through hashed token/entity features

The training code lives in:

- `ent/training/train.py`
- `ent/training/modal_train.py`

Limitations

- This is an experimental checkpoint.
- It depends on external .ent loading code.
- It is not a fully packaged standalone Transformers model repository.
- The broader .ent system is still evolving, especially its inference, memory, and graph-reasoning components.

Evaluation Context

This model has been evaluated inside the .ent inference stack rather than only as a raw decoder. The broader project is moving toward an explicit inference architecture instead of relying purely on single-pass generation.

Citation

If you use this checkpoint, cite the repository/project that introduced the .ent architecture and this entity-conditioned SmolLM variant.
