.ent SmolLM Entity v2
.ent SmolLM Entity v2 is an experimental entity-conditioned language model built on top of HuggingFaceTB/SmolLM2-1.7B-Instruct.
This repository does not contain a full standalone Hugging Face model checkpoint. It contains the pieces needed by the .ent architecture:
- a LoRA adapter for the SmolLM2 base model
- an `entity_proj.safetensors` projection layer
- an `entity_decoder.safetensors` entity embedding/structure module
The intended load path is the .ent wrapper in this project, which reconstructs the full model by combining:
- `HuggingFaceTB/SmolLM2-1.7B-Instruct`
- the LoRA adapter in `lora/`
- the entity decoder
- the entity projection layer
What This Model Is
The model augments a standard causal LM with a parallel entity stream:
- tokens are mapped to hashed entity IDs
- entity IDs are decoded into learned entity embeddings
- entity embeddings are projected into the LM hidden space
- projected entity features are added to token embeddings before generation
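The four steps above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the project's implementation: the SHA-256 hash, the table sizes, the random weights, and the names `entity_id`, `project`, and `condition` are all hypothetical, and a real model would use tensors rather than lists.

```python
import hashlib
import random

NUM_ENTITIES = 1024  # assumed size of the hashed entity ID space
ENTITY_DIM = 4       # assumed entity embedding width (toy value)
HIDDEN_DIM = 8       # assumed LM hidden width (toy value)


def entity_id(token: str, num_entities: int = NUM_ENTITIES) -> int:
    """Step 1: map a token string to a stable hashed entity ID."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_entities


# Stand-ins for learned weights, randomly initialised for the sketch.
rng = random.Random(0)
entity_table = [[rng.gauss(0, 0.02) for _ in range(ENTITY_DIM)]
                for _ in range(NUM_ENTITIES)]
proj = [[rng.gauss(0, 0.02) for _ in range(ENTITY_DIM)]
        for _ in range(HIDDEN_DIM)]


def project(entity_vec):
    """Step 3: linear map from entity space into the LM hidden space."""
    return [sum(w * x for w, x in zip(row, entity_vec)) for row in proj]


def condition(tokens, token_embeddings):
    """Steps 1-4: add projected entity features to each token embedding."""
    conditioned = []
    for token, embedding in zip(tokens, token_embeddings):
        entity_vec = entity_table[entity_id(token)]  # step 2: embedding lookup
        entity_hidden = project(entity_vec)          # step 3: projection
        conditioned.append([e + h for e, h in zip(embedding, entity_hidden)])  # step 4
    return conditioned
```

Because the ID is a deterministic hash of the token, repeated tokens receive the same entity features wherever they appear in the sequence.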
In the current .ent system, this model is used as one component inside a broader inference engine that includes:
- abstraction layers
- graph-based reasoning
- working memory
- durable semantic/procedural memory
- simple program-like execution for arithmetic/logic tasks
Files
- `lora/adapter_model.safetensors`: LoRA weights for SmolLM2
- `lora/adapter_config.json`: PEFT adapter configuration
- `entity_proj.safetensors`: learned entity-to-hidden projection
- `entity_decoder.safetensors`: entity embedding/decoder module
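Before loading, it can help to confirm that a local copy of the repository contains all four files. A minimal sketch, assuming the repository has already been downloaded to a local directory; `missing_files` is a hypothetical helper and not part of the .ent code.

```python
from pathlib import Path

# File layout as listed in this model card.
EXPECTED_FILES = (
    "lora/adapter_model.safetensors",
    "lora/adapter_config.json",
    "entity_proj.safetensors",
    "entity_decoder.safetensors",
)


def missing_files(repo_dir: str) -> list[str]:
    """Return the expected files that are absent from repo_dir."""
    root = Path(repo_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list from `missing_files(...)` means every component the wrapper needs is present; any names returned point at the piece that is missing.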
Intended Use
This checkpoint is intended for:
- experiments with entity-conditioned generation
- `.ent` inference and evaluation
- research on structured inference over hashed entities
It is not intended as a drop-in replacement for a normal text-generation model unless you also use the .ent loading code.
Loading
Project-native loading
```python
from transformers import AutoTokenizer
from ent.training.train import EntitySmolWrapper

base_model = "HuggingFaceTB/SmolLM2-1.7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(base_model)
# Reuse the EOS token for padding.
tokenizer.pad_token = tokenizer.eos_token

# Rebuild the full model from the base LM plus the adapter and entity modules under `path`.
model = EntitySmolWrapper.from_pretrained(
    path="architecture/smolm-entity-v2",
    base_model_name=base_model,
    device="cpu",
    tokenizer=tokenizer,
)
```
lm_eval / .ent evaluation
This repository is also used through the .ent evaluation wrapper:
```shell
modal run ent/training/eval.py --model-path /data/output/smolm-entity-v2/final --tag ent-v2
```
Training Summary
This checkpoint was produced by fine-tuning SmolLM2 with:
- frozen entity decoder
- learned entity projection layer
- LoRA adapters on the language model
- entity-conditioned generation through hashed token/entity features
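The split between frozen and trained components can be expressed as a rule over parameter names. The sketch below is illustrative only: the `entity_decoder.` and `entity_proj.` prefixes and the example names are assumptions about how the modules might be registered, while `lora_A` / `lora_B` are the names PEFT gives to LoRA matrices.

```python
def is_trainable(param_name: str) -> bool:
    """Decide whether a parameter is updated during fine-tuning.

    Assumed naming (hypothetical prefixes):
    - "entity_decoder." : frozen entity decoder
    - "entity_proj."    : learned entity projection layer
    - "lora_A"/"lora_B" : LoRA matrices injected into the language model
    All remaining base-model weights stay frozen.
    """
    if param_name.startswith("entity_decoder."):
        return False
    if param_name.startswith("entity_proj."):
        return True
    return "lora_A" in param_name or "lora_B" in param_name


# Hypothetical parameter names, for illustration only.
example_names = [
    "entity_decoder.embedding.weight",
    "entity_proj.weight",
    "lm.layers.0.self_attn.q_proj.lora_A.default.weight",
    "lm.layers.0.self_attn.q_proj.base_layer.weight",
]
trainable = [name for name in example_names if is_trainable(name)]
```

In a PyTorch training script, a rule like this would typically be applied by setting `param.requires_grad = is_trainable(name)` for each pair from `model.named_parameters()`.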
The training code lives in:
- `ent/training/train.py`
- `ent/training/modal_train.py`
Limitations
- This is an experimental checkpoint.
- It depends on external `.ent` loading code.
- It is not a fully packaged standalone Transformers model repository.
- The broader `.ent` system is still evolving, especially its inference, memory, and graph-reasoning components.
Evaluation Context
This model has been evaluated inside the .ent inference stack rather than only as a raw decoder. The broader project is moving toward an explicit inference architecture instead of relying purely on single-pass generation.
Citation
If you use this checkpoint, cite the repository/project that introduced the .ent architecture and this entity-conditioned SmolLM variant.