Mosslight 4B
Mosslight 4B is a fine-tuned, merged derivative of Qwen3.5-4B, packaged in Hugging Face Transformers format for local inference, serving, and downstream experimentation.
This repository contains the model weights, tokenizer, chat template, and multimodal preprocessor files needed to load the model with compatible Qwen3.5 tooling.
Model Details
- Model name: Mosslight 4B
- Model ID: ttrpg/mosslight-4b
- Base model: Qwen/Qwen3.5-4B
- Derivative type: fine-tuned and merged full-weight release
- Architecture: Qwen3_5ForConditionalGeneration
- Model type: vision-language causal generation
- Parameters: approximately 4B
- Native context length: 262,144 tokens, as inherited from the base config
- License: Apache 2.0, inherited from the base model
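One way to confirm that a downloaded copy matches the details above is to inspect its config.json. The following is a minimal sketch, not part of the release: the helper names check_architecture and load_config and the local path in the usage comment are hypothetical, and the check assumes the usual Hugging Face convention of listing class names under an "architectures" key.

```python
import json
from pathlib import Path

# Architecture name taken from the Model Details list above.
EXPECTED_ARCHITECTURE = "Qwen3_5ForConditionalGeneration"


def check_architecture(config: dict, expected: str = EXPECTED_ARCHITECTURE) -> bool:
    """Return True if the config lists the expected architecture class."""
    # Hugging Face configs conventionally store class names in an
    # "architectures" list; a missing key is treated as a mismatch.
    return expected in config.get("architectures", [])


def load_config(model_dir: str) -> dict:
    """Read config.json from a local model directory (hypothetical path)."""
    return json.loads(Path(model_dir, "config.json").read_text(encoding="utf-8"))


# Usage (assumes the repository was downloaded to ./mosslight-4b):
# print(check_architecture(load_config("./mosslight-4b")))
```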
Lineage
This model is a fine-tuned, merged derivative of Qwen3.5-4B from Alibaba
Cloud/Qwen. The original Apache 2.0 license is preserved in LICENSE, and
derivative attribution is documented in NOTICE.
Training and merge details should be completed before publishing a final public version.
Training Details
- Base checkpoint: Qwen/Qwen3.5-4B
- Fine-tuning method: TODO
- Training data: TODO
- Merge method: TODO
- Output format: merged full weights in sharded Safetensors format
- Post-training evaluation: TODO
Files
- config.json: model architecture and multimodal configuration.
- model.safetensors-00001-of-00002.safetensors
- model.safetensors-00002-of-00002.safetensors
- model.safetensors.index.json
- tokenizer.json, tokenizer_config.json, vocab.json, merges.txt
- chat_template.jinja
- preprocessor_config.json, video_preprocessor_config.json
- LICENSE, NOTICE
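After downloading the repository, it can be useful to confirm that every listed file is present before loading the model. The sketch below is only an illustration: the function name missing_files and the directory in the usage comment are hypothetical, and the file names are copied from the list above.

```python
from pathlib import Path

# File names copied from the Files list above.
REQUIRED_FILES = [
    "config.json",
    "model.safetensors-00001-of-00002.safetensors",
    "model.safetensors-00002-of-00002.safetensors",
    "model.safetensors.index.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "vocab.json",
    "merges.txt",
    "chat_template.jinja",
    "preprocessor_config.json",
    "video_preprocessor_config.json",
    "LICENSE",
    "NOTICE",
]


def missing_files(model_dir: str) -> list:
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Usage (assumes the repository was downloaded to ./mosslight-4b):
# print(missing_files("./mosslight-4b") or "all files present")
```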
Usage
Install a Transformers build that supports Qwen3.5, then load the model using the standard Hugging Face APIs.
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "ttrpg/mosslight-4b"

# Load the processor (tokenizer, chat template, image preprocessing) and the model.
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    device_map="auto",    # place weights on the available device(s)
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    trust_remote_code=True,
)

# A single-turn, text-only conversation.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Briefly introduce yourself."},
        ],
    }
]

# Render the chat template and tokenize in one step.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)

# outputs[0] holds the prompt tokens followed by the generated reply.
print(processor.decode(outputs[0], skip_special_tokens=True))
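For visual question answering, the same chat-template call can carry an image entry alongside the text. The sketch below is an illustration under stated assumptions rather than a documented API of this release: build_messages and generate_reply are hypothetical helpers, the image URL in the usage comment is a placeholder, and it assumes the processor's chat template accepts {"type": "image", "url": ...} entries. Unlike the snippet above, it decodes only the newly generated tokens.

```python
from typing import Optional


def build_messages(text: str, image_url: Optional[str] = None) -> list:
    """Build a single-turn chat message list, optionally with one image."""
    content = []
    if image_url is not None:
        # Assumed message format for image inputs in the chat template.
        content.append({"type": "image", "url": image_url})
    content.append({"type": "text", "text": text})
    return [{"role": "user", "content": content}]


def generate_reply(model, processor, messages: list, max_new_tokens: int = 256) -> str:
    """Generate a reply and return only the text produced after the prompt."""
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Skip the prompt tokens so that only the model's reply is decoded.
    prompt_len = inputs["input_ids"].shape[-1]
    return processor.decode(outputs[0][prompt_len:], skip_special_tokens=True)


# Usage (model and processor loaded as in the snippet above; placeholder URL):
# messages = build_messages("What is in this picture?", "https://example.com/image.jpg")
# print(generate_reply(model, processor, messages))
```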
Serving
Use serving frameworks only after confirming they support Qwen3.5 model classes and the required multimodal processor files.
Example model identifier: ttrpg/mosslight-4b
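Once a compatible server is running, the model can be queried from Python using only the standard library. This is a sketch under assumptions: it presumes the server exposes an OpenAI-compatible /v1/chat/completions route at http://localhost:8000 (adjust base_url for your setup), and the helper names build_chat_payload and chat are hypothetical.

```python
import json
import urllib.request
from typing import Optional

MODEL_ID = "ttrpg/mosslight-4b"


def build_chat_payload(prompt: str, image_url: Optional[str] = None,
                       max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completion request body."""
    content = [{"type": "text", "text": prompt}]
    if image_url is not None:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": content}],
        "max_tokens": max_tokens,
    }


def chat(prompt: str, image_url: Optional[str] = None,
         base_url: str = "http://localhost:8000") -> str:
    """POST the request to the (assumed) local server and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt, image_url)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires a running server that supports this model):
# print(chat("Briefly introduce yourself."))
```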
Intended Use
Mosslight 4B is intended for experimentation with compact multimodal assistant workflows, text generation, visual question answering, and local model serving.
Limitations
- No independent benchmark results are published for this custom release yet.
- Behavior and safety characteristics should be evaluated for your target use case before deployment.
- This model inherits limitations from the Qwen3.5-4B base model and from the fine-tuning and merge process used for this release.
Attribution
Mosslight 4B is a fine-tuned, merged derivative based on Qwen3.5-4B. Please retain the Apache 2.0 license and attribution notices when redistributing this model or derivatives of it.