poolside/Laguna-XS-2.1-DFlash-INT4

This is the DFlash speculator for the INT4 target model poolside/Laguna-XS-2.1-INT4. The speculator is a 5-layer Llama-style draft model stored in BF16; pair it with the INT4 target for lower-latency serving.

Speculators for the other precisions are available in this collection: BF16, FP8, NVFP4.

See the Laguna XS 2.1 DFlash speculator model card for architecture, training, and deployment details. Upstream DFlash support is in progress in vLLM (#46853), SGLang (#29446), and TRT-LLM (#15666). Use poolside/Laguna-XS-2.1-INT4 as the target model.
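To illustrate why pairing a small draft model with a larger target can lower latency, here is a minimal sketch of generic greedy speculative decoding. It is not DFlash's drafting algorithm (that is described in the linked model card) and it does not use the vLLM, SGLang, or TRT-LLM APIs. The toy `target_next` and `draft_next` functions and every other name below are hypothetical stand-ins for real models.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function (assumed interface)


def speculative_step(target_next: NextToken, draft_next: NextToken,
                     context: List[Token], k: int) -> List[Token]:
    """One draft-then-verify step with greedy acceptance."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposal: List[Token] = []
    for _ in range(k):
        proposal.append(draft_next(context + proposal))

    # 2. The target checks each proposed token. A real server would score all
    #    k positions in one batched forward pass; here we loop for clarity.
    accepted: List[Token] = []
    for tok in proposal:
        expected = target_next(context + accepted)
        if tok != expected:
            # First mismatch: keep the target's token and discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3. Every draft token matched, so the target adds one bonus token.
    accepted.append(target_next(context + accepted))
    return accepted


def generate(target_next: NextToken, draft_next: NextToken,
             prompt: List[Token], max_new_tokens: int, k: int = 4
             ) -> Tuple[List[Token], int]:
    """Generate tokens and count how many verify steps the target needed."""
    out = list(prompt)
    steps = 0
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_step(target_next, draft_next, out, k))
        steps += 1
    return out[:len(prompt) + max_new_tokens], steps


# Toy stand-ins: the target counts upward mod 10; the draft agrees with it
# except right after token 2, where it guesses wrong.
def target_next(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def draft_next(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 2 else (ctx[-1] + 1) % 10


tokens, verify_steps = generate(target_next, draft_next, prompt=[0], max_new_tokens=8)
```

With greedy acceptance the output is identical to decoding with the target alone; the gain comes from the target running fewer verification steps than it would run single-token decoding steps whenever the draft's guesses are accepted.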

License

This model is licensed under the OpenMDW-1.1 License.

Intended and Responsible Use

Laguna-XS-2.1-DFlash-INT4 is designed for software engineering and agentic coding use cases, and you are responsible for confirming that it is appropriate for your intended application. Laguna-XS-2.1-DFlash-INT4 is subject to the OpenMDW-1.1 License, and should be used consistently with Poolside's Acceptable Use Policy.

Please report security vulnerabilities or safety concerns to security@poolside.ai.

Model size: 0.5B params. Tensor type: BF16. Format: Safetensors.