poolside/Laguna-XS-2.1-DFlash-INT4

DFlash speculator for the INT4 target model poolside/Laguna-XS-2.1-INT4. The speculator is a 5-layer Llama-style draft model whose weights are in BF16; pair it with the INT4 base model for lower-latency serving.

Speculators for the other precisions are available in this collection: BF16, FP8, NVFP4.

See the Laguna XS 2.1 DFlash speculator model card for architecture, training, and deployment. DFlash upstream support is in progress (vLLM #46853, SGLang #29446, TRT-LLM #15666). Use poolside/Laguna-XS-2.1-INT4 as the target model.
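
Because upstream DFlash support is still in progress, the exact launch options may change. As a rough sketch of how the speculator and target pair up, the snippet below assembles a `vllm serve` command that passes a JSON `--speculative-config`. The `method` value `"dflash"` and the default of 4 speculative tokens are assumptions made for illustration, not confirmed settings, and the helper names are hypothetical; check the linked PRs and the main speculator model card for the supported values.

```python
import json
import shlex

# Repository names taken from this model card.
TARGET_MODEL = "poolside/Laguna-XS-2.1-INT4"
DRAFT_MODEL = "poolside/Laguna-XS-2.1-DFlash-INT4"


def build_speculative_config(num_speculative_tokens: int = 4,
                             method: str = "dflash") -> dict:
    """Build a speculative-decoding config dict (hypothetical helper).

    ASSUMPTION: the method name "dflash" and the default token count are
    placeholders; upstream support is still in progress.
    """
    if num_speculative_tokens < 1:
        raise ValueError("num_speculative_tokens must be at least 1")
    return {
        "method": method,
        "model": DRAFT_MODEL,
        "num_speculative_tokens": num_speculative_tokens,
    }


def build_serve_command(num_speculative_tokens: int = 4) -> list:
    """Assemble the argv list for serving the INT4 target with the draft model."""
    config = build_speculative_config(num_speculative_tokens)
    return [
        "vllm", "serve", TARGET_MODEL,
        "--speculative-config", json.dumps(config),
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line for inspection; nothing is launched.
    print(shlex.join(build_serve_command()))
```

The command is only printed, not executed, so the values can be reviewed and adjusted once DFlash support lands in the serving framework you use.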

License

This model is licensed under the OpenMDW-1.1 License.

Intended and Responsible Use

Laguna-XS-2.1-DFlash-INT4 is designed for software engineering and agentic coding use cases. You are responsible for confirming that it is appropriate for your intended application. The model is subject to the OpenMDW-1.1 License and should be used in a manner consistent with Poolside's Acceptable Use Policy.

Please report security vulnerabilities or safety concerns to security@poolside.ai.