poolside/Laguna-XS-2.1-DFlash-FP8
DFlash speculator for the FP8 target poolside/Laguna-XS-2.1-FP8. The speculator itself is a 5-layer Llama-style draft model stored in BF16; pair it with the FP8 base model for lower-latency serving.
Speculators for the other precisions are available in this collection: BF16, INT4, NVFP4.
See the Laguna XS 2.1 DFlash speculator model card for architecture, training, and deployment details. Upstream DFlash support is still in progress (vLLM #46853, SGLang #29446, TRT-LLM #15666). When deploying, use poolside/Laguna-XS-2.1-FP8 as the target model.
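For readers new to speculative decoding, the general draft-and-verify idea behind pairing a speculator with a target model can be sketched in plain Python. This is a generic greedy toy under stated assumptions, not DFlash's actual drafting algorithm and not the API of vLLM, SGLang, or TRT-LLM. All names here (`speculative_generate`, `greedy_generate`, `toy_target`, `toy_draft`, and the draft length `k`) are hypothetical, and the "models" are simple functions over integer tokens.

```python
from typing import Callable, List

# A "model" here is any function mapping a token context to the next token.
NextToken = Callable[[List[int]], int]


def greedy_generate(target_next: NextToken, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Baseline: plain greedy decoding, one target call per new token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(target_next(tokens))
    return tokens


def speculative_generate(
    target_next: NextToken,
    draft_next: NextToken,
    prompt: List[int],
    max_new_tokens: int,
    k: int = 4,  # hypothetical number of tokens drafted per step
) -> List[int]:
    """Greedy draft-and-verify: the draft proposes k tokens, the target checks them."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft_next(tokens + proposal))

        # 2. The target verifies the proposal. A real engine scores all
        #    positions in one batched forward pass; the toy loops instead.
        accepted: List[int] = []
        for tok in proposal:
            expected = target_next(tokens + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                # First mismatch: keep the target's token, drop the rest.
                accepted.append(expected)
                break
        else:
            # Every drafted token matched, so the target adds one bonus token.
            accepted.append(target_next(tokens + accepted))

        tokens.extend(accepted)
    return tokens[:goal]


def toy_target(ctx: List[int]) -> int:
    """Toy target: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    """Toy draft: agrees with the target except right after token 2."""
    return 0 if ctx[-1] == 2 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    print(speculative_generate(toy_target, toy_draft, prompt=[0], max_new_tokens=12))
```

Because every emitted token is either confirmed or supplied by the target, this greedy variant produces the same sequence as decoding with the target alone. The latency benefit in a real deployment comes from verifying several drafted tokens in a single target forward pass, which the toy loop does not model.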
License
This model is licensed under the OpenMDW-1.1 License.
Intended and Responsible Use
Laguna-XS-2.1-DFlash-FP8 is designed for software engineering and agentic coding use cases. You are responsible for confirming that it is appropriate for your intended application. The model is subject to the OpenMDW-1.1 License and should be used in a manner consistent with Poolside's Acceptable Use Policy.
Please report security vulnerabilities or safety concerns to security@poolside.ai.