Laguna M.1-base

Laguna M.1-base is the pre-trained base checkpoint for Laguna M.1, a Mixture-of-Experts model with 225B total parameters and 23B activated parameters per token. It is the model as it stood before post-training and reinforcement learning: a text-completion model with no instruction-following, reasoning, or tool-calling behavior. For agentic coding and chat, use the post-trained Laguna M.1 instead.

For details on how we trained Laguna, check out our release blog post and technical report.

Highlights

  • Large sparse MoE: Laguna M.1 is a 70-layer MoE transformer with 225B total parameters and 23B activated parameters per token
  • High-capacity expert routing: After 3 dense SwiGLU layers, Laguna M.1 uses 67 sparse MoE layers with 256 experts, top-k=16 routing and auxiliary-loss-free load balancing
  • Global attention architecture: Laguna M.1 uses global attention across all layers with 64 Q-heads, 8 KV-heads and softplus attention output gating
  • Pre-instruct base checkpoint: Trained with pre-training only; intended as a starting point for further training rather than direct chat/agentic use
  • Apache 2.0 license: Use and modify freely for commercial and non-commercial purposes

Model overview

  • Stage: pre-training only (no post-training or reinforcement learning)
  • Number of parameters: 225B total with 23B activated per token
  • Optimizer: Muon
  • Layers: 70 layers with global attention
  • Experts: 256 experts with 1 shared expert; top-k=16 routing
  • Dense layers: first 3 layers are dense SwiGLU; remaining 67 layers are sparse MoE
  • Attention: 64 Q-heads, 8 KV-heads, head dimension 128, with softplus attention output gating
  • Positional encoding: RoPE with YaRN
  • Modality: text-to-text (completion)
  • Context window: 262,144 tokens
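The figures above are enough for a rough KV-cache estimate. The sketch below is back-of-the-envelope arithmetic only: it assumes keys and values are cached in BF16 for every layer and ignores any paging, quantization, or overhead a particular serving engine applies. The constant and function names are illustrative.

```python
# Rough sizing from the overview figures.
# Assumption: K and V are cached in BF16 (2 bytes per value) for all 70 layers.

NUM_LAYERS = 70
NUM_KV_HEADS = 8
HEAD_DIM = 128
CONTEXT_TOKENS = 262_144
BYTES_PER_VALUE = 2  # BF16 (assumed cache dtype)

TOTAL_PARAMS = 225e9
ACTIVE_PARAMS = 23e9


def kv_cache_bytes_per_token() -> int:
    """Bytes of keys plus values stored for one token across all layers."""
    return 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE


def kv_cache_bytes(num_tokens: int) -> int:
    """KV-cache bytes for a single sequence of num_tokens tokens."""
    return num_tokens * kv_cache_bytes_per_token()


per_token_kib = kv_cache_bytes_per_token() / 1024
full_context_gib = kv_cache_bytes(CONTEXT_TOKENS) / 1024**3
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"KV cache per token: {per_token_kib:.0f} KiB")              # 280 KiB
print(f"KV cache at full context: {full_context_gib:.0f} GiB")     # 70 GiB
print(f"Parameters active per token: {active_fraction:.1%}")       # 10.2%
```

Under these assumptions, a single sequence at the full 262,144-token window needs about 70 GiB of cache on top of the weights, which is worth budgeting for when choosing a maximum context length.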

Usage

Laguna M.1-base is a text-completion model. It has no chat template, reasoning, or tool-calling support — serve it without the reasoning/tool-call parsers and prompt it with raw text.
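Because there is no chat template, the usual way to steer a base model is to write the start of a document that it can plausibly continue. A minimal sketch of a few-shot prompt builder follows; the `Input:`/`Output:` layout and the function name are illustrative choices, not a format the model was specifically trained on.

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Lay out (input, output) pairs as plain text, ending at the open slot.

    The model is expected to continue the pattern after the final "Output:".
    """
    blocks = [f"Input: {src}\nOutput: {dst}" for src, dst in examples]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


prompt = build_few_shot_prompt(
    [("reverse 'abc'", "cba"), ("reverse 'hello'", "olleh")],
    "reverse 'laguna'",
)
print(prompt)
```

Since a base model keeps generating past the answer, it usually helps to cap `max_tokens` or pass a stop sequence (for example the blank line that separates examples) when sending such a prompt to a completions endpoint.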

vLLM

Laguna support is available in vLLM (v0.21.0 and later, vllm-project/vllm#41129).

pip install 'vllm>=0.21.0'

vllm serve poolside/Laguna-M.1-base \
    --served-model-name laguna-base

Query the completions endpoint from any OpenAI-compatible client:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="laguna-base",
    prompt="def fibonacci(n):\n",
    max_tokens=128,
    temperature=0.7,
)
print(completion.choices[0].text)
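If you would rather not depend on the `openai` package, the same endpoint can be called with the standard library. This is a sketch that assumes the server from the `vllm serve` command above is listening on `localhost:8000` without an API key; the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed: default host/port of the server above
MODEL_NAME = "laguna-base"             # the --served-model-name used above


def build_completion_request(prompt, max_tokens=128, temperature=0.7, stop=None):
    """Build a POST request for the OpenAI-compatible /completions endpoint."""
    payload = {
        "model": MODEL_NAME,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    if stop is not None:
        # Stop sequences keep a base model from running on past the answer.
        payload["stop"] = stop
    return urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs) -> str:
    """Send the request and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With the server running, `complete("def fibonacci(n):\n", stop=["\n\n"])` returns the continuation up to the first blank line.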

SGLang

Laguna M.1-base is supported in SGLang via sgl-project/sglang#28400. As a completion model, serve it without the reasoning/tool-call parsers. A full serving recipe will be added here.

Transformers

Laguna is supported in Transformers v5.7.0 and later (huggingface/transformers#45673).

Laguna M.1-base is a 225B-parameter model; loading the BF16 checkpoint in Transformers requires substantial multi-GPU memory (device_map="auto" shards across available devices). For single-node serving, vLLM is recommended.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "poolside/Laguna-M.1-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, dtype=torch.bfloat16, device_map="auto")

inputs = tokenizer("def fibonacci(n):\n", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
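To put the memory note above in numbers, the sketch below estimates how many devices are needed just to hold the BF16 weights. The GPU sizes and the 90% usable-memory fraction are assumptions chosen for illustration; activations and the KV cache need additional headroom.

```python
import math

TOTAL_PARAMS = 225e9
BYTES_PER_PARAM = 2  # BF16


def weight_memory_gb() -> float:
    """Memory for the weights alone, in GB (10^9 bytes)."""
    return TOTAL_PARAMS * BYTES_PER_PARAM / 1e9


def min_gpus_for_weights(gpu_memory_gb: float, usable_fraction: float = 0.9) -> int:
    """Smallest device count whose usable memory covers the weights.

    usable_fraction is an assumed allowance for framework overhead.
    """
    return math.ceil(weight_memory_gb() / (gpu_memory_gb * usable_fraction))


print(f"BF16 weights: {weight_memory_gb():.0f} GB")       # 450 GB
print(f"80 GB devices: {min_gpus_for_weights(80)}")       # 7
print(f"141 GB devices: {min_gpus_for_weights(141)}")     # 4
```

These counts are lower bounds for loading the checkpoint, not a serving recommendation.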

License

This model is licensed under the Apache 2.0 License.

Intended and Responsible Use

Laguna M.1 is designed for software engineering and agentic coding use cases, and you are responsible for confirming that it is appropriate for your intended application. Laguna M.1 is subject to the Apache 2.0 License, and should be used consistently with Poolside's Acceptable Use Policy. We advise against circumventing Laguna M.1 safety guardrails without implementing substantially equivalent mitigations appropriate for your use case.

Please report security vulnerabilities or safety concerns to security@poolside.ai.
