Qwen3.6-27B Layerdose

LayerDose replacement candidate derived from Qwen/Qwen3.6-27B.

This is not a plain dense checkpoint. Fourteen low-risk linear_attn operators were replaced with compact rank-64 patch modules. The checkpoint stores the reduced model shards plus layerdose_patches.safetensors and layerdose_replacement_manifest.json.
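
Before loading, it can help to confirm that both LayerDose side files named above are present in the model directory. The following is a minimal standard-library sketch. The helper name `missing_layerdose_files` is hypothetical, and nothing is assumed about the contents of either file.

```python
import os

# File names taken from the description above.
REQUIRED_FILES = (
    "layerdose_patches.safetensors",
    "layerdose_replacement_manifest.json",
)

def missing_layerdose_files(model_dir):
    """Return the required LayerDose side files that are absent from model_dir."""
    return [
        name for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

An empty list means both files were found.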

Important Loading Note

A plain AutoModelForCausalLM.from_pretrained() load is not enough to activate the replacements. Load the model, then install the LayerDose patch stack from this model directory:

from layerdose.materialized import install_sublayer_patch_stack

model = ...  # load the Qwen/Qwen3.6-compatible model
install_sublayer_patch_stack(model, "/path/to/Qwen3.6-27B-LayerDose-14LinearAttnRank64")
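
The two steps can be wrapped in a single helper. This is a sketch under stated assumptions, not the project's documented loader: it assumes the reduced shards in the model directory load through the standard Transformers entry point, and the wrapper name `load_layerdose_model` is hypothetical. Only `install_sublayer_patch_stack(model, path)` comes from the snippet above.

```python
import os

def load_layerdose_model(model_dir):
    """Load the reduced checkpoint, then activate the LayerDose replacements."""
    if not os.path.isdir(model_dir):
        raise FileNotFoundError(f"model directory not found: {model_dir}")

    # Imported lazily so the directory check above runs without these packages.
    from transformers import AutoModelForCausalLM
    from layerdose.materialized import install_sublayer_patch_stack

    # Assumption: the reduced shards in model_dir load through the standard
    # Transformers entry point; dtype and device placement are left at defaults.
    model = AutoModelForCausalLM.from_pretrained(model_dir)

    # Without this call the replacements stay inactive.
    install_sublayer_patch_stack(model, model_dir)
    return model
```

Dtype and device arguments are deliberately omitted; for a model of this size you will likely want to set them for your hardware.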

The GGUF build requires the patched llama.cpp source published with the GGUF repository.

Size

  • Safetensors directory: about 49 GB on disk
  • Checkpoint tensor storage removed by the replacements: about 3.28 GB, not counting the added patch tensors
  • Patch tensor file: about 27 MB
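
The net storage effect follows from the last two figures. The arithmetic below is only a rough illustration: both inputs are the approximate values listed above, and decimal units (1 GB = 1000 MB) are an assumption.

```python
removed_gb = 3.28   # tensor storage removed by the replacements
patch_mb = 27       # size of the patch tensor file

net_saved_gb = removed_gb - patch_mb / 1000  # decimal units assumed
print(f"net saving: about {net_saved_gb:.2f} GB")
```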

Smoke Results

Smoke checks run after GGUF conversion, comparing the LayerDose BF16 GGUF with the LayerDose Q4_K_M GGUF:

  • ARC64-95: BF16 0.9375, Q4 0.9375, delta 0.0000
  • HellaSwag64-95: BF16 0.5625, Q4 0.5625, delta 0.0000
  • Combined option KL BF16->Q4: 0.008657
  • LM KL Q4-vs-BF16 on 16 held-out prompts: 0.017597 +/- 0.001729
  • Same-top-token rate: 96.429%
  • Greedy generation smoke passed for both BF16 and Q4

This is a smoke-level validation, not a full benchmark suite.
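
For readers who want to reproduce comparable numbers, the KL and same-top-token metrics can be computed from per-position logits roughly as follows. This is an illustrative sketch, not the harness used for the results above: the direction KL(BF16 || Q4), the use of natural logarithms, and all function names are assumptions, and the logits are toy values.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; q is floored at eps to avoid division by zero."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

def compare_logits(ref_logits, test_logits):
    """Return (mean KL(ref || test), same-top-token rate) over positions."""
    kls, same = [], 0
    for ref, test in zip(ref_logits, test_logits):
        kls.append(kl_divergence(softmax(ref), softmax(test)))
        same += argmax(ref) == argmax(test)
    n = len(kls)
    return sum(kls) / n, same / n

# Toy logits for two positions, not real model output.
bf16 = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
q4 = [[1.9, 1.1, 0.1], [2.6, 0.4, 0.3]]
mean_kl, top_rate = compare_logits(bf16, q4)
```

In this toy case the two models agree on the top token at the first position only, so `top_rate` is 0.5.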

Provenance

The candidate was built by LayerDose using a measured coarse-to-fine importance ranking and operator-specific replacement. The selected target was the 14-layer linear_attn replacement candidate.

Safetensors model size: 26B params (tensor type BF16)