Mesh LLM

GLM-5.2-Q2_K-MTP-Q8

Distributed GGUF inference package for Mesh LLM


GGUF layer package for running GLM-5.2-Q2_K-MTP-Q8 across a local Mesh LLM cluster.

This package is derived from meshllm/GLM-5.2-Q2_K-MTP-Q8-GGUF. It splits the original GGUF distribution into per-layer artifacts for distributed inference.

Highlights

| Run locally | Pool multiple machines | OpenAI-compatible | Package variant |
| --- | --- | --- | --- |
| Private inference on your hardware | Split layers across peers | Serve /v1/chat/completions locally | Q2_K layer package |

Model Overview

| Property | Value |
| --- | --- |
| Source model | meshllm/GLM-5.2-Q2_K-MTP-Q8-GGUF |
| Model id | meshllm/GLM-5.2-Q2_K-MTP-Q8-GGUF:Q2_K-MTP-Q8 |
| Family | GLM |
| Parameter scale | not recorded |
| Quantization | Q2_K |
| Layer count | 79 |
| Activation width | 6144 |
| Package size | 260.3 GB |
| Source file | Q2_K-MTP-Q8/GLM-5.2-Q2_K-MTP-Q8-00001-of-00306.gguf |
| Package repo | meshllm/GLM-5.2-Q2_K-MTP-Q8-layers |
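
For rough capacity planning, the Layer count and Package size values above give an average of about 3.29 GB per layer (260.3 GB / 79). The sketch below turns a per-host memory budget into an approximate layer count. It is a back-of-the-envelope estimate only: it ignores KV cache, activations, and runtime overhead, and it does not reflect how Mesh LLM actually assigns layers. `layers_for_budget` is a hypothetical helper name.

```shell
# Estimate how many average-sized layers fit in a memory budget given in GB.
# 260.3 GB and 79 layers come from the Model Overview table; everything else
# (KV cache, activations, scheduler behaviour) is deliberately ignored here.
layers_for_budget() {
  awk -v budget="$1" 'BEGIN {
    per_layer = 260.3 / 79            # average GB per layer, about 3.29
    printf "%d\n", int(budget / per_layer)
  }'
}

# Example: a host that can spare 64 GB for weights.
layers_for_budget 64
```

Treat the result as an upper bound and leave headroom on every host.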

Recommended Use

  • Local and private inference with Mesh LLM.
  • Multi-machine serving when the full GGUF is too large for one host.
  • OpenAI-compatible chat/completions workflows through Mesh LLM's local API.

For upstream architecture details, chat template guidance, sampling recommendations, license terms, and benchmark notes, see the source model card: meshllm/GLM-5.2-Q2_K-MTP-Q8-GGUF.

Quickstart

# Run this on each machine that should contribute memory/compute.
mesh-llm serve --model "meshllm/GLM-5.2-Q2_K-MTP-Q8-layers" --split
# Check the mesh and discover the OpenAI-compatible model name.
curl -s http://localhost:3131/api/status
curl -s http://localhost:3131/v1/models
# Send an OpenAI-compatible chat request.
curl -s http://localhost:3131/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meshllm/GLM-5.2-Q2_K-MTP-Q8-GGUF:Q2_K-MTP-Q8",
    "messages": [{"role": "user", "content": "Write a tiny hello-world function in Rust."}],
    "max_tokens": 128
  }'
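
The `/v1/models` call above returns the model list as JSON. As a rough sketch, assuming the response follows the OpenAI list shape (a `data` array of objects, each with an `id` field) and that ids contain no escaped quotes, a small helper can print just the ids. `list_model_ids` is a hypothetical name and not part of Mesh LLM; a real JSON parser is the safer choice for anything beyond a quick check.

```shell
# Print every "id" value found in an OpenAI-style model list read from stdin.
# Assumption: ids contain no escaped double quotes.
list_model_ids() {
  grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/^"id"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Usage against a running node (host and port as in the Quickstart):
# curl -s http://localhost:3131/v1/models | list_model_ids
```

The printed id is the value to pass as `"model"` in the chat request.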

Package Variant

| Property | Value |
| --- | --- |
| Format | layer-package |
| Canonical source ref | meshllm/GLM-5.2-Q2_K-MTP-Q8-GGUF@main/Q2_K-MTP-Q8/GLM-5.2-Q2_K-MTP-Q8-00001-of-00306.gguf |
| Source revision | main |
| Source SHA-256 | 6e1841a844cae68f434d7f699e1e974232a687e700ad7b26631d6880eb541b9a |
| Skippy ABI | 0.1.27 |
| Package manifest SHA-256 | 0df2893e5d1e553b9901c944d0ceae959ee69a949d8a2e69c645f8cae3d91257 |

What Is Included

| Artifact | Path | Contents | SHA-256 |
| --- | --- | --- | --- |
| Manifest | model-package.json | Package schema, source identity, checksums | 0df2893e5d1e553b9901c944d0ceae959ee69a949d8a2e69c645f8cae3d91257 |
| Metadata | shared/metadata.gguf | 0 tensors, 9.0 MB | d239e9f5bb3151e29fa2f1f55c53eff16af5737e06afe541c6f24016d74bc8d6 |
| Embeddings | shared/embeddings.gguf | 1 tensor, 973.2 MB | f5694c010726da17a33d68e0bf91566b91f9ccc9bfc08da393faf4c6cfa545b2 |
| Output head | shared/output.gguf | 2 tensors, 1.8 GB | b2e3fb2597217dd42464a10c02ec9a866571b37ea61c59d59a9ac84e29db6e50 |
| Transformer layers | layers/layer-*.gguf | 79 layer artifacts, 1521 tensors, 257.6 GB | see model-package.json |
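
The checksums listed above can be compared against downloaded files. A minimal sketch, assuming the artifacts have been downloaded to the paths shown and that either `sha256sum` (GNU coreutils) or `shasum` (macOS and others) is installed. `sha256_of` and `verify_sha256` are illustrative helper names, not Mesh LLM commands.

```shell
# Print the SHA-256 hex digest of a file, using whichever tool is available.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Succeed (exit status 0) only if the file's digest equals the expected value.
verify_sha256() {
  [ "$(sha256_of "$1")" = "$2" ]
}

# Usage with a digest from the table above:
# verify_sha256 shared/metadata.gguf \
#   d239e9f5bb3151e29fa2f1f55c53eff16af5737e06afe541c6f24016d74bc8d6 \
#   && echo "metadata OK" || echo "metadata MISMATCH"
```

Per-layer digests are not listed here; they would have to be read from model-package.json.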

Validation

Generated by the Mesh LLM HF Jobs splitter from mesh-llm ref main. Each artifact is checksummed as it is written, uploaded to this repository, and removed from the job workspace before the next artifact is produced.

skippy-model-package write-package "/source/Q2_K-MTP-Q8/GLM-5.2-Q2_K-MTP-Q8-00001-of-00306.gguf" --out-dir "/tmp/meshllm-layer-job-meshllm_GLM-5.2-Q2_K-MTP-Q8-layers-193/package"
