Mesh LLM

DeepSeek-R1-Distill-Llama-70B-UD-Q4_K_XL

Distributed GGUF inference package for Mesh LLM

GGUF layer package for running DeepSeek-R1-Distill-Llama-70B-UD-Q4_K_XL across a local Mesh LLM cluster.

This package is derived from unsloth/DeepSeek-R1-Distill-Llama-70B-GGUF. It splits the original GGUF file into per-layer artifacts for distributed inference.

Highlights

  • Run locally: private inference on your hardware.
  • Pool multiple machines: split layers across peers.
  • OpenAI-compatible: serve /v1/chat/completions locally.
  • Package variant: UD-Q4_K_XL layer package.

Model Overview

Property | Value
Source model | unsloth/DeepSeek-R1-Distill-Llama-70B-GGUF
Model id | unsloth/DeepSeek-R1-Distill-Llama-70B-GGUF:UD-Q4_K_XL
Family | DeepSeek
Parameter scale | 70B
Quantization | UD-Q4_K_XL
Layer count | 80
Activation width | 8192
Package size | 40.3 GB
Source file | DeepSeek-R1-Distill-Llama-70B-UD-Q4_K_XL.gguf
Package repo | meshllm/DeepSeek-R1-Distill-Llama-70B-UD-Q4_K_XL-layers

Recommended Use

  • Local and private inference with Mesh LLM.
  • Multi-machine serving when the full GGUF is too large for one host.
  • OpenAI-compatible chat/completions workflows through Mesh LLM's local API.

For upstream architecture details, chat template guidance, sampling recommendations, license terms, and benchmark notes, see the source model card: unsloth/DeepSeek-R1-Distill-Llama-70B-GGUF.

Quickstart

# Run this on each machine that should contribute memory/compute.
mesh-llm serve --model "meshllm/DeepSeek-R1-Distill-Llama-70B-UD-Q4_K_XL-layers" --split
# Check the mesh and discover the OpenAI-compatible model name.
curl -s http://localhost:3131/api/status
curl -s http://localhost:3131/v1/models
# Send an OpenAI-compatible chat request.
curl -s http://localhost:3131/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "unsloth/DeepSeek-R1-Distill-Llama-70B-GGUF:UD-Q4_K_XL",
    "messages": [{"role": "user", "content": "Write a tiny hello-world function in Rust."}],
    "max_tokens": 128
  }'
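
When scripting the steps above, it can help to wait until the local API answers before sending requests. The sketch below polls a URL with curl until it returns a successful HTTP status. The helper name wait_for_http and its defaults are hypothetical, and it only checks that the endpoint responds; it makes no assumption about the content of the /api/status response.

```bash
# Poll an HTTP endpoint until it responds successfully, or give up.
# $1 = URL, $2 = max attempts (default 30), $3 = seconds between attempts (default 2)
wait_for_http() {
  local url="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: quiet, -f: treat HTTP errors (>= 400) as failure, --max-time: cap each try
    if curl -sf --max-time 5 -o /dev/null "$url"; then
      return 0
    fi
    # Sleep only if another attempt remains.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage with the status endpoint from the Quickstart:
# wait_for_http http://localhost:3131/api/status && curl -s http://localhost:3131/v1/models
```

The function returns 0 as soon as one request succeeds and 1 once the attempts are exhausted, so it can gate later commands with &&.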

Package Variant

Property | Value
Format | layer-package
Canonical source ref | unsloth/DeepSeek-R1-Distill-Llama-70B-GGUF@main/DeepSeek-R1-Distill-Llama-70B-UD-Q4_K_XL.gguf
Source revision | main
Source SHA-256 | 30b01c505f810f02816b9adcd659246cfc0744572fa7ad60feed5bd9ca4e9662
Skippy ABI | 0.1.24
Package manifest SHA-256 | 29727afc930095f84e2ecfdb0b5cd3ed84236728fad37b3a427f4c665d417a7c

What Is Included

Artifact | Path | Contents | SHA-256
Manifest | model-package.json | Package schema, source identity, checksums | 29727afc930095f84e2ecfdb0b5cd3ed84236728fad37b3a427f4c665d417a7c
Metadata | shared/metadata.gguf | 1 tensor, 7.5 MB | e1d5f84bcf4ee2ed6a90cbb1e7b58a5a4a6db2209039b8786147a136c3de6e37
Embeddings | shared/embeddings.gguf | 2 tensors, 571.1 MB | 5ea97740475f22fa617cd934fe09822b9c81fbb6a882a6467b3a5c5ad0785669
Output head | shared/output.gguf | 3 tensors, 829.4 MB | 557d8ae4f584031cf2d198e8877e8dd8bf28ad5e1ed6e7603d34f231967f1916
Transformer layers | layers/layer-*.gguf | 80 layer artifacts, 800 tensors, 39.0 GB | see model-package.json
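
The SHA-256 values listed for the artifacts can be checked against local copies. The sketch below is one way to do that, assuming GNU coreutils sha256sum is available (on macOS, shasum -a 256 is the usual equivalent). The helper name verify_sha256 and the ./package download path are hypothetical.

```bash
# Compare a local file's SHA-256 against an expected hex digest.
# $1 = path to file, $2 = expected lowercase hex digest
verify_sha256() {
  local file="$1"
  local expected="$2"
  local actual
  # sha256sum prints "<digest>  <path>"; keep only the digest.
  actual="$(sha256sum "$file" | awk '{print $1}')"
  if [ "$actual" = "$expected" ]; then
    echo "OK  $file"
  else
    echo "MISMATCH  $file (got $actual)" >&2
    return 1
  fi
}

# Example, assuming the package was downloaded into ./package (hypothetical path):
# verify_sha256 package/shared/metadata.gguf \
#   e1d5f84bcf4ee2ed6a90cbb1e7b58a5a4a6db2209039b8786147a136c3de6e37
```

The per-layer digests are not listed in this card, so checking the layer artifacts would mean reading the expected values from model-package.json first.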

Validation

This package was generated by the Mesh LLM HF Jobs splitter from mesh-llm ref main. Each artifact is checksummed as it is written, uploaded to this repository, and then removed from the job workspace before the next artifact is produced.

skippy-model-package write-package "/source/DeepSeek-R1-Distill-Llama-70B-UD-Q4_K_XL.gguf" --out-dir "/tmp/meshllm-layer-job-meshllm_DeepSeek-R1-Distill-Llama-70B-UD-Q4_K_XL-layers-193/package"
