Instructions to use moonshotai/Kimi-K2.7-Code with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use moonshotai/Kimi-K2.7-Code with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("image-text-to-text", model="moonshotai/Kimi-K2.7-Code", trust_remote_code=True) messages = [ { "role": "user", "content": [ {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"}, {"type": "text", "text": "What animal is on the candy?"} ] }, ] pipe(text=messages)# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("moonshotai/Kimi-K2.7-Code", trust_remote_code=True, dtype="auto") - Inference
- HuggingChat
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use moonshotai/Kimi-K2.7-Code with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "moonshotai/Kimi-K2.7-Code" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "moonshotai/Kimi-K2.7-Code", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Describe this image in one sentence." }, { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } } ] } ] }'Use Docker
docker model run hf.co/moonshotai/Kimi-K2.7-Code
- SGLang
How to use moonshotai/Kimi-K2.7-Code with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "moonshotai/Kimi-K2.7-Code" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "moonshotai/Kimi-K2.7-Code", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Describe this image in one sentence." }, { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } } ] } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "moonshotai/Kimi-K2.7-Code" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "moonshotai/Kimi-K2.7-Code", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Describe this image in one sentence." }, { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } } ] } ] }' - Docker Model Runner
How to use moonshotai/Kimi-K2.7-Code with Docker Model Runner:
docker model run hf.co/moonshotai/Kimi-K2.7-Code
Consumer sized versions? (26B A3B Versions)
I was wondering if it would be possible for moonshootai to create consumer GPU versions of the kimi model? This model is so good, it is sad that we don't get smaller versions for consumers to be able to run themselves. Right now this class of model seems to be only usable if you have enterprise funding or are wealthy with disposable income.
I know this isn't the typical pattern of releases you guys do, but it would be really appreciated. π
Even a version a quarter of the size would at least allow running it on small enterprises setups. I don't have dozens of H200 lying arround, even in our small DC.
I was wondering if it would be possible for moonshootai to create consumer GPU versions of the kimi model? This model is so good, it is sad that we don't get smaller versions for consumers to be able to run themselves. Right now this class of model seems to be only usable if you have enterprise funding or are wealthy with disposable income.
I know this isn't the typical pattern of releases you guys do, but it would be really appreciated. π
You already have Qwen 3.6 models.
@daniel-dona to be fair, no 135b or 397b sized Qwen models were released last time, which I'd love to have. This is kind of a forgotten size right now. You either have 40b-ish models (and lower), or 750b and above.
The middle is empty.
