How to deploy?

#1
by LetsJumP - opened

Seems great!
Any deployment guidance?

It works with vLLM just like MiniMax itself. You might need to pass --tokenizer MiniMaxAI/MiniMax-M2.7 for it to work; it might also work without it, but I haven't tested that!
For 8x RTX PRO 6000 I use:

SAFETENSORS_FAST_GPU=1 \
vllm serve selimaktas/MiniMax-M2.75-460B-A20B \
    --tokenizer MiniMaxAI/MiniMax-M2.7 \
    --tensor-parallel-size 4 \
    --data-parallel-size 2 \
    --enable-expert-parallel \
    --gpu-memory-utilization 0.92 \
    --enable-auto-tool-choice \
    --tool-call-parser minimax_m2 \
    --reasoning-parser minimax_m2 \
    --trust-remote-code
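
A quick sanity check on the parallelism flags above, as a minimal sketch: vLLM needs one GPU per tensor-parallel rank in each data-parallel replica, so the two sizes should multiply out to the GPU count. The variable names (TP, DP, WORLD_SIZE) are just for illustration and are not vLLM settings.

```shell
# Values copied from the command above.
TP=4   # --tensor-parallel-size
DP=2   # --data-parallel-size

# Total GPUs the launch will occupy: one per TP rank in each DP replica.
WORLD_SIZE=$((TP * DP))
echo "GPUs required: ${WORLD_SIZE}"
```

On a different GPU count, adjust the two sizes so their product matches the number of GPUs you want to use.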

Any possibility of an NVFP4 version?
