FP8 version of LocateAnything.

Made by https://github.com/WuNein/LocateAnything-vLLM

Serve it with vLLM:

vllm serve locate_qwen2_model --tensor-parallel-size 1 --gpu-memory-utilization 0.5 --kv-cache-dtype auto --max-model-len 16384 --max-num-seqs 32 --max-cudagraph-capture-size 32 --enable-prompt-embeds
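
Once the server is up, a quick smoke test can confirm that it answers requests. The sketch below is illustrative only and rests on several assumptions not stated in this card: that the server listens on vLLM's default address `http://localhost:8000`, that the served model name defaults to the path passed to `vllm serve` (`locate_qwen2_model`), and that a plain-text prompt is accepted. The helper names `build_completion_request` and `complete` are hypothetical, and the actual LocateAnything prompt format (or prompt-embedding input, given `--enable-prompt-embeds`) is not covered here.

```python
import json
import urllib.request

# Assumptions: default vLLM host/port, and a served model name equal to
# the path given to `vllm serve`. Adjust both if your setup differs.
BASE_URL = "http://localhost:8000/v1"
MODEL_NAME = "locate_qwen2_model"


def build_completion_request(prompt: str, max_tokens: int = 128) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /completions endpoint."""
    payload = {
        "model": MODEL_NAME,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,
    }
    return urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str) -> str:
    """Send the request to a running server and return the generated text."""
    with urllib.request.urlopen(build_completion_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With the server running, `print(complete("Hello"))` should print a short completion. The prompt here is a placeholder, so this only checks that the endpoint responds, not that localization output is meaningful.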

Model size: 3B params
Tensor type: BF16, F8_E4M3
Base model: Qwen/Qwen2.5-3B (shigureui/LocateAnything-Qwen2-FP8 is a quantized derivative)