FP8-quantized version of LocateAnything (nvidia/LocateAnything-3B).

Made with https://github.com/WuNein/LocateAnything-vLLM

vllm serve locate_qwen2_model --tensor-parallel-size 1 --gpu-memory-utilization 0.5 --kv-cache-dtype auto --max-model-len 16384 --max-num-seqs 32 --max-cudagraph-capture-size 32 --enable-prompt-embeds
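
Once the server from the command above is running, a quick check that it loaded the model can be useful. The following is a minimal sketch, not part of the original instructions. It assumes the server listens on vLLM's default address `http://localhost:8000` (no `--host` or `--port` was passed) and uses the OpenAI-compatible `/v1/models` endpoint. The helper names `models_request` and `list_model_ids` are hypothetical.

```python
import json
import urllib.request

# Assumption: the server runs locally on vLLM's default port 8000.
BASE_URL = "http://localhost:8000"


def models_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request for the OpenAI-compatible model listing endpoint."""
    return urllib.request.Request(f"{base_url.rstrip('/')}/v1/models", method="GET")


def list_model_ids(base_url: str = BASE_URL, timeout: float = 5.0) -> list[str]:
    """Return the model IDs reported by a running server (needs network access)."""
    with urllib.request.urlopen(models_request(base_url), timeout=timeout) as resp:
        payload = json.load(resp)
    # The response follows the OpenAI list format: {"data": [{"id": ...}, ...]}
    return [entry["id"] for entry in payload["data"]]
```

Calling `list_model_ids()` against a live server should return a list of served model names. Since the command sets no `--served-model-name`, the name is expected to match the path given to `vllm serve` (`locate_qwen2_model`), though this is worth confirming on your own setup.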

Safetensors

Model size: 3B params

Tensor type: BF16, F8_E4M3

Model tree for shigureui/LocateAnything-Qwen2-FP8

Base model: Qwen/Qwen2.5-3B

Finetuned: Qwen/Qwen2.5-3B-Instruct

Finetuned: nvidia/LocateAnything-3B

Quantized (10): this model