Support for TensorRT-LLM

#1
by ahad992 - opened

Can I run this model on TensorRT-LLM?

Hi, thanks for your interest! These model files are specifically for llama.cpp-based or GGUF-compatible clients.

As far as I'm aware, TensorRT-LLM is not among those. From what I can tell, though, it should be able to run the original (unquantized) model, although you may need to quantize it yourself depending on whether you have enough VRAM to fully offload it!
