Support for TensorRT-LLM

#1
by ahad992 - opened

Can I run this model on TensorRT-LLM?

Hi, thanks for your interest! These model files are specifically for llama.cpp-based or GGUF-compatible clients.

As far as I'm aware, TensorRT-LLM is not among those. From what I can tell, though, it should be able to run the original (unquantized) model, although you may need to quantize it yourself depending on whether you have enough VRAM to fully offload it!
