Triton inference

by SinanAkkoyun - opened May 26, 2023

Hi! What is the fastest inference code available right now? Also, can this be used with NVIDIA's FasterTransformer inference code?

Technology Innovation Institute org May 30, 2023

There is an upcoming integration in text-generation-inference that should be lightning fast: https://github.com/huggingface/text-generation-inference/pull/379 :)

FalconLLM changed discussion status to closed May 30, 2023
