Custom 4-bit finetuning: 5-7 times faster inference than QLoRA

#9 · opened by rmihaylov · pinned by FalconLLM

Loading the model went from 5-10 minutes down to about 30 seconds.
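
Much of that difference likely comes down to whether the 4-bit weights are produced at load time (the QLoRA/bitsandbytes path, which reads the full-precision shards and quantizes them on the fly) or read directly from an already-quantized checkpoint. A minimal sketch of the two load paths, using the stock transformers/bitsandbytes and AutoGPTQ APIs; the checkpoint names and settings below are illustrative assumptions, not taken from this thread or from the repo's own loader:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Path 1: QLoRA-style load -- the full-precision shards are read and then
# quantized to NF4 on the fly, which is where most of the minutes tend to go.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model_qlora = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",               # assumed base checkpoint
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Path 2: load a checkpoint that is already quantized to 4 bits (GPTQ-style),
# so only the packed int4 tensors have to be read from disk.
from auto_gptq import AutoGPTQForCausalLM

model_4bit = AutoGPTQForCausalLM.from_quantized(
    "someuser/falcon-7b-4bit-gptq",   # hypothetical pre-quantized repo id
    device="cuda:0",
    use_safetensors=True,
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b", trust_remote_code=True)
```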
