# What Works

| Loader         | Loading 1 LoRA | Loading 2 or more LoRAs | Training LoRAs | Multimodal extension | Perplexity evaluation |
|----------------|----------------|-------------------------|----------------|----------------------|-----------------------|
| Transformers   | ✅             | ✅\*\*\*                | ✅\*           | ✅                   | ✅                    |
| llama.cpp      | ❌             | ❌                      | ❌             | ❌                   | use llamacpp_HF       |
| llamacpp_HF    | ❌             | ❌                      | ❌             | ❌                   | ✅                    |
| ExLlamav2_HF   | ✅             | ✅                      | ❌             | ❌                   | ✅                    |
| ExLlamav2      | ✅             | ✅                      | ❌             | ❌                   | use ExLlamav2_HF      |
| AutoGPTQ       | ✅             | ❌                      | ❌             | ✅                   | ✅                    |
| AutoAWQ        | ?              | ❌                      | ?              | ?                    | ✅                    |
| GPTQ-for-LLaMa | ✅\*\*         | ✅\*\*\*                | ✅             | ✅                   | ✅                    |
| ctransformers  | ❌             | ❌                      | ❌             | ❌                   | ❌                    |
| QuIP#          | ?              | ?                       | ?              | ?                    | ✅                    |
| HQQ            | ?              | ?                       | ?              | ?                    | ✅                    |

❌ = not implemented

βœ… = implemented

\* Training LoRAs with GPTQ models also works with the Transformers loader. Make sure to check "auto-devices" and "disable_exllama" before loading the model.
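
Outside the web UI, those two checkboxes map onto the `transformers` API roughly as in the sketch below. The checkpoint ID is a placeholder, and depending on your `transformers` version the flag may be spelled `use_exllama=False` instead of `disable_exllama=True`:

```python
# Minimal sketch: loading a GPTQ model with the Transformers loader
# settings described above. The model ID is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "TheBloke/Llama-2-7B-GPTQ"  # placeholder GPTQ checkpoint

# disable_exllama=True mirrors the "disable_exllama" checkbox.
quantization_config = GPTQConfig(bits=4, disable_exllama=True)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # mirrors the "auto-devices" checkbox
    quantization_config=quantization_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```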

\*\* Requires the monkey-patch. The instructions can be found here.

\*\*\* Multi-LoRA in PEFT is tricky and the current implementation does not work reliably in all cases.
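
For context on where this fragility comes from, loading more than one LoRA in PEFT itself follows roughly the pattern below. This is a minimal sketch with placeholder model and adapter paths, not the web UI's actual implementation:

```python
# Minimal sketch of attaching two LoRAs to one base model with PEFT.
# The base model ID and adapter paths/names are placeholders.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Attach the first adapter and give it a name.
model = PeftModel.from_pretrained(base, "path/to/lora-a", adapter_name="lora_a")

# Attach a second adapter to the same wrapped model.
model.load_adapter("path/to/lora-b", adapter_name="lora_b")

# Switch the active adapter; combining or hot-swapping adapters is the
# step that can behave unreliably depending on the PEFT version.
model.set_adapter("lora_b")
```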