## What Works

| Loader         | Loading 1 LoRA | Loading 2 or more LoRAs | Training LoRAs | Multimodal extension | Perplexity evaluation |
|----------------|----------------|-------------------------|----------------|----------------------|-----------------------|
| Transformers   |       ✅       |           ✅\*\*\*      |       ✅\*     |          ✅          |           ✅          |
| llama.cpp      |       ❌       |           ❌            |       ❌       |          ❌          |    use llamacpp_HF    |
| llamacpp_HF    |       ❌       |           ❌            |       ❌       |          ❌          |           ✅          |
| ExLlamav2_HF   |       ✅       |           ✅            |       ❌       |          ❌          |           ✅          |
| ExLlamav2      |       ✅       |           ✅            |       ❌       |          ❌          |   use ExLlamav2_HF    |
| AutoGPTQ       |       ✅       |           ❌            |       ❌       |          ✅          |           ✅          |
| AutoAWQ        |       ?        |           ❌            |       ?        |          ?           |           ✅          |
| GPTQ-for-LLaMa |       ✅\*\*   |           ✅\*\*\*      |       ✅       |          ✅          |           ✅          |
| ctransformers  |       ❌       |           ❌            |       ❌       |          ❌          |           ❌          |
| QuIP#          |       ?        |           ?             |       ?        |          ?           |           ✅          |
| HQQ            |       ?        |           ?             |       ?        |          ?           |           ✅          |

❌ = not implemented

✅ = implemented

\* Training LoRAs with GPTQ models also works with the Transformers loader. Make sure to check "auto-devices" and "disable_exllama" before loading the model.

\*\* Requires the monkey-patch. The instructions can be found [here](https://github.com/oobabooga/text-generation-webui/wiki/08-%E2%80%90-Additional-Tips#using-loras-with-gptq-for-llama).

\*\*\* Multi-LoRA in PEFT is tricky and the current implementation does not work reliably in all cases.
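
For scripting against the table, a few of its rows can be encoded as plain data and queried. This is only an illustrative sketch: the `SUPPORT` dictionary, its feature keys, and the `loaders_supporting` helper are hypothetical names made up here, not part of the web UI, and only three rows are transcribed. `None` stands for a "?" cell.

```python
# Hypothetical transcription of three rows from the table above.
# True = implemented, False = not implemented, None = "?" in the table.
SUPPORT = {
    "Transformers": {
        "one_lora": True, "multi_lora": True, "training": True,
        "multimodal": True, "perplexity": True,
    },
    "ExLlamav2_HF": {
        "one_lora": True, "multi_lora": True, "training": False,
        "multimodal": False, "perplexity": True,
    },
    "AutoAWQ": {
        "one_lora": None, "multi_lora": False, "training": None,
        "multimodal": None, "perplexity": True,
    },
}


def loaders_supporting(feature: str) -> list[str]:
    """Return the loaders whose cell for `feature` is marked as implemented."""
    return sorted(
        name for name, cells in SUPPORT.items() if cells.get(feature) is True
    )


if __name__ == "__main__":
    print(loaders_supporting("multi_lora"))
    print(loaders_supporting("training"))
```

Note that a `True` entry does not capture the footnotes: multi-LoRA with the Transformers loader is still subject to the reliability caveat marked \*\*\* above.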