GGUF Model, Can I split it?
Based on NyxKrage's LLM VRAM calculator
Max Allocated VRAM
Model (unquantized)
(Model picker: lists non-quantized repos only; GGUF, AWQ, GPTQ, and exl2 variants are filtered out.)
Context Size
Quantization Size: Q4_K_S
Batch Size
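The Context Size input mostly drives the KV-cache share of the estimate. The sketch below (TypeScript) shows the usual way that footprint is computed; the architecture fields (nLayers, nKvHeads, headDim), the fp16 cache, and the example numbers are assumptions for illustration, not the calculator's exact formula.

```typescript
// Rough KV-cache estimate for a given context length.
// ASSUMPTION: one K and one V tensor per layer, fp16 cache (2 bytes/element);
// the architecture numbers below are illustrative, not tied to a specific model.
interface ModelArch {
  nLayers: number;  // transformer blocks
  nKvHeads: number; // key/value heads (grouped-query attention aware)
  headDim: number;  // dimension per attention head
}

function kvCacheBytes(arch: ModelArch, contextSize: number, bytesPerElem = 2): number {
  // K and V per layer, each of shape [contextSize, nKvHeads * headDim]
  return 2 * arch.nLayers * contextSize * arch.nKvHeads * arch.headDim * bytesPerElem;
}

// Example: a hypothetical 7B-class model at a 32k context
const arch: ModelArch = { nLayers: 32, nKvHeads: 8, headDim: 128 };
console.log((kvCacheBytes(arch, 32768) / 1024 ** 3).toFixed(2) + " GB"); // "4.00 GB"
```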
Model Size (GB): 4.20
Context Size (GB): 6.90
Total Size (GB): 420.69
Layer Size (GB): 42.69
Layers offloaded to GPU: 42
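The last line, layers offloaded to GPU, is the number you would pass to llama.cpp as -ngl (--n-gpu-layers). Below is a minimal sketch of one way to derive it from the per-layer size, the context memory, and the VRAM budget; the 0.5 GB overhead reserve and the equal-size-layers assumption are simplifications of mine, not necessarily what this calculator does.

```typescript
// Sketch: how many whole layers fit on the GPU (what llama.cpp takes as -ngl),
// given a per-layer size, the context/KV memory, and the VRAM budget.
// ASSUMPTIONS: layers are roughly equal in size, the KV cache lives in VRAM,
// and 0.5 GB is reserved for driver/compute buffers.
function layersOffloaded(
  totalLayers: number,
  layerSizeGb: number,
  contextGb: number,
  vramGb: number,
  overheadGb = 0.5,
): number {
  const usableGb = vramGb - contextGb - overheadGb;
  if (usableGb <= 0) return 0;
  return Math.min(totalLayers, Math.floor(usableGb / layerSizeGb));
}

// Example: 33 layers of ~0.13 GB each (a ~4.2 GB Q4 model), 6.9 GB of context, 12 GB card
console.log(layersOffloaded(33, 0.13, 6.9, 12)); // 33 -> the whole model fits on the GPU
```

If the result is smaller than the model's total layer count, the remaining layers stay in system RAM and run on the CPU, which is the "split" the page title refers to.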