# Jamba
- ✅ qlora w/ deepspeed Zero-2 needs at least 2x GPUs and
  - 35GiB VRAM per GPU w/ minimal context length
  - 56GiB VRAM per GPU (w/ multipack enabled)
- ✅ qlora w/ deepspeed Zero-3 needs at least 2x GPUs and 67GiB VRAM (wtf?)
- ✅ qlora single-gpu, ~51GiB VRAM
- ✅ multipack
- ❓ FSDP
- ❓ 8-bit LoRA