ProphetOfBostrom committed on
Commit
7d0fc1f
1 Parent(s): 4a3634a

Create MISLEAD.md

i've always wanted a blog. this file is some notes on what i 'learned' during the HQQ quantize process.

Files changed (1)
  1. MISLEAD.md +19 -0
MISLEAD.md ADDED
@@ -0,0 +1,19 @@
+ It took me all day to figure this out. It turns out that while HQQ will go ahead and fill 180 GB of memory to do this - there's absolutely no reason for it! I did this from a slow\*\* 200 GB swap partition.
+ On the off chance someone at Mobius sees this - please don't ask transformers to load a 45B param model onto the CPU if you're not actually going to... call the model at all? It took ten minutes at SATA 2 speeds - and that was because it was padded to FP32 (CPU mode, right?).
+
+ 45 gigaweights \* 2 bytes per weight (bf16) \* 2 (the fp32/bf16 size ratio) = 180 GB of system memory allocated.
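
A quick sanity check of that arithmetic, as a rough sketch. The 45B parameter count is from the notes above. The 300 MB/s figure is my assumption for what "SATA 2 speeds" means (the nominal SATA 2 payload rate), and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the memory and load-time numbers.
PARAMS = 45e9          # "45B param model", taken from the notes
BYTES_BF16 = 2         # bf16 stores each weight in 2 bytes
BYTES_FP32 = 4         # fp32 stores each weight in 4 bytes
SATA2_MB_PER_S = 300   # ASSUMPTION: nominal SATA 2 payload rate


def footprint_gb(params: float, bytes_per_weight: int) -> float:
    """Raw weight footprint in decimal gigabytes (no overhead counted)."""
    return params * bytes_per_weight / 1e9


def read_minutes(size_gb: float, mb_per_s: float) -> float:
    """Minutes needed to stream size_gb at a sustained mb_per_s."""
    return size_gb * 1000 / mb_per_s / 60


bf16_gb = footprint_gb(PARAMS, BYTES_BF16)
fp32_gb = footprint_gb(PARAMS, BYTES_FP32)
minutes = read_minutes(fp32_gb, SATA2_MB_PER_S)

print(f"bf16 weights: {bf16_gb:.0f} GB")
print(f"fp32 weights: {fp32_gb:.0f} GB")
print(f"time to stream the fp32 copy at {SATA2_MB_PER_S} MB/s: {minutes:.0f} min")
```

If the 300 MB/s assumption holds, the fp32 copy alone accounts for roughly the ten minutes mentioned above.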
+
+ I wish I had that.
+
+ \*\*May have been zswap's fault. I'm pretty sure 200 MB/s and an idle CPU isn't the best you can hope for when you're doing sequential reads from a PCIe 4.0 x4 NVMe device? My GPU fell asleep between optimization passes. It even has a Gamer LED on it. I'll fix my sysctl next time.
+
+ + Try `$ python -i untitled.py`
+
+ having saved that script from the Mobius HF repo, because you'll be spending a while at the interactive prompt figuring out
+ + `>>> model.save_quantized("/absolute/path/noromaid")`
+
+ at the end. Trust me, quantizing something chunky and then watching Python shred it because the save directory is somehow a recursive lambda function and not a string is heartbreaking. I don't know if it was supposed to emit more than the model.pt and the config.json, but I'm taking what I can get.
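
One way to avoid that heartbreak is to check the save directory before the long quantize run starts, not after. This is only a sketch: `check_save_dir` is a hypothetical helper I made up, it uses nothing but the standard library, and the only HQQ call it refers to is the `model.save_quantized(...)` line quoted above (left as a comment so the block runs without HQQ installed).

```python
import os
from pathlib import Path


def check_save_dir(save_dir) -> str:
    """Hypothetical helper: fail fast if save_dir is not a usable directory path.

    Returns the resolved absolute path as a plain string.
    """
    # Reject lambdas, None, model objects and anything else that is not a path.
    if not isinstance(save_dir, (str, os.PathLike)):
        raise TypeError(
            f"save_dir must be a path string, got {type(save_dir).__name__}"
        )
    path = Path(save_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    # Write and delete a tiny probe file to prove the directory is writable.
    probe = path / ".write_probe"
    probe.write_text("ok")
    probe.unlink()
    return str(path)


# Intended use, before the expensive part begins:
#   save_dir = check_save_dir("/absolute/path/noromaid")
#   ... quantize ...
#   model.save_quantized(save_dir)
```

Running the check first costs milliseconds, while finding out at the end costs the whole run.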
+
+ ###### If anyone's looking to donate, I could do with an Epyc Rome and perhaps another pair of H100s? I've embedded my XMR address in attention tensors with help from a really horny embedding, so when it starts generating gibberish right before the good stuff just paste that into Feather and send me all your money. Thanks! :)
+
+ `i'm joking. that's a joke. I didn't do that.`