TheBloke committed
Commit d1603ef
1 Parent(s): 2d647fa

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -695,17 +695,17 @@ I did this using the simple *nix command `split`.
 
 To join the files on any *nix system, run:
 ```
- cat gptq_model-4bit--1g.split* > gptq_model-4bit--1g.safetensors
+ cat gptq_model-4bit--1g.JOINBEFOREUSE.split-*.safetensors > gptq_model-4bit--1g.safetensors
 ```
 
 To join the files on Windows, open a Command Prompt and run:
 ```
- COPY /B gptq_model-4bit--1g.splitaa + gptq_model-4bit--1g.splitab + gptq_model-4bit--1g.splitac gptq_model-4bit--1g.safetensors
+ COPY /B gptq_model-4bit--1g.JOINBEFOREUSE.split-a.safetensors + gptq_model-4bit--1g.JOINBEFOREUSE.split-b.safetensors + gptq_model-4bit--1g.JOINBEFOREUSE.split-c.safetensors gptq_model-4bit--1g.safetensors
 ```
 
 The SHA256SUM of the joined file will be:
 
- Once you have the joined file, you can safely delete `gptq_model-4bit--1g.split*`.
+ Once you have the joined file, you can safely delete `gptq_model-4bit--1g.JOINBEFOREUSE.split-*.safetensors`.
 
 ## Repositories available
 
@@ -714,11 +714,11 @@ Once you have the joined file, you can safely delete `gptq_model-4bit--1g.split*`.
 
 ## Two files provided - separate branches
 
- - Main branch:
+ - Main branch: `gptq_model-4bit--1g.safetensors`
  - Group Size = None
  - Desc Act (act-order) = True
  - This version will use the least possible VRAM, and should have higher inference performance in CUDA mode
- - Branch `group_size_128g`:
+ - Branch `group_size_128g`: `gptq_model-4bit-128g.safetensors`
  - Group Size = 128g
  - Desc Act (act-order) = True
  - This version will use more VRAM, which shouldn't be a problem as it shouldn't exceed 2 x 80GB or 3 x 48GB cards.
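For reference, a minimal way to check the joined file against the SHA256SUM value listed in the README is (assuming GNU coreutils `sha256sum` is available on the *nix side):

```
# Print the SHA256 of the joined file and compare it with the value given in the README
sha256sum gptq_model-4bit--1g.safetensors
```

On Windows, `certutil -hashfile gptq_model-4bit--1g.safetensors SHA256` prints the same digest from a Command Prompt.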
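Because the two quantisations live on separate branches of the same repo, one way to fetch only the variant you want is a single-branch clone. This is an illustrative sketch rather than part of the README, and the repository URL is a placeholder for this model's actual Hugging Face repo:

```
# Clone only the group_size_128g branch (Group Size = 128g, act-order = True);
# git-lfs is needed to pull the large .safetensors file
git clone --single-branch --branch group_size_128g https://huggingface.co/<this-repo>

# Omit --branch (or use --branch main) to get the default branch (Group Size = None, act-order = True)
```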