Update README.md

README.md CHANGED

@@ -31,7 +31,6 @@ Quantized with Exllamav2 0.0.11 with default dataset.

 I tried to load the 4bpw version of the model in Text-Generation-WebUI, but it didn't set RoPE scaling automatically despite it being defined in the config file.
 With high context the model starts writing gibberish when RoPE scaling isn't set, so I checked it with 4x compress_pos_emb for a 32k max context, and it was able to retrieve details from a 16,000-token prompt.
 With my 12 GB VRAM GPU I could load the model with about 30,000 tokens of context, or 32,768 tokens with the 8-bit cache option.
-It's the first Yarn model that worked for me; perhaps other Yarn models require setting RoPE scaling manually too.

 ## How to run
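The manual workaround described above can be sketched as a Text-Generation-WebUI launch command. This is a hedged example, not part of the original README: the model folder name `Yarn-model-4bpw` is hypothetical, and the flags assume a webui version that supports the ExLlamaV2 loader and the 8-bit cache option.

```shell
# Set 4x RoPE compression (compress_pos_emb) manually for 32k context,
# since the webui does not apply it from the model's config file.
# --cache_8bit enables the 8-bit cache so 32,768 tokens fit in 12 GB VRAM.
# "Yarn-model-4bpw" is a placeholder for the local model directory.
python server.py --loader exllamav2 --model Yarn-model-4bpw \
    --max_seq_len 32768 --compress_pos_emb 4 --cache_8bit
```

The same settings can be applied from the webui's Model tab (loader ExLlamav2, max_seq_len 32768, compress_pos_emb 4, cache_8bit checked) before loading the model.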