osanseviero danielhanchen committed on
Commit 1937c70
1 Parent(s): 32e4f3e

9B - query_pre_attn_scalar = 256 not 224 (#26)

- 9B - query_pre_attn_scalar = 256 not 224 (3986e2e7f61420f53a9b90d39c88dd462fc8cc51)


Co-authored-by: Daniel Han-Chen <danielhanchen@users.noreply.huggingface.co>

Files changed (1)
  1. config.json +1 -1
config.json CHANGED
@@ -21,7 +21,7 @@
   "num_hidden_layers": 42,
   "num_key_value_heads": 8,
   "pad_token_id": 0,
-  "query_pre_attn_scalar": 224,
+  "query_pre_attn_scalar": 256,
   "rms_norm_eps": 1e-06,
   "rope_theta": 10000.0,
   "sliding_window": 4096,
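
To gauge the size of this change, here is a minimal sketch. It assumes the model code turns query_pre_attn_scalar into an attention-logit multiplier of 1/sqrt(query_pre_attn_scalar); that convention is not stated in this commit, so check it against the modeling code. The helper name attention_scale is hypothetical and used only for illustration.

```python
import math


def attention_scale(query_pre_attn_scalar: float) -> float:
    """Multiplier applied to q.k logits, ASSUMING the 1/sqrt(scalar) convention."""
    return 1.0 / math.sqrt(query_pre_attn_scalar)


old_scale = attention_scale(224)  # value before this commit
new_scale = attention_scale(256)  # value after this commit

# Ratio of old to new logit magnitude; a value above 1 means the old
# config produced larger pre-softmax logits than the corrected one.
ratio = old_scale / new_scale

print(f"old scale: {old_scale:.6f}")
print(f"new scale: {new_scale:.6f}")
print(f"old/new ratio: {ratio:.4f}")
```

Under that assumption the old value inflates pre-softmax logits by a factor of sqrt(256/224), which may be enough to shift outputs noticeably over 42 layers.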