petermca committed on
Commit
0bde075
1 Parent(s): f6b63ff

Update config.json for flan-t5-small

I believe the num_heads and num_layers values are swapped for google/flan-t5-small. See the config for t5-small (link below), which flan-t5-small is based on. With the current values, the model's hidden size is not divisible by the number of attention heads (512 % 6 = 2).

https://huggingface.co/t5-small/blob/df1b051c49625cf57a3d0d8d3863ed4d13564fe4/config.json#L16
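A quick sanity check of the divisibility argument (the variable names below are illustrative; `d_model` is the hidden size from the config):

```python
# Multi-head attention splits the hidden dimension evenly across heads,
# so d_model must be divisible by num_heads.
d_model = 512  # "d_model" in flan-t5-small's config.json

swapped_heads = 6    # current (apparently swapped) value
corrected_heads = 8  # value in t5-small's config.json

print(d_model % swapped_heads)    # 2 -> not divisible, invalid head count
print(d_model % corrected_heads)  # 0 -> divides evenly
```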

Files changed (1)
  1. config.json +2 -2
config.json CHANGED
@@ -15,8 +15,8 @@
   "model_type": "t5",
   "n_positions": 512,
   "num_decoder_layers": 8,
- "num_heads": 6,
- "num_layers": 8,
+ "num_heads": 8,
+ "num_layers": 6,
   "output_past": true,
   "pad_token_id": 0,
   "relative_attention_max_distance": 128,