puneeshkhanna committed on
Commit 0753ac9
1 Parent(s): 8b9bebc

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -17,11 +17,11 @@ Falcon3-7B-Instruct supports 4 languages (english, french, spanish, portuguese)
 
  ## Model Details
  - Architecture
- - transformer based causal decoder only architecture
+ - Transformer based causal decoder only architecture
  - 28 decoder blocks
- - grouped query attention (GQA) for faster inference: 12 query heads and 4 KV heads
- - wider head dimension: 256
- - high RoPE value to support long context understanding: 1000042
+ - Grouped query attention (GQA) for faster inference: 12 query heads and 4 KV heads
+ - Wider head dimension: 256
+ - High RoPE value to support long context understanding: 1000042
  - 32k context length
  - 131k vocab size
  - Pretrained on 14 Gigatokens of datasets comprising of web, code, STEM, high quality and mutlilingual data using 2048 H100 GPU chips
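
As a rough illustration of what the architecture figures in the Model Details list imply, the sketch below derives the GQA group size, the KV cache footprint, and the RoPE frequencies from the listed values. It is not taken from the model's code: the class and field names are hypothetical, "32k" is assumed to mean 32768 tokens, and the cache is assumed to hold 2-byte (bf16/fp16) values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionSpec:
    """Values copied from the Model Details list; names are illustrative only."""
    num_layers: int = 28          # "28 decoder blocks"
    num_query_heads: int = 12     # "12 query heads"
    num_kv_heads: int = 4         # "4 KV heads"
    head_dim: int = 256           # "head dimension: 256"
    rope_theta: float = 1000042.0 # "RoPE value ... 1000042"

    @property
    def queries_per_kv_head(self) -> int:
        # In GQA, each KV head is shared by a group of query heads.
        return self.num_query_heads // self.num_kv_heads

    def kv_cache_bytes_per_token(self, bytes_per_value: int = 2) -> int:
        # K and V each store num_kv_heads * head_dim values in every layer.
        # bytes_per_value=2 is an assumption (bf16/fp16 cache).
        return 2 * self.num_layers * self.num_kv_heads * self.head_dim * bytes_per_value

    def rope_inv_freq(self) -> list[float]:
        # Standard RoPE: one frequency per dimension pair, theta^(-2i/d).
        return [
            self.rope_theta ** (-2 * i / self.head_dim)
            for i in range(self.head_dim // 2)
        ]


spec = AttentionSpec()
context_tokens = 32 * 1024  # assumption: "32k context length" = 32768 tokens
gqa_bytes = spec.kv_cache_bytes_per_token() * context_tokens

print(f"query heads per KV head: {spec.queries_per_kv_head}")
print(f"KV cache at full context: {gqa_bytes / 2**30} GiB")
```

Under these assumptions, 4 KV heads give a full-context cache of 3.5 GiB per sequence; storing a separate KV head for each of the 12 query heads would triple that, which is the memory and bandwidth saving behind the "faster inference" claim for GQA.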