peruginia committed · Commit 447d953 · verified · 1 Parent(s): d7e183d

Update README.md

Files changed (1): README.md +7 -10
README.md CHANGED
@@ -2,6 +2,7 @@
 language:
 - it
 pipeline_tag: text-generation
+max_length: 100
 widget:
 - text: Alessandro è un ragazzo che progetta Infissi
 - text: Melissa è una ragazza che adora
@@ -19,21 +20,17 @@ More precise versions will be published shortly.
 Train on my server, i have studied and adapted the model starting from the repository https://github.com/karpathy/llama2.c
 
 # max_seq_len: 7b = 2048: It represents the maximum sequence length for input data.
-max_seq_len = 1024 #7b=2048
-
 # dim 7b= 4096: This attribute represents the dimensionality of the model
-dim = 768
-
 # n_layers: 7b = 32: It specifies the number of layers in the model
-n_layers = 32
-
 # n_heads: 7b = 32: This attribute determines the number of attention heads in the model
-n_heads = 32
-
 # n_kv_heads: 7b = 32: It represents the number of key and value heads,
-n_kv_heads = 32
-
 # multiple_of: 7b = 256: It specifies a value used to make the SwiGLU hidden layer size a multiple of a large power of 2
+
+max_seq_len = 1024
+dim = 768
+n_layers = 32
+n_heads = 32
+n_kv_heads = 32
 multiple_of = 32
 
 num decayed parameter tensors: 225, with 251,068,416 parameters
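The commented hyperparameters in this README line up with the `ModelArgs` dataclass in llama2.c's `model.py`. As a minimal sketch (not part of the commit itself), assuming that repo's field names and the stock Llama 2 vocabulary size of 32,000 tokens, the configuration would be instantiated like this:

```python
# Hypothetical instantiation of the README's hyperparameters via llama2.c.
# Requires https://github.com/karpathy/llama2.c on the Python path;
# vocab_size=32000 is an assumption (the default Llama 2 tokenizer),
# not stated in the diff above.
from model import ModelArgs, Transformer

args = ModelArgs(
    dim=768,           # model width (Llama-2-7B uses 4096)
    n_layers=32,       # number of transformer blocks (same depth as 7B)
    n_heads=32,        # attention heads (7B: 32)
    n_kv_heads=32,     # key/value heads; equal to n_heads, so plain MHA, no GQA
    vocab_size=32000,  # assumed: default Llama 2 SentencePiece vocabulary
    multiple_of=32,    # SwiGLU hidden size rounded to a multiple of 32 (7B: 256)
    max_seq_len=1024,  # maximum input sequence length (7B: 2048)
)
model = Transformer(args)
```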
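The closing log line (`num decayed parameter tensors: 225, with 251,068,416 parameters`) is consistent with these values. A standalone sanity check, assuming llama2.c's sizing rules (SwiGLU hidden size of `int(2 * 4 * dim / 3)` rounded up to a multiple of `multiple_of`, output head tied to the token embedding, and the assumed `vocab_size` of 32000):

```python
# Reproduce the parameter count reported in the README from the config above.
dim, n_layers, multiple_of, vocab_size = 768, 32, 32, 32000

# SwiGLU hidden size, as llama2.c computes it when hidden_dim is left unset
hidden = int(2 * (4 * dim) / 3)                                     # 2048
hidden = multiple_of * ((hidden + multiple_of - 1) // multiple_of)  # still 2048

attn = 4 * dim * dim        # wq, wk, wv, wo (n_kv_heads == n_heads, so full-size wk/wv)
ffn = 3 * dim * hidden      # w1, w2, w3
emb = vocab_size * dim      # token embedding, tied with the output head

params = n_layers * (attn + ffn) + emb
tensors = n_layers * 7 + 1  # 7 weight matrices per layer + 1 embedding table

print(f"{params:,} parameters in {tensors} decayed tensors")
# -> 251,068,416 parameters in 225 decayed tensors
```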