Nbardy committed on
Commit 8ebf499
1 Parent(s): f8304d0

Update README.md

Files changed (1)
  1. README.md +4 -3
README.md CHANGED
@@ -7,13 +7,14 @@ language:
  - en
  ---
  Micro Mistral
- This is a small mistral model with 6 layers
 
- This architecture takes GQA and tied embeddings to create an efficient 0.5B model that uses the mistral architecture (it is supported in downstream applications).
+ A small version of mistral.
 
- Uses GQA, tied embeddings, and sliding window attention.
+ Similar to some of the small llama variants, but uses GQA, tied embeddings, and sliding window attention.
 
  Dataset
  Minipile Instruct Math OpenOrca Synthetic Data
 
+
+
  TODO: Complete Dataset section