Havmand committed
Commit 6113c56
Parent(s): c7dfa15

initial release of minillama

Files changed (4)
  1. .gitattributes +1 -0
  2. README.md +56 -0
  3. minillama.gguf +3 -0
  4. training.txt +1 -0
.gitattributes CHANGED
@@ -1,6 +1,7 @@
  *.7z filter=lfs diff=lfs merge=lfs -text
  *.arrow filter=lfs diff=lfs merge=lfs -text
  *.bin filter=lfs diff=lfs merge=lfs -text
+ *.gguf filter=lfs diff=lfs merge=lfs -text
  *.bz2 filter=lfs diff=lfs merge=lfs -text
  *.ckpt filter=lfs diff=lfs merge=lfs -text
  *.ftz filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,59 @@
  ---
+ inference: true
+ language:
+ - en
  license: mit
+ model_creator: Mads Havmand
+ model_name: minillama
+ model_type: llama
+ quantized_by: Havmand
+ tags:
+ - llama
+ - test
+ - development
  ---
+
+ # minillama
+
+ - Model creator: [Mads Havmand](https://huggingface.co/Havmand)
+
+ ## Description
+
+ minillama is a minimal Large Language Model that uses the Llama architecture and is distributed in the GGUF format.
+
+ The purpose of the model is to be as small as possible while still technically qualifying as a model that llama.cpp can load without error.
+ I originally created this model because I needed a small model for my unit tests of Python code that used llama-cpp-python.
+
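+ The kind of test this enables looks roughly like the sketch below (illustrative, not part of this repository; it assumes llama-cpp-python is installed and `minillama.gguf` sits in the working directory):
+
+ ```python
+ # test_minillama.py: a minimal smoke test for code built on llama-cpp-python.
+ from llama_cpp import Llama
+
+ def test_model_loads_and_generates():
+     llm = Llama(model_path="minillama.gguf", verbose=False)
+     out = llm(" ", max_tokens=4)  # output quality is irrelevant here
+     assert "choices" in out
+ ```
+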
+ The model __can technically__ be used for inference, but the output it produces is as close to useless as you can get.
+ Throughput is nice, though: around 1000 tokens per second on an Apple M2 Pro.
+
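+ That figure can be reproduced with a rough timing sketch like this one (again assuming llama-cpp-python; numbers will vary with hardware and build flags):
+
+ ```python
+ import time
+ from llama_cpp import Llama
+
+ llm = Llama(model_path="minillama.gguf", verbose=False)
+ start = time.perf_counter()
+ out = llm(" ", max_tokens=256)
+ elapsed = time.perf_counter() - start
+ print(out["usage"]["completion_tokens"] / elapsed, "tokens/s")
+ ```
+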
+ To reduce file size, the model is quantized using Q2_K.
+
+ The model contains 4.26 million parameters and is 3.26 MiB.
+
+ As for the vocabulary, the model uses the llama vocabulary provided by [llama.cpp](https://github.com/ggerganov/llama.cpp/blob/97c1549808d2742d37584a3c9df28154bdf34417/models/ggml-vocab-llama.gguf) (SHA512: `38a5acf305050422882044df0acc97e5ae992ed19b2838b3b58ebbbb1f61c59bfc12a6f686a724aed32227045806e4dd46aadf9822155d1169455fa56d38fbc2`).
+
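+ To verify that a local copy of the vocabulary file matches, a standard-library check along these lines works (it assumes the file was downloaded next to the script):
+
+ ```python
+ import hashlib
+
+ EXPECTED = (
+     "38a5acf305050422882044df0acc97e5ae992ed19b2838b3b58ebbbb1f61c59b"
+     "fc12a6f686a724aed32227045806e4dd46aadf9822155d1169455fa56d38fbc2"
+ )
+ with open("ggml-vocab-llama.gguf", "rb") as f:
+     assert hashlib.sha512(f.read()).hexdigest() == EXPECTED
+ ```
+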
+ The training corpus consists of a space and a newline:
+
+ ```hexdump
+ 00000000  20 0a  | .|
+ 00000002
+ ```
+
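+ Recreating the corpus is a matter of writing those two bytes:
+
+ ```python
+ # Write training.txt: a single space followed by a newline.
+ with open("training.txt", "wb") as f:
+     f.write(b" \n")
+ ```
+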
+ Finally, the model was built using llama.cpp's `train-text-from-scratch` (from commit [97c1549808d2742d37584a3c9df28154bdf34417](https://github.com/ggerganov/llama.cpp/tree/97c1549808d2742d37584a3c9df28154bdf34417)). The command used was:
+
+ ```sh
+ ./train-text-from-scratch \
+     --vocab-model models/ggml-vocab-llama.gguf \
+     --ctx 1 --embd 64 --head 1 --layer 1 \
+     --checkpoint-in chk-minillama-LATEST.gguf \
+     --checkpoint-out chk-minillama-ITERATION.gguf \
+     --model-out ggml-minillama-f32-ITERATION.gguf \
+     --train-data "training.txt" \
+     -t 6 -b 16 --seed 1 --adam-iter 1 \
+     --no-checkpointing
+ ```
+
+ Quantization happened using `./quantize ggml-minillama-f32-LATEST.gguf 10`, where `10` is llama.cpp's ftype id for Q2_K.
+
+ These files were quantized using hardware kindly provided by me.
+
minillama.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b75b6525e450d261d47552b9ba1ddda669889371526082cde7bb6a2d114efc3b
+ size 4139456
training.txt ADDED
@@ -0,0 +1 @@
+ 