afrideva committed on
Commit
d2206a6
1 Parent(s): b144ed9

Upload README.md with huggingface_hub

  1. README.md +65 -0
README.md ADDED
@@ -0,0 +1,65 @@
+ ---
+ base_model: Maykeye/TinyLLama-v0
+ inference: false
+ license: apache-2.0
+ model_creator: Maykeye
+ model_name: TinyLLama-v0
+ pipeline_tag: text-generation
+ quantized_by: afrideva
+ tags:
+ - gguf
+ - ggml
+ - quantized
+ - q2_k
+ - q3_k_m
+ - q4_k_m
+ - q5_k_m
+ - q6_k
+ - q8_0
+ ---
+ # Maykeye/TinyLLama-v0-GGUF
+
+ Quantized GGUF model files for [TinyLLama-v0](https://huggingface.co/Maykeye/TinyLLama-v0) from [Maykeye](https://huggingface.co/Maykeye).
+
+
+ | Name | Quant method | Size |
+ | ---- | ---- | ---- |
+ | [tinyllama-v0.fp16.gguf](https://huggingface.co/afrideva/TinyLLama-v0-GGUF/resolve/main/tinyllama-v0.fp16.gguf) | fp16 | 11.08 MB |
+ | [tinyllama-v0.q2_k.gguf](https://huggingface.co/afrideva/TinyLLama-v0-GGUF/resolve/main/tinyllama-v0.q2_k.gguf) | q2_k | 5.47 MB |
+ | [tinyllama-v0.q3_k_m.gguf](https://huggingface.co/afrideva/TinyLLama-v0-GGUF/resolve/main/tinyllama-v0.q3_k_m.gguf) | q3_k_m | 5.63 MB |
+ | [tinyllama-v0.q4_k_m.gguf](https://huggingface.co/afrideva/TinyLLama-v0-GGUF/resolve/main/tinyllama-v0.q4_k_m.gguf) | q4_k_m | 5.79 MB |
+ | [tinyllama-v0.q5_k_m.gguf](https://huggingface.co/afrideva/TinyLLama-v0-GGUF/resolve/main/tinyllama-v0.q5_k_m.gguf) | q5_k_m | 5.95 MB |
+ | [tinyllama-v0.q6_k.gguf](https://huggingface.co/afrideva/TinyLLama-v0-GGUF/resolve/main/tinyllama-v0.q6_k.gguf) | q6_k | 6.72 MB |
+ | [tinyllama-v0.q8_0.gguf](https://huggingface.co/afrideva/TinyLLama-v0-GGUF/resolve/main/tinyllama-v0.q8_0.gguf) | q8_0 | 6.75 MB |
+
+
+
+ ## Original Model Card:
+ This is a first attempt at recreating roneneldan/TinyStories-1M using the Llama architecture.
+
+ * The full training process is included in the notebook train.ipynb. Recreating it is as simple as downloading
+ TinyStoriesV2-GPT4-train.txt and TinyStoriesV2-GPT4-valid.txt into the same folder as the notebook and running
+ the cells. Validation content is not used by the script, so you can put anything in that file.
+
+ * The backup directory has a script, do_backup, that I used to copy weights from the remote machine to the local one.
+ Weights are generated too quickly, so by the time the script copied weight N+1
+
+ * This is very much a PoC version. Training truncates stories that are longer than the context size and doesn't use
+ any sliding window to train on a story from anywhere other than its start.
+
+ * Training took approximately 9 hours (3 hours per epoch) on a 40GB A100. ~30GB of VRAM was used.
+
+ * I use the tokenizer from open_llama_3b. However, I had trouble with it locally (https://github.com/openlm-research/open_llama/issues/69).
+ I had no trouble on the cloud machine with preinstalled libraries.
+
+ * The demo script is demo.py.
+
+ * A validation script is provided: valid.py. Use it like `python valid.py path/to/TinyStoriesV2-GPT4-valid.txt [optional-model-id-or-path]`.
+ After training I decided that it's not necessary to break validation into chunks.
+
+ * This version also uses a very naive caching mechanism to shuffle stories for training: it keeps a cache of the N most recently loaded chunks,
+ so if the random shuffle asks for a story, it either uses the cache or loads the chunk.
+ The training dataset is too small to need this, so I will get rid of it in the next versions.
+
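The truncation note in the list above can be made concrete with a small comparison. The sketch below works on plain lists of token IDs; `truncate`, `sliding_windows`, `context_size`, and `stride` are hypothetical names chosen for illustration. According to the card, this version only does the truncation; the sliding-window variant is shown as the alternative it does not use.

```python
from typing import List


def truncate(tokens: List[int], context_size: int) -> List[int]:
    """Keep only the first context_size tokens of a story."""
    return tokens[:context_size]


def sliding_windows(tokens: List[int], context_size: int,
                    stride: int) -> List[List[int]]:
    """Split a long story into overlapping windows of context_size tokens."""
    if context_size <= 0 or stride <= 0:
        raise ValueError("context_size and stride must be positive")
    if len(tokens) <= context_size:
        return [tokens]
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + context_size])
        if start + context_size >= len(tokens):
            break  # this window already reaches the end of the story
        start += stride
    return windows


story = list(range(10))  # stand-in for a tokenized story
print(truncate(story, 4))
print(sliding_windows(story, 4, 3))
```

With truncation, tokens past the context size never reach the model; with windows, the tail of each story is also seen, at the cost of more training samples.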
+
+ from transformers import AutoModelForCausalLM, AutoTokenizer