Committed by afrideva · Commit 040b172 · 1 Parent(s): 76e599b

Upload README.md with huggingface_hub

Files changed (1): README.md (+110 −0)
---
base_model: Felladrin/TinyMistral-248M-SFT-v3
datasets:
- OpenAssistant/oasst_top1_2023-08-25
inference: false
license: apache-2.0
model_creator: Felladrin
model_name: TinyMistral-248M-SFT-v3
pipeline_tag: text-generation
quantized_by: afrideva
tags:
- autotrain
- text-generation
- gguf
- ggml
- quantized
- q2_k
- q3_k_m
- q4_k_m
- q5_k_m
- q6_k
- q8_0
widget:
- text: '<|im_start|>user

    Write the specs of a game about trolls and warriors in a fantasy world.<|im_end|>

    <|im_start|>assistant

    The game is an adventure game that takes place on a planet, where players must
    explore their unique abilities to survive. Players can use different strategies
    such as collecting items or trading them for gold or silver coins, but they also
    need to learn how to deal with obstacles and find new ways to escape.<|im_end|>

    <|im_start|>user

    Could you tell me something curious about the Earth?<|im_end|>

    <|im_start|>assistant

    The planet is a large, rocky world with an atmosphere of 10 billion years old
    and a surface area around 25 million miles (36 million kilometers) wide.<|im_end|>

    <|im_start|>user

    What are some potential applications for quantum computing?<|im_end|>

    <|im_start|>assistant'
---
# Felladrin/TinyMistral-248M-SFT-v3-GGUF

Quantized GGUF model files for [TinyMistral-248M-SFT-v3](https://huggingface.co/Felladrin/TinyMistral-248M-SFT-v3) from [Felladrin](https://huggingface.co/Felladrin).
| Name | Quant method | Size |
| ---- | ------------ | ---- |
| [tinymistral-248m-sft-v3.fp16.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.fp16.gguf) | fp16 | 497.75 MB |
| [tinymistral-248m-sft-v3.q2_k.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.q2_k.gguf) | q2_k | 116.20 MB |
| [tinymistral-248m-sft-v3.q3_k_m.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.q3_k_m.gguf) | q3_k_m | 131.01 MB |
| [tinymistral-248m-sft-v3.q4_k_m.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.q4_k_m.gguf) | q4_k_m | 156.60 MB |
| [tinymistral-248m-sft-v3.q5_k_m.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.q5_k_m.gguf) | q5_k_m | 180.16 MB |
| [tinymistral-248m-sft-v3.q6_k.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.q6_k.gguf) | q6_k | 205.20 MB |
| [tinymistral-248m-sft-v3.q8_0.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.q8_0.gguf) | q8_0 | 265.26 MB |
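To get a sense of the disk savings each quantization level offers, the file sizes in the table above can be compared against the fp16 baseline. A small sketch (the sizes are copied verbatim from the table; the helper name is illustrative, not part of any library):

```python
# File sizes from the table above, in MB.
FP16_MB = 497.75

QUANT_SIZES_MB = {
    "q2_k": 116.20,
    "q3_k_m": 131.01,
    "q4_k_m": 156.60,
    "q5_k_m": 180.16,
    "q6_k": 205.20,
    "q8_0": 265.26,
}

def relative_size(quant: str) -> float:
    """Return a quant file's size as a fraction of the fp16 file."""
    return QUANT_SIZES_MB[quant] / FP16_MB

for name in QUANT_SIZES_MB:
    print(f"{name}: {relative_size(name):.1%} of fp16")
```

For example, q4_k_m comes in at roughly a third of the fp16 file size, which is why the k-quant middle tiers are a common default trade-off between size and quality.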
## Original Model Card:
# Locutusque's TinyMistral-248M trained on OpenAssistant TOP-1 Conversation Threads

- Base model: [Locutusque/TinyMistral-248M](https://huggingface.co/Locutusque/TinyMistral-248M/blob/90b89d18fdf27937dc04ab8a9b543c5af2991c7f/README.md)
- Dataset: [OpenAssistant/oasst_top1_2023-08-25](https://huggingface.co/datasets/OpenAssistant/oasst_top1_2023-08-25)
## Recommended Prompt Format

```
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
```
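A minimal helper for assembling that prompt from a message history might look like this (plain Python, illustrative only; the role names and special tokens come from the ChatML-style format above):

```python
def to_chatml(messages: list[dict[str, str]]) -> str:
    """Format a list of {'role': ..., 'content': ...} dicts into the
    prompt format above, ending with an open assistant turn so the
    model continues from there."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant")
    return "\n".join(parts)

prompt = to_chatml([{"role": "user", "content": "Hi there!"}])
print(prompt)
# <|im_start|>user
# Hi there!<|im_end|>
# <|im_start|>assistant
```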
## How it was trained

```ipython
%pip install autotrain-advanced

!autotrain setup

!autotrain llm \
  --train \
  --trainer "sft" \
  --model './TinyMistral-248M/' \
  --model_max_length 4096 \
  --block-size 1024 \
  --project-name 'trained-model' \
  --data-path "OpenAssistant/oasst_top1_2023-08-25" \
  --train_split "train" \
  --valid_split "test" \
  --text-column "text" \
  --lr 1e-5 \
  --train_batch_size 2 \
  --epochs 5 \
  --evaluation_strategy "steps" \
  --save-strategy "steps" \
  --save-total-limit 2 \
  --warmup-ratio 0.05 \
  --weight-decay 0.0 \
  --gradient-accumulation 8 \
  --logging-steps 10 \
  --scheduler "constant"
```
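Note that `--train_batch_size 2` combined with `--gradient-accumulation 8` means gradients are accumulated over 8 micro-batches before each optimizer step, so the effective batch size works out to:

```python
train_batch_size = 2       # --train_batch_size
gradient_accumulation = 8  # --gradient-accumulation

# Samples contributing to each optimizer update (per device).
effective_batch_size = train_batch_size * gradient_accumulation
print(effective_batch_size)  # 16
```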