---
base_model: Felladrin/TinyMistral-248M-SFT-v3
datasets:
- OpenAssistant/oasst_top1_2023-08-25
inference: false
license: apache-2.0
model_creator: Felladrin
model_name: TinyMistral-248M-SFT-v3
pipeline_tag: text-generation
quantized_by: afrideva
tags:
- autotrain
- text-generation
- gguf
- ggml
- quantized
- q2_k
- q3_k_m
- q4_k_m
- q5_k_m
- q6_k
- q8_0
widget:
- text: '<|im_start|>user

    Write the specs of a game about trolls and warriors in a fantasy world.<|im_end|>

    <|im_start|>assistant

    The game is an adventure game that takes place on a planet, where players must
    explore their unique abilities to survive. Players can use different strategies
    such as collecting items or trading them for gold or silver coins, but they also
    need to learn how to deal with obstacles and find new ways to escape.<|im_end|>

    <|im_start|>user

    Could you tell me something curious about the Earth?<|im_end|>

    <|im_start|>assistant

    The planet is a large, rocky world with an atmosphere of 10 billion years old
    and a surface area around 25 million miles (36 million kilometers) wide.<|im_end|>

    <|im_start|>user

    What are some potential applications for quantum computing?<|im_end|>

    <|im_start|>assistant'
---
# Felladrin/TinyMistral-248M-SFT-v3-GGUF

Quantized GGUF model files for [TinyMistral-248M-SFT-v3](https://huggingface.co/Felladrin/TinyMistral-248M-SFT-v3) from [Felladrin](https://huggingface.co/Felladrin)

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [tinymistral-248m-sft-v3.fp16.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.fp16.gguf) | fp16 | 497.75 MB |
| [tinymistral-248m-sft-v3.q2_k.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.q2_k.gguf) | q2_k | 116.20 MB |
| [tinymistral-248m-sft-v3.q3_k_m.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.q3_k_m.gguf) | q3_k_m | 131.01 MB |
| [tinymistral-248m-sft-v3.q4_k_m.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.q4_k_m.gguf) | q4_k_m | 156.60 MB |
| [tinymistral-248m-sft-v3.q5_k_m.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.q5_k_m.gguf) | q5_k_m | 180.16 MB |
| [tinymistral-248m-sft-v3.q6_k.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.q6_k.gguf) | q6_k | 205.20 MB |
| [tinymistral-248m-sft-v3.q8_0.gguf](https://huggingface.co/afrideva/TinyMistral-248M-SFT-v3-GGUF/resolve/main/tinymistral-248m-sft-v3.q8_0.gguf) | q8_0 | 265.26 MB |

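A minimal sketch for trying one of these files locally, assuming `huggingface_hub` and [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) are installed; the chosen file and the sampling settings are only illustrative:

```python
# Minimal sketch: download one quantized file and run a ChatML-style prompt.
# Assumes `pip install huggingface_hub llama-cpp-python`; swap the filename
# for any other quant from the table above.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="afrideva/TinyMistral-248M-SFT-v3-GGUF",
    filename="tinymistral-248m-sft-v3.q4_k_m.gguf",
)

llm = Llama(model_path=model_path, n_ctx=2048)

prompt = (
    "<|im_start|>user\n"
    "What are some potential applications for quantum computing?<|im_end|>\n"
    "<|im_start|>assistant\n"
)
output = llm(prompt, max_tokens=256, stop=["<|im_end|>"])
print(output["choices"][0]["text"])
```

As a rule of thumb, the lower-bit quants (q2_k, q3_k_m) are smaller but lossier, while q8_0 stays closest to the fp16 file.
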
## Original Model Card:

# Locutusque's TinyMistral-248M trained on OpenAssistant TOP-1 Conversation Threads

- Base model: [Locutusque/TinyMistral-248M](https://huggingface.co/Locutusque/TinyMistral-248M/blob/90b89d18fdf27937dc04ab8a9b543c5af2991c7f/README.md)
- Dataset: [OpenAssistant/oasst_top1_2023-08-25](https://huggingface.co/datasets/OpenAssistant/oasst_top1_2023-08-25)

## Recommended Prompt Format

```
<|im_start|>user
{message}<|im_end|>
<|im_start|>assistant
```

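A minimal sketch of this format in use with `transformers` and the full-precision checkpoint (the GGUF files above need a llama.cpp-based runtime instead); the generation settings are assumptions, not recommendations from the original card:

```python
# Minimal sketch: wrap a user message in the recommended prompt format and
# generate with the full-precision checkpoint. Settings are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Felladrin/TinyMistral-248M-SFT-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

message = "Could you tell me something curious about the Earth?"
prompt = f"<|im_start|>user\n{message}<|im_end|>\n<|im_start|>assistant\n"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)

# Strip the prompt tokens and print only the newly generated reply.
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(reply)
```
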
## How it was trained

```ipython
%pip install autotrain-advanced

!autotrain setup

!autotrain llm \
--train \
--trainer "sft" \
--model './TinyMistral-248M/' \
--model_max_length 4096 \
--block-size 1024 \
--project-name 'trained-model' \
--data-path "OpenAssistant/oasst_top1_2023-08-25" \
--train_split "train" \
--valid_split "test" \
--text-column "text" \
--lr 1e-5 \
--train_batch_size 2 \
--epochs 5 \
--evaluation_strategy "steps" \
--save-strategy "steps" \
--save-total-limit 2 \
--warmup-ratio 0.05 \
--weight-decay 0.0 \
--gradient-accumulation 8 \
--logging-steps 10 \
--scheduler "constant"
```