eryk-mazus committed on
Commit edee2d8
1 parent: 75af1c8

Create README.md

Files changed (1): README.md (+44, −0)
---
base_model: eryk-mazus/polka-1.1b-chat
inference: false
language:
- pl
license: apache-2.0
model_name: Polka-1.1B-Chat
model_type: tinyllama
model_creator: Eryk Mazuś
prompt_template: '<|im_start|>system

{system_message}<|im_end|>

<|im_start|>user

{prompt}<|im_end|>

<|im_start|>assistant

'
---

## Prompt template: ChatML

```
<|im_start|>system
Jesteś pomocnym asystentem.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant

```
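The template above can be filled in programmatically before handing the string to an inference backend. A minimal sketch in Python (the function name and its default system message are illustrative, not part of the model card):

```python
# Sketch: build a single-turn ChatML prompt for polka-1.1b-chat.
# The function name and default system message are illustrative assumptions;
# only the ChatML tags themselves come from the model card.

def build_chatml_prompt(prompt: str,
                        system_message: str = "Jesteś pomocnym asystentem.") -> str:
    """Format one user turn in ChatML, leaving the assistant turn open."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(build_chatml_prompt("Jak się masz?"))
```

The trailing open `<|im_start|>assistant` turn is what cues the model to generate the reply.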
## Example `llama.cpp` command

```shell
./main -m ./polka-1.1b-chat-gguf/polka-1.1b-chat-Q8_0.gguf --color -c 2048 --temp 0.2 --repeat_penalty 1.1 -n -1 -p "<|im_start|>system\nJesteś pomocnym asystentem.<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant"
```
Add `-ngl 32` to offload 32 layers to the GPU, adjusting the number to match your hardware. Omit it if you don't have GPU acceleration.
Change `-c 2048` to the desired sequence length. For extended-sequence models - e.g. 8K, 16K, 32K - the necessary RoPE scaling parameters are read from the GGUF file and set by llama.cpp automatically.
If you want a chat-style conversation, replace the `-p <PROMPT>` argument with `-i -ins`.
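For scripted multi-turn use, the same ChatML template extends naturally to a conversation history. A hedged sketch (helper names are illustrative; only the ChatML tags come from the model card):

```python
# Sketch: render an accumulated multi-turn ChatML conversation for
# polka-1.1b-chat. Helper and variable names are illustrative assumptions.

def render_conversation(system_message: str,
                        turns: list[tuple[str, str]]) -> str:
    """Render (role, text) turns as ChatML and open a new assistant turn."""
    parts = [f"<|im_start|>system\n{system_message}<|im_end|>"]
    for role, text in turns:  # role is "user" or "assistant"
        parts.append(f"<|im_start|>{role}\n{text}<|im_end|>")
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

history = [
    ("user", "Cześć!"),
    ("assistant", "Cześć, w czym mogę pomóc?"),
    ("user", "Opowiedz dowcip."),
]
print(render_conversation("Jesteś pomocnym asystentem.", history))
```

After each generated reply you would append it to `history` as an `assistant` turn and re-render before the next request.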