Commit 226438f by thephimart (parent: 28094bc): Update README.md
---
license: apache-2.0
---

This is a q5_K_M GGUF quantization of https://huggingface.co/s3nh/TinyLLama-4x1.1B-MoE.

I'm not sure how well it performs; this is also my first quantization, so fingers crossed.
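Since this is a GGUF file, it won't load through `transformers` directly. Below is a minimal sketch for running it locally with llama-cpp-python; the filename, `n_ctx`, and sampling values are my assumptions rather than anything from this card, and the prompt template mirrors the example usage further down:

```python
import os

# Assumed local filename for the q5_K_M quant -- adjust to the actual file.
MODEL_PATH = "TinyLLama-4x1.1B-MoE.q5_K_M.gguf"

def build_prompt(instruction: str) -> str:
    # Prompt template mirroring the example usage on this card.
    return f"###Input: {instruction}\n###Response:\n"

if os.path.exists(MODEL_PATH):
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(model_path=MODEL_PATH, n_ctx=2048)
    out = llm(build_prompt("You are a pirate. Tell me a story about a wrecked ship."),
              max_tokens=200, temperature=0.7)
    print(out["choices"][0]["text"])
```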
It is a Mixture of Experts model with https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0 as its base model.

The other three models in the merge are:

- https://huggingface.co/78health/TinyLlama_1.1B-function-calling
- https://huggingface.co/phanerozoic/Tiny-Pirate-1.1b-v0.1
- https://huggingface.co/Tensoic/TinyLlama-1.1B-3T-openhermes

I make no claims to any of the development; I simply wanted to try the model out, so I quantized it and thought I'd share it in case anyone else was feeling experimental.

-------

Model card from https://huggingface.co/s3nh/TinyLLama-4x1.1B-MoE:

Example usage:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # use "cpu" if no GPU is available

tokenizer = AutoTokenizer.from_pretrained("s3nh/TinyLLama-1.1B-MoE")
model = AutoModelForCausalLM.from_pretrained("s3nh/TinyLLama-1.1B-MoE").to(device)

input_text = """
###Input: You are a pirate. Tell me a story about a wrecked ship.
###Response:
"""

input_ids = tokenizer.encode(input_text, return_tensors="pt").to(device)
output = model.generate(inputs=input_ids,
                        max_length=256,
                        do_sample=True,
                        top_k=10,
                        temperature=0.7,
                        pad_token_id=tokenizer.eos_token_id,
                        attention_mask=input_ids.new_ones(input_ids.shape))
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

This model was made possible by the tremendous work of the mergekit developers. I decided to merge TinyLlama models to create a mixture of experts. The config used is below:

```yaml
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
experts:
  - source_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
    positive_prompts:
      - "chat"
      - "assistant"
      - "tell me"
      - "explain"
  - source_model: 78health/TinyLlama_1.1B-function-calling
    positive_prompts:
      - "code"
      - "python"
      - "javascript"
      - "programming"
      - "algorithm"
  - source_model: phanerozoic/Tiny-Pirate-1.1b-v0.1
    positive_prompts:
      - "storywriting"
      - "write"
      - "scene"
      - "story"
      - "character"
  - source_model: Tensoic/TinyLlama-1.1B-3T-openhermes
    positive_prompts:
      - "reason"
      - "provide"
      - "instruct"
      - "summarize"
      - "count"
```
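For context on what the `positive_prompts` do: mergekit's MoE mode uses them to initialize each expert's router gate (from hidden-state representations of the prompts, not literal string matching). Purely as a toy illustration of the routing idea, and not part of mergekit, here is a hypothetical keyword-overlap router over the same lists:

```python
# Toy gating sketch (hypothetical; real MoE routing is learned and per-token).
EXPERT_PROMPTS = {
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0": ["chat", "assistant", "tell me", "explain"],
    "78health/TinyLlama_1.1B-function-calling": ["code", "python", "javascript", "programming", "algorithm"],
    "phanerozoic/Tiny-Pirate-1.1b-v0.1": ["storywriting", "write", "scene", "story", "character"],
    "Tensoic/TinyLlama-1.1B-3T-openhermes": ["reason", "provide", "instruct", "summarize", "count"],
}

def route(prompt: str) -> str:
    # Score each expert by how many of its keywords appear in the prompt.
    text = prompt.lower()
    scores = {name: sum(kw in text for kw in kws)
              for name, kws in EXPERT_PROMPTS.items()}
    return max(scores, key=scores.get)

print(route("write a story about a wrecked ship"))  # phanerozoic/Tiny-Pirate-1.1b-v0.1
```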