File size: 3,675 Bytes
8f09dbf
 
5a79ee3
 
81d0821
5a79ee3
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
360856c
5a79ee3
 
 
 
 
 
0fe4674
8f09dbf
 
1281fc9
8f09dbf
b13e7eb
cb3c142
8f477f2
cb3c142
 
 
 
b13e7eb
8f09dbf
 
6ce607a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
22ea382
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
93b938c
22ea382
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
6ce607a
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
---
library_name: transformers
tags:
- merge
language:
- en
- es
- ru
- zh
- de
- fr
- th
- ca
- it
- ja
- pl
- eo
- eu
- vi
- fi
- hu
- ar
- nl
- da
- tr
- ko
- he
- id
- cs
- bn
- sv
widget:
- text: |
    <|im_start|>system
    You are a helpful AI assistant.<|im_end|>
    <|im_start|>user
    podrias escribir un codigo de ejemplo en Python<|im_end|>
    <|im_start|>assistant
license: apache-2.0
---

# Model Card for Model MixLlama


<!-- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/641b435ba5f876fe30c5ae0a/d4yUGFC5XZz41aA3_-kGC.png)  -->

<!-- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/641b435ba5f876fe30c5ae0a/mZx6OGCHfm92udQfNFcGD.png)  -->


![image/png](https://cdn-uploads.huggingface.co/production/uploads/641b435ba5f876fe30c5ae0a/CW8JrvB58GSt_6B5XPcGZ.png)

<!-- Provide a quick summary of what the model is/does. -->

```Python
experts:
  - source_model: NickyNicky/TinyDolphin-2.8-1.1b_oasst2_chatML_Cluster_1_V1
    positive_prompts:
      - ""

  - source_model: NickyNicky/TinyDolphin-2.8-1.1b_oasst2_chatML_Cluster_2_V1
    positive_prompts:
      - ""

  - source_model: NickyNicky/TinyDolphin-2.8-1.1b_oasst2_chatML_Cluster_3_V1
    positive_prompts:
      - ""

base_model: NickyNicky/TinyDolphin-2.8-1.1b_oasst2_chatML_Cluster_1_V1
gate_mode: random # one of "hidden", "cheap_embed", or "random"
dtype: bfloat16 # output dtype (float32, float16, or bfloat16)
```






```Python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    HfArgumentParser,
    TrainingArguments,
    pipeline,
    logging,
    GenerationConfig,
    TextIteratorStreamer,
)
import torch

new_model= "NickyNicky/Mix_TinyLlama-3x1B_oasst2_chatML_Cluster_3_2_1_V1"
model = AutoModelForCausalLM.from_pretrained(#f'NickyNicky/{new_model}',
                                             new_model,
                                             device_map="auto",
                                             trust_remote_code=True,
                                             torch_dtype=torch.bfloat16,

                                             low_cpu_mem_usage= True,
                                            #  use_flash_attention_2=False,

                                             )


tokenizer = AutoTokenizer.from_pretrained(new_model,
                                          max_length=2048,
                                          trust_remote_code=True,
                                          use_fast = True,
                                          )

tokenizer.pad_token = tokenizer.eos_token
# tokenizer.padding_side = 'left'
tokenizer.padding_side = 'right'


prompt= """<|im_start|>system
You are a helpful AI assistant.<|im_end|>
<|im_start|>user
escribe una historia de amor.<|im_end|>
<|im_start|>assistant
"""

inputs = tokenizer.encode(prompt,
                          return_tensors="pt",
                          add_special_tokens=False).cuda()#.to("cuda") # False # True


generation_config = GenerationConfig(
              max_new_tokens=700,
              temperature=0.5,
              top_p=0.9,
              top_k=40,
              repetition_penalty=1.1, #1.1, # 1.0 means no penalty, > 1.0 means penalty, 1.2 from CTRL paper
              do_sample=True,
              pad_token_id=tokenizer.eos_token_id,
              eos_token_id=tokenizer.eos_token_id,
          )
outputs = model.generate(
                         generation_config=generation_config,
                         input_ids=inputs,)
# tokenizer.decode(outputs[0], skip_special_tokens=False) #True
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```