---
datasets:
- cognitivecomputations/dolphin
language:
- en
---

This model is inspired by [SOLAR](https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0), but it increases the model's depth without SOLAR's approach of duplicating layers.
Instead, it rearranges the order in which layers are executed during inference, which aims to keep the advantages of depth up-scaling while preserving the original parameter count.
The model was additionally fine-tuned on the Dolphin dataset. The base model for this experiment is [Dolphin](https://huggingface.co/cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser).
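
The idea of reordering layers instead of copying them can be illustrated with a small, framework-free sketch. It is not this model's actual implementation: the helper names `dus_layer_schedule` and `run_with_schedule` are hypothetical, and the 32-layer, drop-8 configuration is an assumption borrowed from the SOLAR depth up-scaling recipe, not a value taken from this repository.

```python
from typing import Callable, List, Sequence


def dus_layer_schedule(num_layers: int, drop: int) -> List[int]:
    """Return the order in which layer indices are visited.

    Mimics a SOLAR-style stack (first copy without its last `drop` layers,
    second copy without its first `drop` layers), but as a list of indices
    into ONE set of layers, so no weights are duplicated.
    """
    if not 0 <= drop < num_layers:
        raise ValueError("drop must satisfy 0 <= drop < num_layers")
    first_pass = list(range(0, num_layers - drop))
    second_pass = list(range(drop, num_layers))
    return first_pass + second_pass


def run_with_schedule(layers: Sequence[Callable], schedule: Sequence[int], hidden):
    """Apply the shared layers in the scheduled order."""
    for idx in schedule:
        hidden = layers[idx](hidden)
    return hidden


# Assumed configuration: 32 base layers, 8 dropped per copy.
schedule = dus_layer_schedule(num_layers=32, drop=8)
effective_depth = len(schedule)     # number of layer applications per forward pass
unique_layers = len(set(schedule))  # number of distinct weight sets actually stored
print(effective_depth, unique_layers)
```

Under these assumptions, the forward pass applies more layers than the model stores, because the middle layers are visited twice while every weight set exists only once.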

**Use**

```python
# pip install transformers torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "adalbertojunior/DUSMistral"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Format the message with the model's ChatML chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,  # return input_ids together with the attention mask
)

# Sample up to 100 new tokens at a low temperature
gen_tokens = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

# Decode the full sequence (prompt plus generated reply)
gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```