---
license: apache-2.0
---

# **Meet 10.7B Solar: Elevating Performance with Upstage Depth Up-Scaling!**

# **Introduction**

We introduce the first 10.7 billion (B) parameter model, SOLAR-10.7B. It is compact yet remarkably powerful, and it demonstrates state-of-the-art performance among models with fewer than 30B parameters.

SOLAR-10.7B is built on the Llama 2 architecture and incorporates our Depth Up-Scaling technique: we scale the model up in depth, integrate Mistral 7B weights into the upscaled layers, and then continue pre-training the entire model.

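This README does not spell out the mechanics of Depth Up-Scaling, so the following is only a minimal sketch of one duplicate-and-splice formulation: two copies of the base model's layer stack are trimmed at their overlap and concatenated before continued pre-training. The layer counts (`n=32`, `m=8`) and the function itself are illustrative assumptions, not Upstage's confirmed recipe.

```python
import copy
from typing import List

def depth_up_scale(layers: List, m: int = 8) -> List:
    """Sketch: splice two copies of an n-layer stack into a 2*(n-m)-layer stack."""
    n = len(layers)
    first = copy.deepcopy(layers[: n - m])  # copy 1 without its last m layers
    second = copy.deepcopy(layers[m:])      # copy 2 without its first m layers
    return first + second                   # 2 * (n - m) layers total

# e.g. an assumed 32-layer base model -> 48 upscaled layers,
# which are then pre-trained further as a single model
base_layers = [f"layer_{i}" for i in range(32)]
upscaled = depth_up_scale(base_layers, m=8)
assert len(upscaled) == 48
```
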
Depth-Upscaled SOLAR-10.7B delivers remarkable performance: it outperforms models with up to 30B parameters and even surpasses the recent Mixtral 8x7B model. For details, please refer to the experimental table ([link to be updated soon]).

SOLAR-10.7B is also an ideal choice for fine-tuning, offering the robustness and adaptability a strong base model needs. Our simple instruction fine-tuning of the pre-trained model yields significant performance improvements. [[link to be updated soon]]

# **Usage Instructions**

This model is pre-trained only, so on its own it simply continues text rather than following instructions. To use it for chat, you must fine-tune it first (a minimal sketch appears at the end of this README).

### **Version**

Make sure you have the correct version of the transformers library installed:

```sh
pip install transformers==4.35.2
```

### **Loading the Model**

Use the following Python code to load the model. Note that `device_map="auto"` requires the `accelerate` package (`pip install accelerate`):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Upstage/SOLAR-10.7B-v1.0")
model = AutoModelForCausalLM.from_pretrained(
    "Upstage/SOLAR-10.7B-v1.0",
    device_map="auto",          # place layers on available devices automatically
    torch_dtype=torch.float16,  # load weights in half precision to save memory
)
```

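If GPU memory is tight, a quantized load can help. This optional variant is our own suggestion rather than part of Upstage's instructions, and it assumes the `bitsandbytes` package is installed:

```python
# optional lower-memory alternative: 4-bit quantized load
# (an assumption on our part; requires: pip install bitsandbytes)
model = AutoModelForCausalLM.from_pretrained(
    "Upstage/SOLAR-10.7B-v1.0",
    device_map="auto",
    load_in_4bit=True,
)
```
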
### **Generating Text**

To generate text, use the following Python code:

```python
text = "Hi, my name is "
inputs = tokenizer(text, return_tensors="pt")

# generate up to 64 new tokens continuing the prompt
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
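
As noted above, the base model must be instruction fine-tuned before it can chat. The following is a minimal sketch of supervised instruction fine-tuning using the Hugging Face `Trainer`; the toy examples, the prompt template, and the hyperparameters are illustrative assumptions, not Upstage's actual fine-tuning recipe.

```python
from torch.utils.data import Dataset
from transformers import Trainer, TrainingArguments

# `model` and `tokenizer` are the objects loaded above
tokenizer.pad_token = tokenizer.eos_token  # Llama-style tokenizers have no pad token

# toy instruction/response pairs -- purely illustrative
examples = [
    {"instruction": "Name the largest planet.", "response": "Jupiter."},
    {"instruction": "What is 2 + 2?", "response": "4."},
]

class SFTDataset(Dataset):
    """Wraps instruction/response pairs as next-token-prediction targets."""

    def __init__(self, examples, tokenizer, max_length=512):
        self.items = []
        for ex in examples:
            # hypothetical prompt template; SOLAR's actual template may differ
            text = f"### User:\n{ex['instruction']}\n\n### Assistant:\n{ex['response']}"
            enc = tokenizer(text, truncation=True, max_length=max_length,
                            padding="max_length", return_tensors="pt")
            labels = enc["input_ids"][0].clone()
            labels[enc["attention_mask"][0] == 0] = -100  # ignore padding in the loss
            self.items.append({
                "input_ids": enc["input_ids"][0],
                "attention_mask": enc["attention_mask"][0],
                "labels": labels,
            })

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        return self.items[idx]

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="solar-sft",  # hypothetical output directory
        per_device_train_batch_size=1,
        num_train_epochs=1,
        learning_rate=2e-5,
    ),
    train_dataset=SFTDataset(examples, tokenizer),
)
trainer.train()
```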