---
license: cc-by-nc-4.0
---
# 🌞🚀 SOLAR-math-10.7x2_19B

A merge of two SOLAR-10.7B instruct finetunes.

![solar](solar-2.png)

Runs in 13 GB of VRAM when loaded in 4-bit.

## 🌅 Code Example

The example below is also available in [Colab](https://colab.research.google.com/drive/10FWCLODU_EFclVOFOlxNYMmSiLilGMBZ?usp=sharing).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer


def generate_response(prompt):
    """
    Generate a response from the model based on the input prompt.

    Args:
        prompt (str): Prompt for the model.

    Returns:
        str: The generated response from the model.
    """
    # Tokenize the input prompt and move it to the model's device
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Generate output tokens
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )

    # Decode the generated tokens to a string
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    return response


# Load the model and tokenizer (4-bit quantized)
model_id = "macadeliccc/SOLAR-math-2x10.7B-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)

prompt = "Explain the proof of Fermat's Last Theorem and its implications in number theory."

print("Response:")
print(generate_response(prompt), "\n")
```
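
The snippet above uses the `load_in_4bit=True` shortcut. Newer `transformers` releases prefer an explicit `BitsAndBytesConfig`; the following is a minimal sketch of the equivalent 4-bit load, assuming `bitsandbytes` and `accelerate` are installed and a CUDA GPU is available (the NF4/bfloat16 settings are illustrative choices, not part of the original card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "macadeliccc/SOLAR-math-2x10.7B-v0.2"

# Explicit 4-bit quantization settings (NF4 weights, bfloat16 compute)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # let accelerate place the quantized weights on the GPU
)
```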
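
The card does not document a prompt format. If the merged tokenizer ships a chat template (an assumption, not something stated here), formatting the prompt with `apply_chat_template` is a safer default than passing raw text. A sketch reusing the `tokenizer` and `generate_response` from the snippet above:

```python
# Hypothetical usage: only valid if tokenizer.chat_template is set for this model
messages = [
    {"role": "user", "content": "Explain the proof of Fermat's Last Theorem and its implications in number theory."}
]

# Build the instruct-formatted prompt string, ending with the assistant turn
formatted_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

print(generate_response(formatted_prompt))
```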

## Evaluations

| Tasks         | Version | Filter | n-shot | Metric   | Value  |   | Stderr |
|---------------|---------|--------|-------:|----------|-------:|---|-------:|
| arc_challenge | Yaml    | none   |      0 | acc      | 0.6067 | ± | 0.0143 |
|               |         | none   |      0 | acc_norm | 0.6263 | ± | 0.0141 |
| arc_easy      | Yaml    | none   |      0 | acc      | 0.8211 | ± | 0.0079 |
|               |         | none   |      0 | acc_norm | 0.8001 | ± | 0.0082 |
| boolq         | Yaml    | none   |      0 | acc      | 0.8557 | ± | 0.0061 |
| hellaswag     | Yaml    | none   |      0 | acc      | 0.6695 | ± | 0.0047 |
|               |         | none   |      0 | acc_norm | 0.8484 | ± | 0.0036 |
| openbookqa    | Yaml    | none   |      0 | acc      | 0.3420 | ± | 0.0212 |
|               |         | none   |      0 | acc_norm | 0.4520 | ± | 0.0223 |
| piqa          | Yaml    | none   |      0 | acc      | 0.7949 | ± | 0.0094 |
|               |         | none   |      0 | acc_norm | 0.8058 | ± | 0.0092 |
| winogrande    | Yaml    | none   |      0 | acc      | 0.7372 | ± | 0.0124 |
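
The table follows the zero-shot output format of lm-evaluation-harness. A rough sketch of how comparable numbers could be reproduced with the harness's Python API, assuming lm-eval ≥ 0.4 is installed (the batch size and 4-bit flag are illustrative, not necessarily the settings used for the table above):

```python
from lm_eval import evaluator

# Zero-shot evaluation of the merged model on the tasks listed above
results = evaluator.simple_evaluate(
    model="hf",
    model_args="pretrained=macadeliccc/SOLAR-math-2x10.7B-v0.2,load_in_4bit=True",
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag", "openbookqa", "piqa", "winogrande"],
    num_fewshot=0,
    batch_size=8,
)

# Print the per-task metrics (acc, acc_norm, stderr, ...)
for task, metrics in results["results"].items():
    print(task, metrics)
```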

### 📚 Citations

```bibtex
@misc{kim2023solar,
      title={SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling},
      author={Dahyun Kim and Chanjun Park and Sanghoon Kim and Wonsung Lee and Wonho Song and Yunsu Kim and Hyeonwoo Kim and Yungi Kim and Hyeonju Lee and Jihoo Kim and Changbae Ahn and Seonghoon Yang and Sukyung Lee and Hyunbyung Park and Gyoungjin Gim and Mikyoung Cha and Hwalsuk Lee and Sunghun Kim},
      year={2023},
      eprint={2312.15166},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```