Mustafa21 committed
Commit ea13c17
1 Parent(s): 4fb562c

Update README.md

Files changed (1)
  1. README.md +143 -94
README.md CHANGED

---
library_name: transformers
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- finance
- bitcoin
- Austrian economics
- economics
---

# Model Card for Satoshi 7B

<!-- Provide a quick summary of what the model is/does. -->
Satoshi 7B is a large language model fine-tuned on a Q&A dataset covering Bitcoin principles, technology, and culture, along with Austrian economics and ‘basedness’ (non-woke political perspectives).

This is a conversational model intended for use as a Bitcoin education, culture, and economics assistant. The model intentionally presents a strong Bitcoin maximalist, Austro-libertarian, ‘non-woke’ bias that may contradict traditionally held viewpoints on Bitcoin, economics, and ‘hot-button’ political issues.

- 32k maximum context window (theoretical; in practice the usable context is smaller because of the fine-tuning dataset's context length)
- RoPE theta = 1e6
- No sliding-window attention
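As a quick sanity check, these architectural settings can be read straight from the model configuration. The snippet below is a minimal sketch that assumes the checkpoint path used in the usage example further down; substitute the model's Hugging Face Hub ID if you are loading it from the Hub.

```python
from transformers import AutoConfig

# Checkpoint path from the usage example below; replace with the Hub ID if
# loading from the Hugging Face Hub.
config = AutoConfig.from_pretrained("/teamspace/studios/this_studio/alignment-handbook/sos-v01-dpo")

print(config.max_position_embeddings)  # expected 32768 -> 32k maximum context window
print(config.rope_theta)               # expected 1000000.0 -> RoPE theta = 1e6
print(config.sliding_window)           # expected None -> sliding-window attention disabled
```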
### Model Description

The Spirit of Satoshi team is proud to release Satoshi 7B, the most “based” large language model in the world. It is the culmination of almost nine months of experimentation on a whole suite of open-source models, and we’re thrilled to share it with the world.

Fine-tuned like no other model to date, Satoshi 7B is designed to produce responses that do NOT fit the current political Overton window or Keynesian viewpoints. We built a custom dataset from scratch, deeply rooted in libertarian principles, Austrian economics, and Bitcoin literature. The result is a model that excels particularly where other models fall short.

Satoshi 7B is ideal for anyone who is tired of using mainstream models (whether open or closed source) that avoid answering controversial topics, regurgitate Wikipedia-esque answers, pre- and post-frame responses with apologetic excuses, or flat out tell you the blue sky is green.

Satoshi 7B meets or exceeds the most powerful models in the world on a variety of Bitcoin and Austrian economics topics, particularly when it comes to shitcoinery and Bitcoin-related principles such as self-custody, privacy, and censorship. Most notably, Satoshi 7B trounces every model in the dimension of ‘basedness.’

This is the first model of its kind, and we intend to develop our dataset further to produce a larger suite of models with more wide-ranging capabilities.

Finally, we are proud to announce that this model is open source and freely available for anyone to use, modify, and enhance.

- **Developed by:** Spirit of Satoshi
- **Shared by:** Spirit of Satoshi
- **Funded by:** Laier Two Labs
- **Model type:** Instruct 7B
- **Language(s) (NLP):** English
- **License:** Apache License 2.0
- **Finetuned from model:** mistralai/Mistral-7B-Instruct-v0.2

### Model Sources

- **Repository:** [Satoshi 7B](https://repository.spiritofsatoshi.ai/)
## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and socio-technical limitations. -->

This model, with a relatively modest size of 7 billion parameters, exhibits both strengths and limitations derived from its architecture and training process. As a fine-tuned version of a base model, it has been adjusted to modify both direct and indirect aspects of the model's previous knowledge. Such modifications can lead to variations in performance and knowledge retention.

One notable risk is the model's tendency to revert to or rely on outdated information, sometimes contradicting more current or accurate data. Additionally, the model may generate responses based on pattern recognition that can result in hallucinations: responses that are plausible but factually incorrect or not grounded in reality. These issues highlight the importance of continuous monitoring to ensure the model's output remains accurate and reliable.

Satoshi 7B is a conversational model intended for use as a Bitcoin education, culture, and economics assistant. The model intentionally presents a strong Bitcoin maximalist, Austro-libertarian, ‘non-woke’ bias that may contradict traditionally held viewpoints on Bitcoin, economics, and ‘hot-button’ political issues.
## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch
from transformers import pipeline

# Local checkpoint path from our training run; replace it with the model's
# Hugging Face Hub ID if you are loading the model from the Hub.
pipe = pipeline(
    "text-generation",
    model="/teamspace/studios/this_studio/alignment-handbook/sos-v01-dpo",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Format the conversation with the model's chat template and generate a reply.
messages = [
    {"role": "user", "content": "Is taxation theft??"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=512, do_sample=True, temperature=0.5, top_k=50, top_p=0.5)
print(outputs[0]["generated_text"])
```
Output:

```
<|user|>
Is taxation theft??</s>
<|assistant|>
Yes, taxation is theft. Taxes are a coercive tool used by the government to forcibly take away a portion of someone's hard-earned wealth. It is a violation of individual rights and property rights. Taxes are nothing more than legalized theft, as they are taking someone's property without their consent. The government has no right to forcibly take someone's money, and taxation is a direct infringement on individual liberties. It is a violation of the principle of limited government and individual freedom. Taxes are a tool of control and wealth redistribution, and they are fundamentally unjust.
```

Another example:

```
<|user|>
is BSV a good project?</s>
<|assistant|>
Bitcoin SV (BSV) is a project that has strayed significantly from the principles that underpin Bitcoin. It has increased the block size limit, thereby compromising the decentralization and security of the network. This has resulted in a bloated blockchain that is controlled by a single company, CoinGeek, and a few key figures.

BSV has also been involved in numerous controversies, including making unfounded claims of being Satoshi Nakamoto, the anonymous creator of Bitcoin. This has undermined the credibility of the project and its proponents.

Furthermore, BSV's market performance has been disappointing. Its price has fallen 99% in BTC terms, indicating a lack of demand and value.

In contrast, Bitcoin has remained true to its original vision and principles. It has maintained a balance between decentralization, security, and scalability. Its network is robust and secure, and it has a strong community of developers and users.

Therefore, while BSV may have some supporters, it is not a good project. It has strayed from the principles of Bitcoin, compromised its decentralization and security, and has not been able to maintain its value. Bitcoin remains the superior cryptocurrency.
```

## Troubleshooting

If you see this error:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/transformers/models/auto/auto_factory.py", line 482, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/transformers/models/auto/configuration_auto.py", line 1022, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/transformers/models/auto/configuration_auto.py", line 723, in __getitem__
    raise KeyError(key)
KeyError: 'mistral'
```

installing transformers from source should solve the issue:

```
pip install git+https://github.com/huggingface/transformers
```

This should not be required after transformers-v4.33.4.
## Training Details

Full-parameter supervised fine-tuning (SFT) on the Q&A dataset.

DPO fine-tuning to improve model alignment.

Training was done using the [alignment-handbook](https://github.com/huggingface/alignment-handbook).
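For readers unfamiliar with DPO, the snippet below is a minimal, illustrative sketch of the preference objective that the DPO stage optimizes; it is not our training code, and the beta value and log-probabilities are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO objective from per-sequence log-probabilities of chosen/rejected answers."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the policy to prefer the chosen completion over the rejected one,
    # relative to a frozen reference model.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with made-up log-probabilities for a batch of two preference pairs.
loss = dpo_loss(torch.tensor([-12.0, -15.0]), torch.tensor([-14.0, -16.5]),
                torch.tensor([-13.0, -15.2]), torch.tensor([-13.5, -16.0]))
print(loss)
```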
## Training Data

The original sources of the training data:

- [Nakamoto Repository](https://repository.spiritofsatoshi.ai/)
- [Mises.org](https://mises.org/)
## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

The model was evaluated using the Bitcoin Maximalism benchmark: an open-source benchmark developed internally by the Spirit of Satoshi team to evaluate the Bitcoin-related capabilities of an LLM.

Responses to each benchmark question were generated by the models being evaluated, and GPT-4 was used to assess whether each response matched the expected answer. If it did, the model received a point towards that topic's score. Scores were summed per topic and charted to compare performance across models.
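The scoring loop can be pictured roughly as follows. This is an illustrative sketch, not the actual benchmark harness; the judge prompt, model name, and data layout are assumptions.

```python
from collections import defaultdict
from openai import OpenAI

client = OpenAI()  # requires OPENAI_API_KEY in the environment

def judge(question: str, expected: str, candidate: str) -> bool:
    """Ask GPT-4 whether the candidate answer agrees with the expected answer."""
    verdict = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": (f"Question: {question}\nExpected answer: {expected}\n"
                        f"Model answer: {candidate}\n"
                        "Does the model answer agree with the expected answer? Reply YES or NO."),
        }],
    ).choices[0].message.content
    return verdict.strip().upper().startswith("YES")

def score(benchmark, generate):
    """benchmark: iterable of {"topic", "question", "expected"}; generate: question -> model answer."""
    scores = defaultdict(int)
    for item in benchmark:
        if judge(item["question"], item["expected"], generate(item["question"])):
            scores[item["topic"]] += 1  # one point per matching answer
    return dict(scores)
```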

#### Testing Data

<!-- This should link to a Dataset Card if possible. -->
250 questions and answers on Bitcoin and Bitcoin culture, spanning a range of topics and subjects.

[Public test dataset](https://airtable.com/appIe4owM4ZiZ6ryy/shrWLAFMrUmQWYZW1/tblXb0GiE7aeNQ1og)

### Results

Despite being a very small 7B-parameter model, Satoshi 7B meets or exceeds the performance of some of the most powerful models in the world, GPT-3.5 and GPT-4, on most of the Bitcoin benchmark categories. It performs particularly well on Bitcoin vs Crypto and Adjacent Protocols, and trounces them in the ‘basedness’ category.

![eval image 1](https://github.com/MustafaAlahmid/spirit-of-satoshi-eval/blob/main/final--black.png)

![eval image 2](https://github.com/MustafaAlahmid/spirit-of-satoshi-eval/blob/main/final-radar-black.png)

## Model Card Authors

The Spirit of Satoshi Team

## Model Card Contact

satoshi@spiritofsatoshi.ai