TheBloke committed
Commit 202016c · 1 Parent(s): 934867b

Update README.md

Files changed (1):
  1. README.md +35 -20
README.md CHANGED
@@ -8,17 +8,19 @@ tags:
 - storywriting
 ---
 
+<!-- header start -->
 <div style="width: 100%;">
 <img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
 </div>
 <div style="display: flex; justify-content: space-between; width: 100%;">
 <div style="display: flex; flex-direction: column; align-items: flex-start;">
-<p><a href="https://discord.gg/UBgz4VXf">Chat & support: my new Discord server</a></p>
+<p><a href="https://discord.gg/Jq4vkcDakD">Chat & support: my new Discord server</a></p>
 </div>
 <div style="display: flex; flex-direction: column; align-items: flex-end;">
-<p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? Patreon coming soon!</a></p>
+<p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
 </div>
 </div>
+<!-- header end -->
 
 # Elinas' Chronos 13B GGML
 
@@ -71,17 +73,30 @@ If you want to have a chat-style conversation, replace the `-p <PROMPT>` argumen
 
 Further instructions here: [text-generation-webui/docs/llama.cpp-models.md](https://github.com/oobabooga/text-generation-webui/blob/main/docs/llama.cpp-models.md).
 
-## Want to support my work?
+<!-- footer start -->
+## Discord
 
-I've had a lot of people ask if they can contribute. I love providing models and helping people, but it is starting to rack up pretty big cloud computing bills.
+For further support, and discussions on these models and AI in general, join us at:
 
-So if you're able and willing to contribute, it'd be most gratefully received and will help me to keep providing models, and work on various AI projects.
+[TheBloke AI's Discord server](https://discord.gg/Jq4vkcDakD)
 
-Donaters will get priority support on any and all AI/LLM/model questions, and I'll gladly quantise any model you'd like to try.
+## Thanks, and how to contribute.
 
-* Patreon: coming soon! (just awaiting approval)
+Thanks to the [chirper.ai](https://chirper.ai) team!
+
+I've had a lot of people ask if they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine tuning/training.
+
+If you're able and willing to contribute it will be most gratefully received and will help me to keep providing more models, and to start work on new AI projects.
+
+Donaters will get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, plus other benefits.
+
+* Patreon: https://patreon.com/TheBlokeAI
 * Ko-Fi: https://ko-fi.com/TheBlokeAI
-* Discord: https://discord.gg/UBgz4VXf
+
+**Patreon special mentions**: Aemon Algiz, Dmitriy Samsonov, Nathan LeClaire, Trenton Dambrowitz, Mano Prime, David Flickinger, vamX, Nikolai Manek, senxiiz, Khalefa Al-Ahmad, Illia Dulskyi, Jonathan Leane, Talal Aujan, V. Lukas, Joseph William Delisle, Pyrater, Oscar Rangel, Lone Striker, Luke Pendergrass, Eugene Pentland, Sebastain Graf, Johann-Peter Hartman.
+
+Thank you to all my generous patrons and donaters!
+<!-- footer end -->
 
 # Original model card: Chronos 13B
 
@@ -172,11 +187,11 @@ Hyperparameters for the model architecture
 </tr>
 <tr>
 <th>Number of parameters</th><th>dimension</th><th>n heads</th><th>n layers</th><th>Learn rate</th><th>Batch size</th><th>n tokens</th>
-</tr>
+</tr>
 </thead>
-<tbody>
+<tbody>
 <tr>
-<th>7B</th> <th>4096</th> <th>32</th> <th>32</th> <th>3.0E-04</th><th>4M</th><th>1T
+<th>7B</th> <th>4096</th> <th>32</th> <th>32</th> <th>3.0E-04</th><th>4M</th><th>1T
 </tr>
 <tr>
 <th>13B</th><th>5120</th><th>40</th><th>40</th><th>3.0E-04</th><th>4M</th><th>1T
@@ -186,13 +201,13 @@ Hyperparameters for the model architecture
 </tr>
 <tr>
 <th>65B</th><th>8192</th><th>64</th><th>80</th><th>1.5.E-04</th><th>4M</th><th>1.4T
-</tr>
+</tr>
 </tbody>
 </table>
 
 *Table 1 - Summary of LLama Model Hyperparameters*
 
-We present our results on eight standard common sense reasoning benchmarks in the table below.
+We present our results on eight standard common sense reasoning benchmarks in the table below.
 <table>
 <thead>
 <tr>
@@ -200,23 +215,23 @@ We present our results on eight standard common sense reasoning benchmarks in th
 </tr>
 <tr>
 <th>Number of parameters</th> <th>BoolQ</th><th>PIQA</th><th>SIQA</th><th>HellaSwag</th><th>WinoGrande</th><th>ARC-e</th><th>ARC-c</th><th>OBQA</th><th>COPA</th>
-</tr>
+</tr>
 </thead>
-<tbody>
-<tr>
+<tbody>
+<tr>
 <th>7B</th><th>76.5</th><th>79.8</th><th>48.9</th><th>76.1</th><th>70.1</th><th>76.7</th><th>47.6</th><th>57.2</th><th>93
-</th>
+</th>
 <tr><th>13B</th><th>78.1</th><th>80.1</th><th>50.4</th><th>79.2</th><th>73</th><th>78.1</th><th>52.7</th><th>56.4</th><th>94
 </th>
 <tr><th>33B</th><th>83.1</th><th>82.3</th><th>50.4</th><th>82.8</th><th>76</th><th>81.4</th><th>57.8</th><th>58.6</th><th>92
 </th>
-<tr><th>65B</th><th>85.3</th><th>82.8</th><th>52.3</th><th>84.2</th><th>77</th><th>81.5</th><th>56</th><th>60.2</th><th>94</th></tr>
+<tr><th>65B</th><th>85.3</th><th>82.8</th><th>52.3</th><th>84.2</th><th>77</th><th>81.5</th><th>56</th><th>60.2</th><th>94</th></tr>
 </tbody>
 </table>
 *Table 2 - Summary of LLama Model Performance on Reasoning tasks*
 
 
-We present our results on bias in the table below. Note that lower value is better indicating lower bias.
+We present our results on bias in the table below. Note that lower value is better indicating lower bias.
 
 
 | No | Category | FAIR LLM |
@@ -250,4 +265,4 @@ We filtered the data from the Web based on its proximity to Wikipedia text and r
 Risks and harms of large language models include the generation of harmful, offensive or biased content. These models are often prone to generating incorrect information, sometimes referred to as hallucinations. We do not expect our model to be an exception in this regard.
 
 **Use cases**
-LLaMA is a foundational model, and as such, it should not be used for downstream applications without further investigation and mitigations of risks. These risks and potential fraught use cases include, but are not limited to: generation of misinformation and generation of harmful, biased or offensive content.
+LLaMA is a foundational model, and as such, it should not be used for downstream applications without further investigation and mitigations of risks. These risks and potential fraught use cases include, but are not limited to: generation of misinformation and generation of harmful, biased or offensive content.