Steelskull committed on
Commit
fe8e5fb
1 Parent(s): 37d50a6

Update README.md

  1. README.md +1 -26
README.md CHANGED
@@ -17,6 +17,7 @@ license: apache-2.0
17
  <div class="header">
18
  <h1>Aura-llama</h1> </div> <div class="info">
19
  <strong>there was an error in the first model config it has been fixed in the new repo <a href="https://huggingface.co/TheSkullery/Aura-llama-v1">TheSkullery/Aura-llama-v1</a> </strong>
 
20
  <img src="https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/QYpWMEXTe0_X3A7HyeBm0.webp" alt="Aura-llama image">
21
  <p>Now that the cute anime girl has your attention.</p>
22
  <p>Aura-llama is using the methodology presented by SOLAR for scaling LLMs called depth up-scaling (DUS), which encompasses architectural modifications with continued pretraining. Using the solar paper as a base, I integrated Llama-3 weights into the upscaled layers, and In the future plan to continue training the model.</p>
@@ -26,32 +27,6 @@ license: apache-2.0
26
  <li><a href="https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct">meta-llama/Meta-Llama-3-8B-Instruct</a></li>
27
  </ul>
28
  </div>
29
- <div class="update-section">
30
- <h2>Merged Evals (Has Not Been Finetuned, waiting for tests):</h2>
31
- <p>Aura-llama</p>
32
- <ul>
33
- <li>Avg: ?</li>
34
- <li>ARC: ?</li>
35
- <li>HellaSwag: ?</li>
36
- <li>MMLU: ?</li>
37
- <li>T-QA: ?</li>
38
- <li>Winogrande: ?</li>
39
- <li>GSM8K: ?</li>
40
- </ul>
41
- </div>
42
- <div class="update-section">
43
- <h2>🧩 Configuration</h2>
44
- <pre><code>
45
- slices:
46
- - sources:
47
- - model: meta-llama/Meta-Llama-3-8B-Instruct
48
- layer_range: [0, 23]
49
- - sources:
50
- - model: meta-llama/Meta-Llama-3-8B-Instruct
51
- layer_range: [7, 31]
52
- merge_method: passthrough
53
- dtype: bfloat16
54
- </code></pre>
55
  </div>
56
  </div>
57
  </body>
 
17
  <div class="header">
18
  <h1>Aura-llama</h1> </div> <div class="info">
19
  <strong>there was an error in the first model config it has been fixed in the new repo <a href="https://huggingface.co/TheSkullery/Aura-llama-v1">TheSkullery/Aura-llama-v1</a> </strong>
20
+ <p>^^^^^^^^^^^</p>
21
  <img src="https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/QYpWMEXTe0_X3A7HyeBm0.webp" alt="Aura-llama image">
22
  <p>Now that the cute anime girl has your attention.</p>
23
  <p>Aura-llama is using the methodology presented by SOLAR for scaling LLMs called depth up-scaling (DUS), which encompasses architectural modifications with continued pretraining. Using the solar paper as a base, I integrated Llama-3 weights into the upscaled layers, and In the future plan to continue training the model.</p>
 
27
  <li><a href="https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct">meta-llama/Meta-Llama-3-8B-Instruct</a></li>
28
  </ul>
29
  </div>
30
  </div>
31
  </div>
32
  </body>
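
Note on the removed configuration: it stacks two overlapping slices of meta-llama/Meta-Llama-3-8B-Instruct with the passthrough merge method, which is how depth up-scaling builds a deeper model from one source. The sketch below counts what those two layer_range values would produce. It assumes each range is half-open (start inclusive, end exclusive) and that the source model has 32 decoder layers; `stacked_layers` is a hypothetical helper written for illustration, not part of any merge tool.

```python
from typing import List, Tuple


def stacked_layers(slices: List[Tuple[int, int]]) -> List[int]:
    """Return source-layer indices in stacking order for a passthrough merge.

    Assumption: each (start, end) pair is half-open, covering start..end-1.
    """
    layers: List[int] = []
    for start, end in slices:
        layers.extend(range(start, end))
    return layers


# layer_range values copied from the configuration removed in this diff.
removed_config = [(0, 23), (7, 31)]
layers = stacked_layers(removed_config)

# Assumption: the source model has 32 decoder layers (indices 0..31).
SOURCE_LAYERS = 32
unused = sorted(set(range(SOURCE_LAYERS)) - set(layers))

print(len(layers))  # 47
print(unused)       # [31]
```

Under those assumptions the two slices give 47 stacked layers, and the last source layer (index 31) is never used.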