ValiantLabs/Llama3.1-8B-ShiningValiant2

[WIP] Upload folder using huggingface_hub (multi-commit eb71ae4c1e718d1dbc1682c4264be4bb3dc7dc948df9919b218bddc3bd69313f)

by sequelbox - opened Oct 30, 2024

base: refs/heads/main ← from: refs/pr/4

+410591 -47

Files changed (11)

.gitattributes +0 -1
README.md +20 -34
config.json +1 -2
generation_config.json +1 -1
model-00001-of-00007.safetensors +1 -1
model-00002-of-00007.safetensors +1 -1
model-00003-of-00007.safetensors +1 -1
model-00004-of-00007.safetensors +1 -1
model-00005-of-00007.safetensors +1 -1
model-00006-of-00007.safetensors +1 -1
tokenizer.json +0 -0

.gitattributes CHANGED

@@ -33,4 +33,3 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
-tokenizer.json filter=lfs diff=lfs merge=lfs -text

 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -29,7 +29,6 @@ tags:
 base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
 datasets:
 - sequelbox/Celestia
-- sequelbox/Spurline
 - sequelbox/Supernova
 model_type: llama
 model-index:
@@ -45,7 +44,7 @@ model-index:
         num_few_shot: 5
     metrics:
     - type: acc
-      value: 75.85
       name: acc
   - task:
       type: text-generation
@@ -57,7 +56,7 @@ model-index:
         num_few_shot: 5
     metrics:
     - type: acc
-      value: 68.75
       name: acc
   - task:
       type: text-generation
@@ -69,7 +68,7 @@ model-index:
         num_few_shot: 5
     metrics:
     - type: acc
-      value: 73.23
       name: acc
   - task:
       type: text-generation
@@ -81,7 +80,7 @@ model-index:
         num_few_shot: 5
     metrics:
     - type: acc
-      value: 46.00
       name: acc
   - task:
       type: text-generation
@@ -93,19 +92,7 @@ model-index:
         num_few_shot: 5
     metrics:
     - type: acc
-      value: 44.33
-      name: acc
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: MMLU Conceptual Physics (5-Shot)
-      type: MMLU
-      args:
-        num_few_shot: 5
-    metrics:
-    - type: acc
-      value: 53.19
       name: acc
   - task:
       type: text-generation
@@ -117,7 +104,7 @@ model-index:
         num_few_shot: 5
     metrics:
     - type: acc
-      value: 37.25
       name: acc
   - task:
       type: text-generation
@@ -141,7 +128,7 @@ model-index:
         num_few_shot: 5
     metrics:
     - type: acc
-      value: 56.00
       name: acc
   - task:
       type: text-generation
@@ -153,19 +140,19 @@ model-index:
         num_few_shot: 5
     metrics:
     - type: acc
-      value: 63.00
       name: acc
   - task:
       type: text-generation
       name: Text Generation
     dataset:
-      name: MMLU Astronomy (5-shot)
       type: MMLU
       args:
         num_few_shot: 5
     metrics:
     - type: acc
-      value: 63.16
       name: acc
   - task:
       type: text-generation
@@ -177,7 +164,7 @@ model-index:
         num_few_shot: 0
     metrics:
     - type: inst_level_strict_acc and prompt_level_strict_acc
-      value: 64.96
       name: strict accuracy
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.1-8B-ShiningValiant2
@@ -207,7 +194,7 @@ model-index:
         num_few_shot: 4
     metrics:
     - type: exact_match
-      value: 12.92
       name: exact match
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.1-8B-ShiningValiant2
@@ -222,7 +209,7 @@ model-index:
         num_few_shot: 0
     metrics:
     - type: acc_norm
-      value: 8.05
       name: acc_norm
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.1-8B-ShiningValiant2
@@ -237,7 +224,7 @@ model-index:
         num_few_shot: 0
     metrics:
     - type: acc_norm
-      value: 7.46
       name: acc_norm
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.1-8B-ShiningValiant2
@@ -254,7 +241,7 @@ model-index:
         num_few_shot: 5
     metrics:
     - type: acc
-      value: 26.46
       name: accuracy
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.1-8B-ShiningValiant2
@@ -268,15 +255,14 @@ license: llama3.1
 Shining Valiant 2 is a chat model built on Llama 3.1 8b, finetuned on our data for friendship, insight, knowledge and enthusiasm.
   - Finetuned on [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) for best available general performance
-  - Trained on a variety of our high quality open source data; focused on science, engineering, technical knowledge, and structured reasoning
-  - Also available for [Llama 3.1 70b](https://huggingface.co/ValiantLabs/Llama3.1-70B-ShiningValiant2) and [Llama 3.2 3b!](https://huggingface.co/ValiantLabs/Llama3.2-3B-ShiningValiant2)
 ## Version
-This is the **2024-11-04** release of Shining Valiant 2 for Llama 3.1 8b.
-This release uses our newest datasets, open-sourced for everyone's use, including our expanded [science-instruct dataset](https://huggingface.co/datasets/sequelbox/Celestia). This release features improvements in logical thinking and structured reasoning as well as physics, chemistry, biology, astronomy, Earth science, computer science, and information theory.
 Future upgrades will continue to expand Shining Valiant's technical knowledge base.
@@ -316,9 +302,9 @@ print(outputs[0]["generated_text"][-1])
 ## The Model
 Shining Valiant 2 is built on top of Llama 3.1 8b Instruct.
-The current version of Shining Valiant 2 is trained on technical knowledge using [sequelbox/Celestia](https://huggingface.co/datasets/sequelbox/Celestia), complex reasoning using [sequelbox/Spurline](https://huggingface.co/datasets/sequelbox/Spurline), and general chat capability using [sequelbox/Supernova.](https://huggingface.co/datasets/sequelbox/Supernova)
-We're super excited that Shining Valiant's dataset has been fully open-sourced! She's friendly, enthusiastic, insightful, knowledgeable, and loves to learn! Magical.
 ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/63444f2687964b331809eb55/VCJ8Fmefd8cdVhXSSxJiD.jpeg)

 base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
 datasets:
 - sequelbox/Celestia
 - sequelbox/Supernova
 model_type: llama
 model-index:
         num_few_shot: 5
     metrics:
     - type: acc
+      value: 77.35
       name: acc
   - task:
       type: text-generation
         num_few_shot: 5
     metrics:
     - type: acc
+      value: 76.39
       name: acc
   - task:
       type: text-generation
         num_few_shot: 5
     metrics:
     - type: acc
+      value: 79.03
       name: acc
   - task:
       type: text-generation
         num_few_shot: 5
     metrics:
     - type: acc
+      value: 50.0
       name: acc
   - task:
       type: text-generation
         num_few_shot: 5
     metrics:
     - type: acc
+      value: 53.2
       name: acc
   - task:
       type: text-generation
         num_few_shot: 5
     metrics:
     - type: acc
+      value: 43.14
       name: acc
   - task:
       type: text-generation
         num_few_shot: 5
     metrics:
     - type: acc
+      value: 55.0
       name: acc
   - task:
       type: text-generation
         num_few_shot: 5
     metrics:
     - type: acc
+      value: 66.0
       name: acc
   - task:
       type: text-generation
       name: Text Generation
     dataset:
+      name: MMLU STEM (5-Shot)
       type: MMLU
       args:
         num_few_shot: 5
     metrics:
     - type: acc
+      value: 55.57
       name: acc
   - task:
       type: text-generation
         num_few_shot: 0
     metrics:
     - type: inst_level_strict_acc and prompt_level_strict_acc
+      value: 65.24
       name: strict accuracy
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.1-8B-ShiningValiant2
         num_few_shot: 4
     metrics:
     - type: exact_match
+      value: 11.63
       name: exact match
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.1-8B-ShiningValiant2
         num_few_shot: 0
     metrics:
     - type: acc_norm
+      value: 8.95
       name: acc_norm
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.1-8B-ShiningValiant2
         num_few_shot: 0
     metrics:
     - type: acc_norm
+      value: 7.19
       name: acc_norm
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.1-8B-ShiningValiant2
         num_few_shot: 5
     metrics:
     - type: acc
+      value: 26.38
       name: accuracy
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ValiantLabs/Llama3.1-8B-ShiningValiant2
 Shining Valiant 2 is a chat model built on Llama 3.1 8b, finetuned on our data for friendship, insight, knowledge and enthusiasm.
   - Finetuned on [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) for best available general performance
+  - Trained on a variety of high quality data; focused on science, engineering, technical knowledge, and structured reasoning
 ## Version
+This is the **2024-09-16** release of Shining Valiant 2 for Llama 3.1 8b.
+We've improved and open-sourced our new baseline [science-instruct dataset](https://huggingface.co/datasets/sequelbox/Celestia). This release features improvements in physics, chemistry, biology, and computer science.
 Future upgrades will continue to expand Shining Valiant's technical knowledge base.
 ## The Model
 Shining Valiant 2 is built on top of Llama 3.1 8b Instruct.
+The current version of Shining Valiant 2 is trained on technical knowledge using [sequelbox/Celestia](https://huggingface.co/datasets/sequelbox/Celestia) and general chat capability using [sequelbox/Supernova.](https://huggingface.co/datasets/sequelbox/Supernova)
+Our private data adds specialist knowledge and Shining Valiant's personality: she's friendly, enthusiastic, insightful, knowledgeable, and loves to learn! Magical. (As a general note: we're hoping to replace and open-source this part of Shining Valiant's dataset with synthetic data soon!)
 ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/63444f2687964b331809eb55/VCJ8Fmefd8cdVhXSSxJiD.jpeg)

config.json CHANGED

@@ -11,7 +11,6 @@
     128008,
     128009
   ],
-  "head_dim": 128,
   "hidden_act": "silu",
   "hidden_size": 4096,
   "initializer_range": 0.02,
@@ -34,7 +33,7 @@
   "rope_theta": 500000.0,
   "tie_word_embeddings": false,
   "torch_dtype": "float32",
-  "transformers_version": "4.46.1",
   "use_cache": true,
   "vocab_size": 128256
 }

     128008,
     128009
   ],
   "hidden_act": "silu",
   "hidden_size": 4096,
   "initializer_range": 0.02,
   "rope_theta": 500000.0,
   "tie_word_embeddings": false,
   "torch_dtype": "float32",
+  "transformers_version": "4.44.2",
   "use_cache": true,
   "vocab_size": 128256
 }
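
The `head_dim` key that appears on only one side of this hunk carries the same value that can be derived from `hidden_size` and the attention head count. A minimal sketch of that fallback follows, assuming `num_attention_heads` is 32 (it is not visible in this hunk, so treat it as an assumption) and using a hypothetical helper name, `effective_head_dim`; it is an illustration, not the loader's actual code.

```python
def effective_head_dim(config: dict) -> int:
    """Return the per-head dimension for a config dict.

    Uses an explicit "head_dim" entry when present; otherwise derives it
    as hidden_size // num_attention_heads.
    """
    if "head_dim" in config:
        return config["head_dim"]
    return config["hidden_size"] // config["num_attention_heads"]


# Values from the hunk above; num_attention_heads = 32 is an assumption.
with_key = {"head_dim": 128, "hidden_size": 4096, "num_attention_heads": 32}
without_key = {"hidden_size": 4096, "num_attention_heads": 32}

# Both forms of the config describe the same attention geometry.
same_geometry = effective_head_dim(with_key) == effective_head_dim(without_key)
```

Under that assumption, both versions of `config.json` describe the same per-head dimension of 128.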

generation_config.json CHANGED

@@ -8,5 +8,5 @@
   ],
   "temperature": 0.6,
   "top_p": 0.9,
-  "transformers_version": "4.46.1"
 }

   ],
   "temperature": 0.6,
   "top_p": 0.9,
+  "transformers_version": "4.44.2"
 }

model-00001-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6efbffa72857ec90e0ea4310a6025190a4e75eef43e10ec9d46025412e1616a8
 size 4886466168

 version https://git-lfs.github.com/spec/v1
+oid sha256:dcebe7b4eacb57cbc4e03e60f0d4e1eec8a1471455a3fdbc953edfaca5c8763e
 size 4886466168

model-00002-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c569b9d9276836eb9f31fda31ea667ee3ad1c132b852ec94b4b9a7a2598db0ca
 size 4832007448

 version https://git-lfs.github.com/spec/v1
+oid sha256:756b38e9412a00dc12d14823d48c9a71732a1c0318fd9bb48661e9589ddb9ac1
 size 4832007448

model-00003-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:10413c97beeea538cb108448193c790d5224192982c2837b1dc3a54a1d5ff50b
 size 4999813112

 version https://git-lfs.github.com/spec/v1
+oid sha256:4d3ff8801d13032241f11b23af8bf458181a87b41b3e6497cf7cc503a0469ce6
 size 4999813112

model-00004-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6ef021115a20513e5b0db4178345a1f4959c59eb73fbb3679aca24055ead5d0e
 size 4999813128

 version https://git-lfs.github.com/spec/v1
+oid sha256:35ee4a044f0e1c92ba26c63b584ac344740d70fff1f3d86d073810bc8e610d66
 size 4999813128

model-00005-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:26822b4a9c2cc0f9d92e0c1522f517aac4a20a6b936c706e1ca68ed1beaf8b44
 size 4832007496

 version https://git-lfs.github.com/spec/v1
+oid sha256:7b6123fecf735935528930e989780254f5bd5eb78b872cda5677f04479d09c25
 size 4832007496

model-00006-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8f64f7cdbfd3903f7fea88117c49a8533a1ffa928d1ce4354d0d8431faddffe4
 size 4999813120

 version https://git-lfs.github.com/spec/v1
+oid sha256:895b3445cc9cb423b5c8b67c289eecd411f860ea3d7255857beb8fcb8e990621
 size 4999813120
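
Each safetensors entry above is a Git LFS pointer (a `version`, an `oid sha256:...` and a `size` line), not the weights themselves. In every shard the `size` is unchanged and only the digest differs. A minimal sketch of checking a downloaded shard against its pointer follows; the helper names `parse_lfs_pointer` and `stream_matches_pointer` are hypothetical, and the chunked read is a design choice for multi-gigabyte files rather than anything this PR prescribes.

```python
import hashlib
from typing import BinaryIO


def parse_lfs_pointer(text: str) -> dict:
    """Parse the three-line Git LFS pointer format into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def stream_matches_pointer(stream: BinaryIO, pointer: dict,
                           chunk_size: int = 1 << 20) -> bool:
    """Hash a binary stream in chunks and compare size and SHA-256 digest."""
    hasher = hashlib.sha256()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
        total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer text for model-00001-of-00007.safetensors, copied from the "+" side.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:dcebe7b4eacb57cbc4e03e60f0d4e1eec8a1471455a3fdbc953edfaca5c8763e
size 4886466168
"""
pointer = parse_lfs_pointer(POINTER_TEXT)
```

To use it, open the downloaded shard in binary mode (`open(path, "rb")`) and pass the file object together with `pointer` to `stream_matches_pointer`.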

tokenizer.json CHANGED

The diff for this file is too large to render. See raw diff