DavidAU committed
Commit 1a2c385 · verified · 1 Parent(s): 26968b9

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -36,7 +36,7 @@ pipeline_tag: text-generation
 
 (quants uploading...)
 
-<h2>DeepSeek-R1-Distill-Qwen-25.5B with Brainstorm 40x, (88 layers, 1047 tensors) </h2>
+<h2>DeepSeek-R1-Distill-Qwen-25.5B with Brainstorm 40x, (88 layers, 1043 tensors) </h2>
 
 Context : 128k.
 
@@ -52,7 +52,7 @@ The "thinking/reasoning" tech (for the model at this repo) is from the original
 
 [ https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B ]
 
-In this case, Brainstorm 40x module was grafted directly onto "DeepSeek-R1-Distill-Llama-8B" bringing it up to 72 layers, 16.5B parameters.
+In this case, Brainstorm 40x module was grafted directly onto "DeepSeek-R1-Distill-Llama-8B" bringing it up to 88 layers, 25.5B parameters.
 
 ---
 