rskuzma committed
Commit 3e954b4
Parent(s): e2b82a1

update blog link

Files changed (1)
  1. README.md +3 -4
README.md CHANGED
@@ -1,8 +1,7 @@
 ---
 language:
 - en
-inference: false
-thumbnail: https://www.cerebras.net/wp-content/uploads/2022/05/Cerebras-Logo-Black.png
+inference: true
 tags:
 - pytorch
 - causal-lm
@@ -16,7 +15,7 @@ pipeline_tag: text-generation
 
 # BTLM-3B-8k-base
 
-Bittensor Language Model (BTLM-3B-8k-base) is a 3 billion parameter language model with an 8k context length trained on 627B tokens of [SlimPajama](https://huggingface.co/datasets/cerebras/SlimPajama-627B). BTLM-3B-8k-base sets a new standard for 3B parameter models, outperforming models trained on hundreds of billions more tokens and achieving comparable performance to open 7B parameter models. BTLM-3B-8k-base can also be quantized to 4-bit to fit in devices with as little as 3GB of memory. The model is made available with an Apache 2.0 license for commercial use.
+[Bittensor Language Model (BTLM-3B-8k-base)](https://www.cerebras.net/blog/btlm-3b-8k-7b-performance-in-a-3-billion-parameter-model/) is a 3 billion parameter language model with an 8k context length trained on 627B tokens of [SlimPajama](https://huggingface.co/datasets/cerebras/SlimPajama-627B). BTLM-3B-8k-base sets a new standard for 3B parameter models, outperforming models trained on hundreds of billions more tokens and achieving comparable performance to open 7B parameter models. BTLM-3B-8k-base can also be quantized to 4-bit to fit in devices with as little as 3GB of memory. The model is made available with an Apache 2.0 license for commercial use.
 
 BTLM was trained by [Cerebras](https://www.cerebras.net/) in partnership with [Opentensor](https://opentensor.ai/) on the newly unveiled [Condor Galaxy 1 (CG-1) supercomputer](https://www.cerebras.net/blog/introducing-condor-galaxy-1-a-4-exaflop-supercomputer-for-generative-ai/), the first public deliverable of the G42-Cerebras strategic partnership.
 
@@ -128,7 +127,7 @@ Figure 4: Performance at 7B model size
 - Optimizer: AdamW
 - Positional Encoding: ALiBi
 - Language: English
-- Learn more: <TODO: link to blog>
+- Learn more: [BTLM-3B-8k blog post](https://www.cerebras.net/blog/btlm-3b-8k-7b-performance-in-a-3-billion-parameter-model/)
 - Paper: Coming soon
 
 ## To continue training with PyTorch and Maximal Update Parameterization
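
The model card text touched by this diff states that BTLM-3B-8k-base can be quantized to 4-bit to fit in roughly 3GB of memory. As a point of reference, here is a minimal sketch (not part of this commit or of the model card) of one way to load the model in 4-bit with `transformers` and `bitsandbytes`; the NF4/bfloat16 settings and `device_map="auto"` placement are illustrative assumptions, and `trust_remote_code=True` is assumed to be required because the architecture is not a stock `transformers` class.

```python
# Minimal sketch, not from the model card: loading BTLM-3B-8k-base in 4-bit.
# Assumes `transformers`, `accelerate`, and `bitsandbytes` are installed and a
# CUDA GPU is available; the NF4/bfloat16 choices below are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "cerebras/btlm-3b-8k-base"

# 4-bit NF4 weights with bfloat16 compute: roughly 4 bits per parameter is what
# lets a 3B-parameter model fit in a few GB of device memory.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
    trust_remote_code=True,  # assumed necessary: BTLM ships custom modeling code
)

prompt = "BTLM-3B-8k-base is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```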