yury-zyphra committed
Commit 521a777
1 Parent(s): 621d8c2

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -13,6 +13,6 @@ license: apache-2.0
 <img src="https://cdn-uploads.huggingface.co/production/uploads/65bc13717c6ad1994b6619e9/aHpEc5tnCJShO2Kn0f637.png" width="900" height="900" />
 
 ## About
-We provide inference and generation code for our BlackMamba model in our github repository: https://github.com/Zyphra/BlackMamba
+We provide inference code for our BlackMamba model in our github repository: https://github.com/Zyphra/BlackMamba
 
 BlackMamba is a novel architecture that combines state-space models (SSMs) with mixture of experts (MoE). It uses [Mamba](https://arxiv.org/abs/2312.00752) as its SSM block and the [switch transformer](https://arxiv.org/abs/2101.03961) as the basis of its MoE block. BlackMamba has very low generation and inference latency, providing significant speedups over classical transformers, MoEs, and Mamba SSM models. Additionally, due to its SSM sequence mixer, BlackMamba retains linear computational complexity in the sequence length.
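The repository linked above contains the actual model and inference code. As a rough, hypothetical illustration of the layer layout described in the paragraph above (a linear-time sequence mixer followed by a switch-style top-1 MoE MLP, each with a residual connection), here is a minimal PyTorch sketch. The `SequenceMixer` is only a causal depthwise convolution standing in for the real Mamba SSM block (both are linear in sequence length), and all names, sizes, and routing details are assumptions for illustration, not the repository's implementation.

```python
# Hypothetical sketch (not the BlackMamba repository's code): alternating
# sequence-mixing and switch-style MoE blocks, as described in the README.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SequenceMixer(nn.Module):
    """Stand-in for the Mamba SSM block: a causal depthwise convolution,
    chosen only because it is linear in sequence length like an SSM scan."""

    def __init__(self, d_model: int, kernel_size: int = 4):
        super().__init__()
        self.kernel_size = kernel_size
        self.conv = nn.Conv1d(d_model, d_model, kernel_size, groups=d_model)

    def forward(self, x):  # x: (batch, seq_len, d_model)
        x = x.transpose(1, 2)                    # (batch, d_model, seq_len)
        x = F.pad(x, (self.kernel_size - 1, 0))  # left-pad so the conv is causal
        return self.conv(x).transpose(1, 2)


class SwitchMoE(nn.Module):
    """Switch-transformer-style MLP: a router sends each token to one expert."""

    def __init__(self, d_model: int, n_experts: int = 4, d_ff: int = 256):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (batch, seq_len, d_model)
        flat = x.reshape(-1, x.shape[-1])
        probs = F.softmax(self.router(flat), dim=-1)
        top_p, top_idx = probs.max(dim=-1)       # top-1 (switch) routing
        out = torch.zeros_like(flat)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i
            if mask.any():
                # scale each token's expert output by its router probability
                out[mask] = expert(flat[mask]) * top_p[mask].unsqueeze(-1)
        return out.reshape_as(x)


class BlackMambaStyleBlock(nn.Module):
    """One layer: sequence mixer + MoE MLP, each with a residual connection."""

    def __init__(self, d_model: int):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        self.mixer = SequenceMixer(d_model)
        self.moe = SwitchMoE(d_model)

    def forward(self, x):
        x = x + self.mixer(self.norm1(x))
        return x + self.moe(self.norm2(x))


if __name__ == "__main__":
    layer = BlackMambaStyleBlock(d_model=64)
    tokens = torch.randn(2, 16, 64)              # (batch, seq_len, d_model)
    print(layer(tokens).shape)                   # torch.Size([2, 16, 64])
```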