ordagan committed
Commit 8ee14c3
Parent: 86c5df0

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -15,7 +15,7 @@ Jamba is the first production-scale Mamba implementation, which opens up interes
 
 This model card is for the base version of Jamba. It’s a pretrained, mixture-of-experts (MoE) generative text model, with 12B active parameters and a total of 52B parameters across all experts. It supports a 256K context length, and can fit up to 140K tokens on a single 80GB GPU.
 
-For full details of this model please read the [release blog post](https://www.ai21.com/blog/announcing-jamba).
+For full details of this model please read the [white paper](https://arxiv.org/abs/2403.19887) and the [release blog post](https://www.ai21.com/blog/announcing-jamba).
 
 ## Model Details
 