TypeError: forward() got an unexpected keyword argument 'num_logits_to_keep'
#51 opened 3 months ago
by
shajiu
Adding Evaluation Results
#50 opened 3 months ago
by
leaderboard-pr-bot
AttributeError: 'HybridMambaAttentionDynamicCache' object has no attribute '_modules'
7
#48 opened 3 months ago
by
xxrjun
Adding Evaluation Results
#47 opened 4 months ago
by
leaderboard-pr-bot
ai21 instance not runnable with langchain
1
#45 opened 5 months ago
by
LordSahu
Is there any SFT or Chat model?
2
#41 opened 7 months ago
by
chuyi777
How to use accelerate to evaluate Jamba
#40 opened 7 months ago
by
Xidong
Jamba Evaluation Task on GSM8K
#39 opened 7 months ago
by
ssparks
Do you have plans to release papers on Jamba's architecture or miniature models?
#38 opened 7 months ago
by
badrabbitt
Are there any weight files for pre-trained models?
#37 opened 7 months ago
by
aidenxy
Memory usage on single A100*80GB in training
#36 opened 7 months ago
by
DavidWu1116
Fast Mamba
5
#34 opened 8 months ago
by
Praneethkeerthi
Why does throughput increase with longer context window?
3
#33 opened 8 months ago
by
jingyu-q
Request: DOI
#32 opened 8 months ago
by
kozolex
GGUF quants?
1
#31 opened 8 months ago
by
6346y9uey
Any release plans for the 7B Jamba model without MoE?
2
#30 opened 8 months ago
by
danielpark
Why is there an MLP in the Mamba Layer?
#28 opened 8 months ago
by
naston
Complex vs Real parametrization.
#27 opened 8 months ago
by
Yutida
How to Fine-tune Jamba on Google Colab?
7
#26 opened 8 months ago
by
Ateeqq
Layer-Selective Rank Reduction
#25 opened 8 months ago
by
mizinovmv
Update README.md
#23 opened 8 months ago
by
rombodawg
Is there a chance Jamba could be trained with 1.58-bit weights?
1
#22 opened 8 months ago
by
shing3232
Anyone else currently experimenting with fine-tuning Jamba?
3
#21 opened 8 months ago
by
Severian
IndentationError: unindent does not match any outer indentation level
#19 opened 8 months ago
by
thebeline
ModuleNotFoundError: No module named 'transformers_modules.ai21labs.Jamba-v0'
5
#17 opened 8 months ago
by
hjewr
Fast Mamba kernels are not available
10
#16 opened 8 months ago
by
MohamedRashad
Do all the safetensors files need to be downloaded to use this model on Colab?
2
#14 opened 8 months ago
by
Kv-boii
How many pretraining tokens?
#13 opened 8 months ago
by
CyberNative
Smaller version to ease implementation experiments?
7
#12 opened 8 months ago
by
compilade
Coding performance of base model?
4
#11 opened 8 months ago
by
rombodawg
Can you give a short explanation about the benefits and the architecture?
2
#7 opened 8 months ago
by
SicariusSicariiStuff
A Bang Up Job
2
#4 opened 8 months ago
by
nightvision04
Multiple GPUs?
3
#3 opened 8 months ago
by
bdambrosio
Just a solid congrats and thank you to your team
1
#1 opened 8 months ago
by
Severian