Trelis/mamba-2.8b-slimpj-chat-4k

RonanMcGovern committed on Feb 1

Commit bd76192 (1 parent: 5d22335)

Update README.md

Files changed (1)

README.md +1 -1

@@ -11,7 +11,7 @@ A fine-tune of the [Mamba SlimPajama model](state-spaces/mamba-2.8b-slimpj)
 - Some answers are given in a different language than the question. This is likely due to the mixed language nature of the OpenAssist dataset. However, this usually isn't a problem for stronger models.
 - After roughly 3500 tokens of input, the model fails.
 - The model is poor at coding tasks.
-- Passkey retrieval works at up to around 3500 tokens, however, the model struggles to respond to anything but short questions/queries.
+- Passkey retrieval works at up to around 3500 tokens, however, the model struggles to respond to anything but short questions/queries. Note that this is NOT an issue with the [openhermes fine-tune](https://huggingface.co/clibrain/mamba-2.8b-instruct-openhermes)
 ## Chat Fine-tuning Config:
 All modules were trained except the following were frozen: