RonanMcGovern committed
Commit bd76192 (parent: 5d22335)
Update README.md
README.md
CHANGED
@@ -11,7 +11,7 @@ A fine-tune of the [Mamba SlimPajama model](state-spaces/mamba-2.8b-slimpj)
 - Some answers are given in a different language than the question. This is likely due to the mixed language nature of the OpenAssist dataset. However, this usually isn't a problem for stronger models.
 - After roughly 3500 tokens of input, the model fails.
 - The model is poor at coding tasks.
-- Passkey retrieval works at up to around 3500 tokens, however, the model struggles to respond to anything but short questions/queries.
+- Passkey retrieval works at up to around 3500 tokens, however, the model struggles to respond to anything but short questions/queries. Note that this is NOT an issue with the [openhermes fine-tune](https://huggingface.co/clibrain/mamba-2.8b-instruct-openhermes)
 
 ## Chat Fine-tuning Config:
 All modules were trained except the following were frozen:
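
For readers unfamiliar with the passkey retrieval check mentioned in the changed line, the following is a minimal sketch of how such a prompt is commonly built. It is not taken from this repository: the helper names, the filler sentences, and the passkey value are all hypothetical. Note too that `n_filler` counts filler lines, not tokens, so the point at which a prompt reaches roughly 3500 tokens depends on the tokenizer.

```python
import random


def build_passkey_prompt(passkey: str, n_filler: int, seed: int = 0) -> str:
    """Hide one passkey sentence at a random position in repeated filler text.

    Hypothetical helper for illustration; the filler wording is an assumption.
    """
    filler = "The grass is green. The sky is blue. The sun is yellow."
    needle = f"The pass key is {passkey}. Remember it."
    rng = random.Random(seed)  # seeded so the prompt is reproducible
    lines = [filler] * n_filler
    # randint is inclusive on both ends, so the needle may land at the very end
    lines.insert(rng.randint(0, n_filler), needle)
    question = "What is the pass key?"
    return "\n".join(lines + [question])


def passkey_recovered(response: str, passkey: str) -> bool:
    """Return True if the model's response contains the passkey verbatim."""
    return passkey in response
```

Increasing `n_filler` until `passkey_recovered` starts returning `False` on the model's output gives a rough estimate of the usable context length, under the assumptions above.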