Update README.md
<img src="https://cdn-uploads.huggingface.co/production/uploads/64740cf7485a7c8e1bd51ac9/Ph6ZvxwF7a0m_B5Su_EK7.webp" width="500" height="500">
# This is highly experimental and should be viewed as purely a test run right now. Jamba has been very hard to train, but I wanted to see how it did on one of the best datasets we have access to. I believe in transparent development, so all *best* working iterations, even if they are a bit wonky, will be pushed here.
# I've unfortunately gone way over budget and spent a significant amount of money over the past few days trying to figure out the best way to fine-tune Jamba. New iterations may be sparse until Jamba is converted to MLX or I find buried treasure somewhere. If you've downloaded it, feel free to provide any feedback so I can improve on the next training cycle! Thanks for checking it out.
*There's been limited testing so no example outputs yet*
---
## Training