Awesome work!

Opened by Severian

This is really great work, thanks for taking it on! I haven't fully tested it yet but it seems like a super promising avenue for fine-tuning and experimenting.

I was wondering if you'd be willing to share the modeling architecture script you created for this smaller Jamba. I am incredibly fascinated by the new SSM-Transformer but don't yet have the deep knowledge (still learning) to make one myself. It'd be awesome to see how you figured it out. No worries if you want to keep it private though, figured it was worth an ask : )

Hey @Severian, thank you, I do appreciate it!

I can definitely share the script I used to prune the model! Not a problem at all. I don't have cluster access at the moment, so give me a bit and I'll let you know once it's been uploaded to this repo.

Do be aware that the Jamba-v0.1 model, when loaded in full precision, requires a significant amount of memory to naively load. I personally load the model to CPU on a system with 512GB+ of RAM. If you don't have access to a system with these specifications, you'll likely need to load a quantized version of Jamba to replicate results.
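
For reference, here's a rough sketch of what an 8-bit quantized load could look like with bitsandbytes. The model id is the public ai21labs/Jamba-v0.1 checkpoint; the skip-modules list and dtype are just assumptions to start from, so adjust for your hardware:

```python
# Sketch of a lower-memory, 8-bit load of Jamba-v0.1 (assumptions noted in comments).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "ai21labs/Jamba-v0.1"  # the full-size base model being pruned

# Keeping the Mamba mixer modules out of quantization is a reasonable starting
# point for a hybrid SSM-Transformer, but treat this list as an assumption.
quant_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_skip_modules=["mamba"],
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,     # dtype for the non-quantized modules
    quantization_config=quant_config,
    device_map="auto",              # spread across available GPUs / offload to CPU
)
```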

@Severian I've added the pruning script to this repo; you can view it here: https://huggingface.co/OxxoCodes/jamba-small-v1/blob/main/prune.py
I've also created a v2 using a different layer mapping; feel free to check that out here: https://huggingface.co/OxxoCodes/jamba-small-v2
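
If it helps, the core idea in plain terms is: load the full model, keep only a subset of the original decoder layers according to a layer mapping, then save the smaller model. Below is a very rough sketch, not the actual prune.py; the indices in `keep` and the `model.model.layers` path are placeholders, the real mapping is in the script linked above:

```python
# Rough illustration of layer-mapping pruning -- NOT the actual prune.py linked above.
# The indices in `keep` and the `model.model.layers` attribute path are placeholders.
import torch
from torch import nn
from transformers import AutoModelForCausalLM

# Full-precision/bf16 load needs a machine with a lot of CPU RAM (see note above).
model = AutoModelForCausalLM.from_pretrained(
    "ai21labs/Jamba-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="cpu",
)

# The "layer mapping": which original decoder layers survive into the small model.
# Changing this list is the kind of thing that differs between v1 and v2
# (these indices are made up for illustration).
keep = [0, 1, 2, 3, 12, 13, 14, 15, 28, 29, 30, 31]

model.model.layers = nn.ModuleList(model.model.layers[i] for i in keep)
model.config.num_hidden_layers = len(keep)

# Careful: Jamba interleaves attention, Mamba, and MoE layers on a fixed schedule,
# so the kept indices should preserve that pattern or the saved config will
# disagree with the weights when the pruned model is reloaded.
model.save_pretrained("jamba-small-pruned")
```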

Let me know if you'd like to know anything else about the model. Cheers! ☕

Thanks for sharing. This is really clever and such a great way to get more models from the core architecture. You are a mad genius!
I'm going to try training the V2 you dropped; I'll let you know how it goes!

Thank you! Definitely let me know how the training run goes. I'm currently planning to get one going myself, but university work comes first (for now 😉)
