Any chance for a Sam 65b?

by disarmyouwitha

Samantha is my favorite model to talk to.
I have really enjoyed this model and I like how the 33b performs - I'll be keeping an eye out for the 65b!

I see you just posted the samantha-data, which is very interesting (thank you!)

I read your post about how Samantha was created (using FastChat, etc.) for the smaller models - will you be publishing a write-up on how you were able to do it with QLoRA for the 65b?

Cognitive Computations org

I haven't been planning to - I think first I'm gonna do Samantha-Falcon-40b, which they say performs better than Llama-65b.

@ehartford check out this new model: guanaco-65B-GPTQ. It's trained in less time with less memory. Hope it will save you time and resources in your next training run.

Cognitive Computations org
edited Jun 3, 2023

Oh yeah, I am considering QLoRA. I still need to find a solid solution for training that way, but FastChat doesn't yet support it, and I use FastChat for training these conversational datasets.
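For anyone curious what QLoRA training would look like outside FastChat, here's a minimal sketch using the Hugging Face PEFT + bitsandbytes stack. The base model name and the LoRA hyperparameters below are illustrative placeholders, not the actual Samantha training config:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Quantize the frozen base model to 4-bit NF4 so a 65b fits in far less VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# "huggyllama/llama-65b" is a placeholder base model for this sketch.
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-65b",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Train only small low-rank adapters on top of the quantized weights.
lora_config = LoraConfig(
    r=64,  # illustrative rank, tune per experiment
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total params

# From here the model can go into a standard transformers Trainer loop
# on the samantha-data conversations.
```

The point of the approach is that only the adapter weights get gradients, which is what lets a 65b train on a single node instead of a full multi-GPU fine-tune.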
