Any chance for a Sam 65b?

by disarmyouwitha

Samantha is my favorite model to talk to.
I have really enjoyed this model and I like how the 33b performs - I'll be keeping an eye out for the 65b!

I see you just posted the samantha-data, which is very interesting (thank you!)

I read your post about how Samantha was created (using FastChat, etc.) for the smaller models - will you be publishing a write-up on how you were able to do it with QLoRA for the 65b?

Cognitive Computations org

I haven't been planning to - I think first I'm gonna do Samantha-Falcon-40b, which they say performs better than Llama-65b.

@ehartford check out this new model: guanaco-65B-GPTQ. It's trained in less time with less memory. Hope it will save you time and resources in your next training run.

Cognitive Computations org
edited Jun 3, 2023

Oh yeah, I am considering QLoRA. I still need to find a solid solution for training that way, but FastChat doesn't yet support it, and I use FastChat for training these conversational datasets.
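For anyone curious what QLoRA training would look like outside FastChat, here's a minimal sketch using the Hugging Face PEFT + bitsandbytes stack. The base model name and the LoRA hyperparameters below are illustrative placeholders, not the actual Samantha training config:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Quantize the frozen base model to 4-bit NF4 so a 65b fits in far less VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# "huggyllama/llama-65b" is a placeholder base model for this sketch.
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-65b",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Train only small low-rank adapters on top of the quantized weights.
lora_config = LoraConfig(
    r=64,  # illustrative rank, tune per experiment
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total params

# From here the model can go into a standard transformers Trainer loop
# on the samantha-data conversations.
```

The point of the approach is that only the adapter weights get gradients, which is what lets a 65b train on a single node instead of a full multi-GPU fine-tune.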
