Weight updates?

#13
by brucethemoose - opened

I noticed the weights of this model got updated!

Rope Theta is different too.

What changed? Is the long context performance stronger now?
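One way to see what changed between two uploads is to diff the two `config.json` files. The following is a minimal sketch, not an official tool: `diff_config` and `load_config` are hypothetical helper names, and the sample values at the bottom are placeholders rather than the real Yi-34B-200K settings. `load_config` assumes the `huggingface_hub` package is installed and uses its `hf_hub_download` function with a `revision` argument.

```python
import json
from typing import Any


def diff_config(old: dict[str, Any], new: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {key: (old_value, new_value)} for every key whose value differs.

    A key present in only one config shows up with None on the other side.
    """
    changed = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before != after:
            changed[key] = (before, after)
    return changed


def load_config(repo_id: str, revision: str) -> dict[str, Any]:
    """Fetch config.json for a given revision (branch, tag, or commit hash).

    Hypothetical helper; needs network access and the huggingface_hub package.
    """
    from huggingface_hub import hf_hub_download  # imported lazily on purpose

    path = hf_hub_download(repo_id=repo_id, filename="config.json", revision=revision)
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    # Placeholder values for illustration only, NOT the real model settings.
    old_cfg = {"rope_theta": 1.0, "hidden_size": 8}
    new_cfg = {"rope_theta": 2.0, "hidden_size": 8}
    print(diff_config(old_cfg, new_cfg))
```

With a real commit hash taken from the repository history, the same comparison would be `diff_config(load_config("01-ai/Yi-34B-200K", old_commit), load_config("01-ai/Yi-34B-200K", "main"))`, where `old_commit` is whatever earlier revision you want to compare against.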

@brucethemoose We have indeed enhanced the capabilities of this model. In the "Needle-in-a-Haystack" test, Yi-34B-200K's performance improved by 10.5 percentage points, rising from 89.3% to 99.8%. You are always welcome to refer to the news section on our model card for the most up-to-date and detailed information.

Pardon the pings to you guys, and forgive me if this is bad etiquette, but I thought this update might be worth the heads up just in case any of you have plans for future models down the pipeline so you can update beforehand (love your works btw!)

@jondurbin

@migtissera

@Sao10K

@teknium

I shall return to the background in peace and leave you guys be, thank you.

It would be good to see 01-ai do fine-tuning/DPO on 200K context like Nous Capybara -- quite a good model.

https://www.reddit.com/r/LocalLLaMA/comments/1b8dptk/new_rag_benchmark_with_claude_3_gemini_pro/?utm_source=share&utm_medium=web2x&context=3

Please consider giving the newly updated weights a new version number. There needs to be something that differentiates these weights from the originals.

Thanks for the heads up @ParasiticRogue

https://huggingface.co/01-ai/Yi-34B-200K/discussions/13#65e961b7d0cc7c76b925e133

Thank you for the ping! Appreciate it! I wouldn't have known otherwise. @ParasiticRogue

@MeisterDeLaV -- In the future, please release a new version when you update (e.g., Yi-34B-200K-v0.2); otherwise we don't know that there's been an update.

@migtissera @MarsupialAI Thank you for the suggestions, we will be working on it.
