
Yi-34b-200k v2, in the cards?

#2
by SabinStargem - opened

A day or two ago, a new version of Yi-34b-200k was released. Apparently, it has improved long-context abilities. Here is what the HF page says:

"In the "Needle-in-a-Haystack" test, the Yi-34B-200K's performance is improved by 10.5%, rising from 89.3% to an impressive 99.8%. We continue to pre-train the model on 5B tokens long-context data mixture and demonstrate a near-all-green performance."

Indeed. I'm a bit bummed that I didn't wait the 8 days or so for this base model update, but I'll try to re-tune it once I have some extra funds and the time to do so.

That is more than fair. It would be nice if foundation model developers released roadmaps, so that this sort of thing doesn't happen.
