
Context length?

#8 by AIGUYCONTENT

I downloaded this quant last night: https://huggingface.co/BigHuggyD/anthracite-org_magnum-v2-123b_exl2_8.0bpw_h8

I would like to know the suggested context length. I currently have it set to 55,000 (a random number).

Also, this model does not work with cfg-cache and guidance_scale turned on in Oobabooga. Oobabooga himself points to a paper claiming that turning on cfg-cache can make the model smarter: https://www.reddit.com/r/Oobabooga/comments/1cf9bso/what_does_guidance_scale_parameter_do/
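For context, here is roughly what guidance_scale does when it does work: a minimal sketch of classifier-free guidance through the plain transformers generate() API (not the exl2/cfg-cache path in ooba). The model id, prompts, and the 1.5 scale below are placeholders.

```python
# Minimal CFG sketch with transformers' generate(): guidance_scale > 1 enables
# classifier-free guidance, and negative_prompt_ids supplies the "anti-prompt".
# Placeholder model id; a 123B model won't actually load like this without
# quantization/offloading.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "anthracite-org/magnum-v2-123b"  # placeholder
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = tok("Write a short scene set on a night train.", return_tensors="pt").to(model.device)
negative = tok("Write a dry technical manual.", return_tensors="pt").to(model.device)

out = model.generate(
    **prompt,
    negative_prompt_ids=negative.input_ids,
    guidance_scale=1.5,        # 1.0 disables CFG
    max_new_tokens=200,
    do_sample=True,
    temperature=0.8,
)
print(tok.decode(out[0], skip_special_tokens=True))
```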

Considering how quants essentially perform a lobotomy on models, I am hoping to get cfg-cache working with this one.

Anthracite org

We train on 8192 ctx, but you can try more and see if it becomes incoherent; it varies by sampler and use case.
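If you want to probe it, a quick way is to load the quant at a few different max_seq_len values and eyeball a long-prompt completion. Rough sketch with the exllamav2 Python API (the path, sampler values, and the 16384 figure are placeholders; the same idea applies through ooba's loader settings):

```python
# Rough sketch: load the exl2 quant at a chosen max_seq_len and check a
# long-prompt completion for coherence. Paths and settings are placeholders.
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/anthracite-org_magnum-v2-123b_exl2_8.0bpw_h8"  # placeholder
config.prepare()
config.max_seq_len = 16384              # the knob to experiment with (trained at 8192)

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)             # split weights across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.min_p = 0.1

long_prompt = "..."                     # paste a chat log near the length you care about
print(generator.generate_simple(long_prompt, settings, 200))
```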

"I am hoping to get cfg-cache working with this model."
hope you get it working! report back if it works.

By the looks of it, the model loses its mind after 15k context. I've tried BigHuggyD/anthracite-org_magnum-v2-123b_exl2_8.0bpw_h8 and schnapper79/lumikabra-195B_v0.3-exl2-4.0bpw, and with both, after 15k it starts to ramble about the most random things. This is with just the normal min_p preset in ooba and the DRY sampler set to 0.8 (tested with it off as well). I really hope you train on higher context in the future, because I love your models, but 8k ctx is way too low when the original model supports far more.
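A crude way to pin down where it falls apart is a needle-style recall check: bury a known fact early in filler text and see at what total length the model can no longer quote it back. Sketch only; `generate` and `count_tokens` stand in for whatever backend you use (e.g. a thin wrapper around an exllamav2 generator or an ooba API call), and the needle/filler strings are made up.

```python
# Crude long-context recall probe. `generate` is any prompt-in/text-out
# function and `count_tokens` any tokenizer-based length function; both are
# placeholders for your own backend.
from typing import Callable, Iterable

def recall_probe(generate: Callable[[str], str],
                 count_tokens: Callable[[str], int],
                 lengths: Iterable[int] = (8_000, 12_000, 16_000, 20_000)) -> None:
    needle = "The courier's password is 'saffron-42'."
    filler_unit = "The meeting notes were filed and nothing of interest happened. "
    for target in lengths:
        filler = filler_unit
        while count_tokens(needle + filler) < target:
            filler += filler_unit * 50
        prompt = (needle + "\n\n" + filler +
                  "\n\nQuestion: What is the courier's password?\nAnswer:")
        answer = generate(prompt)
        print(f"{target:>6} tokens -> recalled: {'saffron-42' in answer}")
```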

Lastly, if funds are the issue, how much are we talking to train on full context instead of 8k, if you make a v3 of this or a future Mistral model?

Anthracite org
edited Oct 11

In my honest opinion, that seems to be the case with most Mistral models as a whole. Nemo, for instance, claims 128k but only really holds up to about 16k, and the same goes for the 22B in my experience; anything past that and it's very "eh" in terms of recall.
