Part 6 of the model seems to be missing

#1
by dosb - opened

It seems like model-00006-of-00008.safetensors was not properly pushed to the repo.

If you try to download via browser, you get an AccessDenied error.

If you try to clone via git pull or git lfs pull, you get an "Object does not exist" error: [404] Object does not exist: failed to fetch some objects from 'https://huggingface.co/dreamgen/opus-v1.2-70b-awq.git/info/lfs'.

Can someone from the DreamGen team please run git lfs push --all? Or check if there are some weird permissions on part 6?

DreamGen org
edited Mar 8

Oh yeah, something is off (I am getting a 403 on some of the shards). The repo was created using HF's duplicate API. I will try to reupload and will notify you here once the upload is complete.

DreamGen org

@dosb I reuploaded the model shards, and now I am able to download it. Please give it a try!

Thank you, I was able to download it successfully.

Are you able to provide the context length of the models? Is it 4096?

DreamGen org

@dosb When I trained the model, I changed the rope_theta to 1000000 which helps it generalize to long sequences and I trained it on very long sequences as well (see the model card).
This means that you should not apply any rope scaling!

According to my evals (not just perplexity, but actual end-to-end side-by-side comparisons with ground truth), the performance at ~8K seems comparable to the performance on shorter sequences.

All in all, I hope it will go way beyond 4K, and would love to hear what you find in your testing!
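The "no rope scaling" point above can be sanity-checked against the model's config before loading. A minimal sketch, using an illustrative config fragment (the real values come from the config.json in the repo):

```python
import json

# Illustrative config fragment -- the actual values should be read from
# the model's config.json on the Hub, not hard-coded like this.
config_json = """
{
  "rope_theta": 1000000.0,
  "rope_scaling": null
}
"""

config = json.loads(config_json)

# The model was trained with rope_theta raised to 1e6, so no extra RoPE
# scaling (linear/NTK/YaRN) should be applied at load time.
assert config["rope_theta"] == 1_000_000.0
assert config.get("rope_scaling") is None
print("OK: rope_theta is 1e6 and no rope_scaling is set")
```

If an inference frontend exposes a "rope scaling" or "compress_pos_emb" knob, it should be left at its neutral value for this model.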

Thank you for the response!

I'm finding it much harder to use opus-v1.2 vs opus-v0.5. With v1.2, it seems like it's quite prone to repetition.

Do you have suggested presets? I'm using Oobabooga (text gen webUI) with presets like "Llama precise" (https://github.com/oobabooga/text-generation-webui/blob/main/presets/LLaMA-Precise.yaml) and "Shortwave" (https://github.com/oobabooga/text-generation-webui/blob/main/presets/Shortwave.yaml), but I'm still running into the same issues.

DreamGen org

@dosb The v1 models adhere to a very different prompting template; it's an extension of ChatML.

You can find a lot of details in the fp16 model card and on the Opus V1 documentation page. There is also Python code and a Colab notebook that showcase how to prompt, as well as several concrete example inputs and outputs, and a sandbox.

There is also a preset for SillyTavern, which is now also built into ST itself (you need to run the staging branch): https://docs.sillytavern.app/usage/api-connections/dreamgen/ (the prompting part of the guide also applies when running Opus V1 locally).
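To illustrate the general shape of a ChatML-style template, here is a minimal sketch. It assumes only the standard <|im_start|>/<|im_end|> delimiters; the extended role names that Opus V1 actually uses are specified in the model card and docs linked above, so the "text" role here is an illustrative placeholder, not the definitive format:

```python
# Minimal sketch of building a ChatML-style prompt string.
# The role names below ("system", "text") are placeholders; consult the
# Opus V1 docs for the exact extended roles the model was trained on.

def build_chatml_prompt(turns):
    """turns: list of (role, content) pairs -> a single prompt string."""
    parts = []
    for role, content in turns:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>")
    # Leave the final turn open so the model continues writing it.
    parts.append("<|im_start|>text\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    ("system", "You are a co-writer for an interactive story."),
    ("text", "The rain hammered the tin roof as Mara lit the lamp."),
])
print(prompt)
```

Getting the template exactly right matters: repetition and degraded output are a common symptom of prompting an instruction-tuned model with the wrong template.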
