EXTENDED LENGTH FRANKENSTEIN

#8 opened by ChuckMcSneed

Is it possible to merge this fun model with YaRN 70B for extended context? Or maybe create a new frankenmodel with YaRN?

You can try and find out. You only need enough RAM to load two 70B models at a time; you can even swap to disk (it would just be slower).
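For reference, here's a minimal sketch of what a mergekit passthrough ("frankenmerge") run looks like. The model names and layer ranges below are illustrative placeholders, not a tested recipe:

```python
# A minimal sketch, assuming mergekit is installed (pip install mergekit).
# Model names and layer ranges are placeholders, not a tested recipe.
import subprocess
import textwrap

config = textwrap.dedent("""\
    slices:
      - sources:
          - model: NousResearch/Llama-2-70b-hf          # placeholder short-context model
            layer_range: [0, 40]
      - sources:
          - model: NousResearch/Yarn-Llama-2-70b-32k    # placeholder YaRN long-context model
            layer_range: [20, 80]
    merge_method: passthrough
    dtype: float16
""")

with open("frankenmerge.yml", "w") as f:
    f.write(config)

# mergekit runs on CPU by default; with enough RAM (or swap) no GPU is needed.
subprocess.run(["mergekit-yaml", "frankenmerge.yml", "./merged-model"], check=True)
```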

Do you have any suggestions on which layers and models to use?

UPDATE: I don't think that mergekit works with YaRN:
`ValueError: rope_scaling must be a dictionary with with two fields, type and factor, got {'factor': 8.0, 'finetuned': True, 'original_max_position_embeddings': 4096, 'type': 'yarn'}`
And after modifying the YaRN model's config:
`ValueError: rope_scaling's type field must be one of ['linear', 'dynamic'], got yarn`
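The validation in the transformers version mergekit pulls in only accepts the "linear" and "dynamic" rope_scaling types. One possible workaround (a sketch, not a tested fix; all paths are placeholders) is to stub rope_scaling out to "linear" so validation passes, run the merge, then write the real YaRN parameters back into the merged model's config.json:

```python
# Workaround sketch: mergekit trips over rope_scaling validation that only
# accepts "linear" and "dynamic". Stub the config before merging, then
# restore the full YaRN dict on the output. All paths are placeholders.
import json

YARN_ROPE = {
    "type": "yarn",
    "factor": 8.0,
    "original_max_position_embeddings": 4096,
    "finetuned": True,
}

def set_rope_scaling(config_path, rope):
    """Overwrite the rope_scaling entry of a model's config.json."""
    with open(config_path) as f:
        cfg = json.load(f)
    cfg["rope_scaling"] = rope
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)

# Before the merge: make the YaRN model's config pass validation.
set_rope_scaling("Yarn-Llama-2-70b-32k/config.json", {"type": "linear", "factor": 8.0})
# ... run mergekit here ...
# After the merge: restore the full YaRN parameters on the output model.
set_rope_scaling("merged-model/config.json", YARN_ROPE)
```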

Is it maybe possible to somehow merge Llama and Yi-based models, or are they too different?

I finally made it. I merged the 70B 32K model with itself. It actually works!
https://huggingface.co/ChuckMcSneed/DoubleGold-v0.1-123b-32k
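For anyone curious what a self-merge like that looks like, here's an illustrative passthrough sketch. These layer ranges are made up to show the shape of the idea, not necessarily the actual DoubleGold recipe:

```python
# Illustrative only: interleaving seven overlapping 20-layer windows of one
# 80-layer 70B model gives 140 layers, roughly a 123B-class stack.
import subprocess

MODEL = "NousResearch/Yarn-Llama-2-70b-32k"  # placeholder 32k model

# Windows [0, 20], [10, 30], ..., [60, 80] step through the whole model.
slices = "\n".join(
    f"  - sources:\n"
    f"      - model: {MODEL}\n"
    f"        layer_range: [{start}, {start + 20}]"
    for start in range(0, 70, 10)
)
config = f"slices:\n{slices}\nmerge_method: passthrough\ndtype: float16\n"

with open("self-merge.yml", "w") as f:
    f.write(config)
subprocess.run(["mergekit-yaml", "self-merge.yml", "./self-merged-123b"], check=True)
```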
