Brainstorm 40x method developed by David_AU
What is this method? Where can I find more info about it?
Information on Brainstorm 40x is on this page:
https://huggingface.co/DavidAU/L3-DARKEST-PLANET-16.5B-GGUF
(same methods used for "Rogue Creative")
And a REDDIT POST I made:
Roughly, it is a process where the conclusion layer of a model is duplicated and calibrated - in the case of this model, 40 times. This is a delicate process with, umm... a lot of rules. I think I might hold the world record for the highest PPL ever recorded while "getting my feet wet" developing this process: 1 billion+ perplexity when you, ahh... do it wrong - and it only takes one "wrong move" for this to happen.
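For illustration only - the actual Brainstorm recipe is unpublished, and the calibration is the hard part - here is a minimal sketch of the naive "duplicate the final decoder layer N times" step, assuming a Llama-style model loaded with Hugging Face transformers. The model ID, copy count, and output path are placeholders:

```python
# Illustration only: the naive "duplicate the final decoder layer N times"
# step on a Llama-style model. This is NOT the calibrated Brainstorm
# recipe - just the basic duplication idea it builds on.
import copy

import torch
from transformers import AutoModelForCausalLM

MODEL_ID = "meta-llama/Llama-3.2-1B"   # placeholder; any Llama-style model
N_COPIES = 40                          # the "40x" in the method's name

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16)

# Llama-style models keep their decoder blocks in model.model.layers.
last_layer = model.model.layers[-1]
for _ in range(N_COPIES):
    new_layer = copy.deepcopy(last_layer)  # each copy gets its own weights
    # Newer transformers versions index layers for KV caching; keep that valid.
    new_layer.self_attn.layer_idx = len(model.model.layers)
    model.model.layers.append(new_layer)

# Keep the config consistent with the new depth, then save.
model.config.num_hidden_layers = len(model.model.layers)
model.save_pretrained("llama-brainstormed-naive")
```

Run as-is, blind duplication like this is exactly the kind of "wrong move" that produces the absurd perplexity numbers mentioned above; the calibration applied on top is the part that is not public.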
For this model in particular, Brainstorm is mapped as blocks, with "intended disruption" to alter and extend the power of the root model (in this case, Dark Planet 8B). Each layer/block interacts with every other block.
(There is more going on here too; this is a rough summary.)
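Given the 1-billion+ PPL failure mode described above, a quick perplexity check is the obvious first test after any layer surgery. A minimal sketch, assuming the modified model saved by the previous snippet; the text, path, and model ID are placeholders:

```python
# Quick perplexity sanity check on a modified model. Sketch only;
# the path and model ID come from the snippet above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# The tokenizer is untouched by layer surgery, so load it from the base model.
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
model = AutoModelForCausalLM.from_pretrained("llama-brainstormed-naive").eval()

text = "The quick brown fox jumps over the lazy dog."
ids = tok(text, return_tensors="pt").input_ids

with torch.no_grad():
    # Passing labels=ids makes the model return mean cross-entropy loss.
    loss = model(ids, labels=ids).loss

# torch.exp returns inf rather than raising if the loss is astronomical.
print(f"perplexity: {torch.exp(loss).item():.2f}")
```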
The goal here is creative: prose uniqueness, first and foremost.
Other Brainstorm methods address logic/problem-solving augmentation.
E.g.:
https://huggingface.co/DavidAU/Meta-Llama-3.1-Instruct-12.2B-BRAINSTORM-20x-FORM-8-GGUF
For this model (Darkest), only the conclusion layer of the root model (Dark Planet 8B) is used.
For the next two releases (also 40x) this changes drastically - both the donor model and the Brainstorm process, including the implementation of multiple conclusion layers from multiple models.
The important part: Brainstorm is an adapter. It can be applied to almost any model.
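To make the "almost any model" point concrete: the only architecture-specific part of the duplication step is finding where a model keeps its decoder blocks. A hypothetical helper, sketched for a few common transformers layouts (the attribute paths are the real ones for these families, but the helper itself is an illustration, not part of Brainstorm):

```python
# Hypothetical helper: locate the decoder-block ModuleList for a few
# common architectures, so the same duplication logic can be reused.
from transformers import AutoModelForCausalLM

def find_decoder_layers(model):
    """Return the ModuleList of decoder blocks for a few common layouts."""
    for path in ("model.layers",        # Llama / Mistral / Qwen2
                 "transformer.h",       # GPT-2 / GPT-J / Falcon
                 "gpt_neox.layers"):    # GPT-NeoX / Pythia
        obj = model
        try:
            for attr in path.split("."):
                obj = getattr(obj, attr)
            return obj
        except AttributeError:
            continue
    raise ValueError("don't know where this architecture keeps its layers")

model = AutoModelForCausalLM.from_pretrained("gpt2")
layers = find_decoder_layers(model)
print(type(layers).__name__, len(layers))  # ModuleList, 12 for gpt2
```

Everything downstream of that lookup (copying, inserting, re-indexing) works the same way regardless of model family, which is what makes an adapter-style application plausible.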
Yes, I read it... but it is not clear how. Can you post the step-by-step workflow? I wish to replicate it on very small models to see how it goes.
I don't have a written workflow for this; most of it is in my head at the moment.
I literally learnt it by trial and error - a lot of errors - and then observation of the results (and notes on these).
The doc for this would be 15-20 pages, never mind "the rules", which seem to be somewhat fluid.
There is a time factor here.
Add to this:
The methods I use do not cover all the possible ways to push this process to its maximum potential. I am still learning the ropes here.
RE: Small models.
This method - Brainstorm, as it is as of this writing - only magnifies what is already in the model.
(There are exceptions: when conclusion layers from other models are input as part of this process, and Braincluster - an unreleased, but tested, method at this time.)
So far, on models of 3B+ parameters this works great - but not so much on "older 3B" models (there is just not the same level of engineering in these relative to today's models).
On testing Llama 3.2 1B, its effect was somewhat limited and requires more research. This could be down to the fact that L3.2 1B only has 16 layers, and/or how it is compressed to this size.
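If depth is the limiting factor, it is cheap to check before committing: AutoConfig fetches only the config.json, not the weights. The model IDs below are examples:

```python
# Compare model depth before applying the method; AutoConfig downloads
# only the config, not the weights. Model IDs are examples.
from transformers import AutoConfig

for model_id in ("meta-llama/Llama-3.2-1B", "meta-llama/Meta-Llama-3-8B"):
    cfg = AutoConfig.from_pretrained(model_id)
    print(model_id, "->", cfg.num_hidden_layers, "layers")
# Llama 3.2 1B reports 16 hidden layers; Llama 3 8B reports 32.
```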
OK, but how do you "duplicate and calibrate the conclusion layer of a model"?