What does each consolidated.0x.pt contain? How do I load the model from these files?

#54
by Keely0419 - opened

My guess is that each consolidated.0x.pt corresponds to one expert (is that correct?). But which part of the weights does each file contain?
Is it expert 0x's weights plus all of the shared weights? Or expert 0x's weights plus only a part of the shared weights?

Also, how do I load a model from the consolidated.xx.pt files? Is it possible to load only some of the consolidated.xx.pt files for inference?
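
To make the first question concrete, here is a minimal sketch of the comparison I have in mind: given the tensor names found in each file, report which names appear in every file and which appear in only one. The helper name `summarize_shards` and the tensor names in the demo are made up for illustration; they are not taken from the actual checkpoint.

```python
from collections import Counter


def summarize_shards(shard_keys):
    """Compare tensor names across checkpoint files.

    shard_keys: dict mapping a file name to an iterable of tensor names.
    Returns (shared, unique): `shared` is the set of names present in
    every file, `unique` maps each file to the names found only in it.
    """
    key_sets = {name: set(keys) for name, keys in shard_keys.items()}
    # Count in how many files each tensor name occurs.
    counts = Counter(k for keys in key_sets.values() for k in keys)
    shared = {k for k, c in counts.items() if c == len(key_sets)}
    unique = {
        name: {k for k in keys if counts[k] == 1}
        for name, keys in key_sets.items()
    }
    return shared, unique


# Toy input with hypothetical tensor names, only to show the output shape.
demo = {
    "consolidated.00.pt": ["tok_embeddings.weight", "layers.0.a"],
    "consolidated.01.pt": ["tok_embeddings.weight", "layers.0.b"],
}
shared, unique = summarize_shards(demo)
print("in every file:", shared)
print("only in one file:", unique)
```

With the real files, the names could presumably be collected with `torch.load(path, map_location="cpu").keys()`, assuming each file is a plain dictionary of tensors (I have not verified this). A name that appears in every file could still hold a different slice of the tensor in each one, so the tensor shapes would need to be compared as well.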

I'm quite confused about all of this... thanks in advance for any help!