Converting To Flax

#64
by erfanzar - opened

Hello, I re-implemented the model in Flax/JAX and converted the weights. Everything works well with MPT-1B, but the same approach doesn't work well with MPT-7B.
Is there any method or trick that differs between MPT-1B and MPT-7B? I have read the code many times and there are some small changes in the implementations, but those are not what is causing my problem.
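For reference, here is a minimal sketch of the kind of check that can narrow down where a converted model diverges. It is not the actual conversion code: the helper names `torch_linear_to_flax_kernel` and `first_mismatch` are hypothetical, and it uses plain NumPy arrays in place of real checkpoints. It relies only on the layout difference between PyTorch `nn.Linear` weights, stored as `(out_features, in_features)`, and Flax `nn.Dense` kernels, stored as `(in_features, out_features)`.

```python
import numpy as np


def torch_linear_to_flax_kernel(weight):
    """Hypothetical helper: transpose a PyTorch-layout linear weight
    (out_features, in_features) into a Flax Dense kernel
    (in_features, out_features)."""
    return np.asarray(weight).T


def first_mismatch(ref_acts, new_acts, atol=1e-4):
    """Hypothetical helper: return the name of the first layer whose
    activations differ between the reference and the converted model,
    or None if every layer matches within `atol`.

    Both arguments are dicts mapping layer name -> array, with the
    reference dict ordered from the first layer to the last.
    """
    for name, ref in ref_acts.items():
        new = new_acts.get(name)
        if new is None or new.shape != ref.shape:
            return name
        if not np.allclose(ref, new, atol=atol):
            return name
    return None


# Toy check with random data (assumption: stands in for real weights).
rng = np.random.default_rng(0)
w_torch = rng.standard_normal((8, 4))   # (out_features, in_features)
x = rng.standard_normal((2, 4))         # (batch, in_features)

y_torch = x @ w_torch.T                               # what nn.Linear computes (no bias)
y_flax = x @ torch_linear_to_flax_kernel(w_torch)     # what nn.Dense computes (no bias)

assert np.allclose(y_torch, y_flax)
```

Collecting the per-layer activations from both implementations on the same input and passing them to `first_mismatch` shows which layer first disagrees, which is usually easier to debug than comparing only the final logits.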
Flax MosaicmlMpt
