Update TF weights

#2
by joaogante HF staff - opened

Model converted by the transformers' pt_to_tf CLI.

All converted model outputs and hidden layers were validated against their PyTorch counterparts. Maximum crossload output difference=1.465e-03; Maximum converted output difference=1.465e-03.
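For context on the figures above: a "maximum output difference" is generally read as the largest element-wise absolute difference between the reference and converted outputs. The snippet below is a minimal sketch of that idea in plain Python, not the CLI's actual implementation; the function names (`max_abs_diff`, `max_diff_over_outputs`) and the toy values are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence


def max_abs_diff(reference: Sequence[float], converted: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flat sequences."""
    if len(reference) != len(converted):
        raise ValueError("outputs must have the same number of elements")
    return max((abs(r - c) for r, c in zip(reference, converted)), default=0.0)


def max_diff_over_outputs(
    pt_outputs: Dict[str, Sequence[float]],
    tf_outputs: Dict[str, Sequence[float]],
) -> float:
    """Worst-case difference across all named outputs (e.g. logits, hidden states)."""
    if pt_outputs.keys() != tf_outputs.keys():
        raise ValueError("both models must expose the same output names")
    return max(
        (max_abs_diff(pt_outputs[name], tf_outputs[name]) for name in pt_outputs),
        default=0.0,
    )


# Toy, made-up values standing in for flattened model outputs.
pt = {"logits": [0.10, -1.25, 3.000], "hidden_state": [0.5, 0.5]}
tf = {"logits": [0.10, -1.25, 3.001], "hidden_state": [0.5, 0.4995]}

worst = max_diff_over_outputs(pt, tf)
print(f"Maximum output difference={worst:.3e}")
```

Reporting a single worst-case number across the head output and every hidden layer makes it easy to see at a glance whether any part of the converted model drifts from the reference.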

@patrickvonplaten the weights I uploaded previously were built with an MVP of the pt_to_tf CLI, which did not convert (or check) the model head. These weights have the model head converted properly.

Merging this PR unblocks the following GH PR. Once passing tests confirm that these weights unblock it, we can push the conversions for the other XGLM model sizes.

cc @Stancld

patrickvonplaten changed pull request status to merged
