Update TF weights
#2 opened by joaogante (HF staff)
Model converted by the transformers' pt_to_tf CLI.
All converted model outputs and hidden layers were validated against their PyTorch counterparts. Maximum crossload output difference=1.465e-03; Maximum converted output difference=1.465e-03.
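For readers unfamiliar with this kind of validation, the idea can be sketched as below. This is a minimal illustration, not the CLI's actual implementation: the helper names `flatten` and `max_abs_diff`, the `TOLERANCE` value, and the toy logits are all assumptions made up for the example. It compares two sets of outputs element by element and reports the largest absolute difference.

```python
def flatten(values):
    """Yield every scalar in an arbitrarily nested list or tuple."""
    for value in values:
        if isinstance(value, (list, tuple)):
            yield from flatten(value)
        else:
            yield float(value)


def max_abs_diff(reference, converted):
    """Return the largest element-wise absolute difference between two outputs."""
    ref = list(flatten(reference))
    conv = list(flatten(converted))
    if len(ref) != len(conv):
        raise ValueError("outputs have different numbers of elements")
    return max(abs(r - c) for r, c in zip(ref, conv))


# Toy stand-ins for the PyTorch and TensorFlow outputs (hypothetical values).
pt_logits = [[0.10, -1.25], [2.000, 0.50]]
tf_logits = [[0.10, -1.25], [2.001, 0.50]]

# Hypothetical tolerance chosen for this example only.
TOLERANCE = 2e-3

diff = max_abs_diff(pt_logits, tf_logits)
print(f"max abs diff = {diff:.3e}, within tolerance: {diff <= TOLERANCE}")
```

In a real check, the same comparison would be applied to each output and each hidden layer of the two models, and the largest value across all of them would be the reported maximum difference.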
@patrickvonplaten the weights I uploaded previously were built with an MVP of the pt-to-tf CLI, which did not convert (or check) the model head. These weights have the model head converted properly.
Merging this PR unblocks the following GH PR. Once passing tests confirm that these weights unblock that PR, we can push the conversions for the other XGLM model sizes.
cc @Stancld
patrickvonplaten changed pull request status to merged
Thanks @joaogante