Add TF weights

#1
by amyeroberts HF staff - opened

Model converted with the transformers pt_to_tf CLI. All converted model outputs and hidden layers were validated against their PyTorch counterparts.

Maximum crossload output difference=3.171e-05; Maximum crossload hidden layer difference=5.812e-03;
Maximum conversion output difference=3.171e-05; Maximum conversion hidden layer difference=5.812e-03;

List of maximum output differences above the threshold (1e-19):
logits: 2.193e-05
cls_logits: 3.171e-05
distillation_logits: 2.527e-05

List of maximum hidden layer differences above the threshold (1e-19):
hidden_states[0]: 2.384e-05
hidden_states[1]: 6.151e-05
hidden_states[2]: 5.090e-05
hidden_states[3]: 6.008e-05
hidden_states[4]: 7.832e-05
hidden_states[5]: 2.384e-04
hidden_states[6]: 7.458e-04
hidden_states[7]: 1.521e-03
hidden_states[8]: 2.577e-03
hidden_states[9]: 3.968e-03
hidden_states[10]: 4.996e-03
hidden_states[11]: 5.812e-03
hidden_states[12]: 5.056e-03
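
For readers who want to reproduce this kind of check by hand, here is a minimal sketch of how a per-tensor maximum difference report could be computed. It assumes each reported number is the maximum absolute element-wise difference between the PyTorch and TensorFlow tensors, already converted to NumPy arrays. The helper names `max_abs_diff` and `differences_above_threshold` are hypothetical and are not part of the pt_to_tf CLI.

```python
import numpy as np


def max_abs_diff(reference, converted):
    """Largest absolute element-wise difference between two arrays."""
    reference = np.asarray(reference, dtype=np.float64)
    converted = np.asarray(converted, dtype=np.float64)
    if reference.shape != converted.shape:
        raise ValueError(
            f"shape mismatch: {reference.shape} vs {converted.shape}"
        )
    return float(np.max(np.abs(reference - converted)))


def differences_above_threshold(reference_outputs, converted_outputs, threshold):
    """Map each output name to its max difference, keeping only those above threshold.

    Both arguments are dicts of name -> array (hypothetical layout, e.g.
    {"logits": ..., "hidden_states[0]": ...}).
    """
    report = {}
    for name, reference in reference_outputs.items():
        diff = max_abs_diff(reference, converted_outputs[name])
        if diff > threshold:
            report[name] = diff
    return report
```

With a very small threshold such as the 1e-19 used above, effectively every tensor with a nonzero difference is listed, which is why all outputs and hidden states appear in the report.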

amyeroberts changed pull request status to merged
