Add TF weights

by Rocketknight1 HF staff - opened May 10, 2023

base: refs/heads/main ← from: refs/pr/2


Rocketknight1

May 10, 2023

Model converted by the transformers pt_to_tf CLI. All converted model outputs and hidden layers were validated against their PyTorch counterparts.

Maximum crossload output difference=1.993e+00; Maximum crossload hidden layer difference=1.552e-04;
Maximum conversion output difference=1.991e+00; Maximum conversion hidden layer difference=1.552e-04;

CAUTION: The maximum admissible error was manually increased to 2.0!

Add TF weights (commit 1aacbd11)

Rocketknight1

May 10, 2023

Quick note on this PR: the large output difference arises because the original checkpoint has no pooler weights, so the pooler is randomly initialized independently in PT and TF. Excluding the pooler, the difference between model outputs is ~1e-4, which is well within acceptable limits.
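To illustrate the point about the pooler, here is a minimal sketch of how a per-output comparison can separate a randomly initialized head from the rest of the model. It is not the pt_to_tf implementation. The helper names (`flatten`, `max_abs_diff`, `compare_outputs`) are hypothetical, the output names `last_hidden_state` and `pooler_output` are assumed for illustration, and the numbers are made up rather than taken from this checkpoint.

```python
def flatten(values):
    """Yield every number in an arbitrarily nested list as a float."""
    for v in values:
        if isinstance(v, (list, tuple)):
            yield from flatten(v)
        else:
            yield float(v)


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two nested lists."""
    flat_a, flat_b = list(flatten(a)), list(flatten(b))
    if len(flat_a) != len(flat_b):
        raise ValueError("outputs have different sizes")
    return max(abs(x - y) for x, y in zip(flat_a, flat_b))


def compare_outputs(pt_outputs, tf_outputs, skip=("pooler_output",)):
    """Return per-output max differences, the overall max, and the max ignoring `skip`."""
    per_output = {
        name: max_abs_diff(pt_outputs[name], tf_outputs[name])
        for name in pt_outputs
    }
    overall = max(per_output.values())
    kept = [diff for name, diff in per_output.items() if name not in skip]
    return per_output, overall, max(kept)


# Toy values, made up for illustration (not real model outputs).
# The pooler values differ a lot because each framework drew its own random weights.
pt = {
    "last_hidden_state": [[0.10, -0.20], [0.30, 0.40]],
    "pooler_output": [0.9, -0.7],
}
tf = {
    "last_hidden_state": [[0.1001, -0.20], [0.30, 0.4001]],
    "pooler_output": [-0.8, 0.6],
}

per_output, overall, without_pooler = compare_outputs(pt, tf)
print(f"overall max difference:       {overall:.3e}")
print(f"max difference without pooler: {without_pooler:.3e}")
```

In this toy case the overall maximum is dominated by the pooler, while the remaining output differs only at the 1e-4 level, which mirrors the situation described above.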

Rocketknight1 changed pull request status to merged May 10, 2023
