Model converted by the pt_to_tf CLI. All converted model outputs and hidden layers were validated against their PyTorch counterparts.
Maximum crossload output difference=1.386e-04; Maximum crossload hidden layer difference=3.052e-05;
Maximum conversion output difference=1.386e-04; Maximum conversion hidden layer difference=3.052e-05;
CAUTION: The maximum admissible error was manually increased to 0.0002!
@joaogante @lysandre relevant discussion as to why the threshold had to be adjusted: https://github.com/huggingface/transformers/pull/18555#issuecomment-1229703811
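
For readers unfamiliar with this kind of check, here is a minimal sketch of how a maximum absolute difference can be compared against an admissible error. It is not the pt_to_tf implementation: the names `max_abs_diff`, `within_tolerance`, and `MAX_ERROR` are hypothetical, and the only values taken from this PR are the four reported differences and the 0.0002 threshold.

```python
# Illustrative only: hypothetical helpers, not the pt_to_tf CLI code.

MAX_ERROR = 2e-4  # admissible error stated in this PR (manually increased)


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two flat sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def within_tolerance(diff, max_error=MAX_ERROR):
    """True if a measured difference does not exceed the admissible error."""
    return diff <= max_error


# Differences reported in this PR description.
reported = {
    "crossload output": 1.386e-04,
    "crossload hidden layer": 3.052e-05,
    "conversion output": 1.386e-04,
    "conversion hidden layer": 3.052e-05,
}

# Every reported difference is below the raised threshold.
all_ok = all(within_tolerance(v) for v in reported.values())
```

With the values above, the two output differences (1.386e-04) are the ones closest to the limit, while the hidden layer differences are well under it.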