
Cross-Encoder for MS Marco with TinyBert

This model is a fine-tuned version of the cross-encoder/ms-marco-TinyBert-L-2 checkpoint.

It was fine-tuned on HTML tags paired with labels generated using Fathom.

How to use this model with transformers

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Mozilla/tinybert-uncased-autofill"
)

print(
    classifier('<input class="cc-number" placeholder="Enter credit card number..." />')
)
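A text-classification pipeline returns a list of dictionaries, each with a "label" and a "score" key. The sketch below shows one way to turn that output into a single field type. The helper name pick_label, the 0.5 threshold, and the "Other" fallback are illustrative assumptions rather than part of the model or its training code. The sample predictions are hand-written stand-ins, not real model output, and the label strings are assumed to match the class names in the performance report.

```python
from typing import Dict, List


def pick_label(
    predictions: List[Dict],
    threshold: float = 0.5,   # hypothetical cut-off, tune for your use case
    fallback: str = "Other",  # assumed name of the catch-all class
) -> str:
    """Return the top-scoring label, or the fallback when confidence is low."""
    if not predictions:
        return fallback
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= threshold else fallback


# Hand-written stand-ins shaped like pipeline output (scores are made up).
confident = [{"label": "CC Number", "score": 0.98}]
uncertain = [{"label": "Phone", "score": 0.31}]

print(pick_label(confident))
print(pick_label(uncertain))
```

With the real pipeline, the list returned by classifier(...) can be passed to pick_label directly.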

Model Training Info

Hyperparameters: {
    'learning_rate': 0.000082,
    'num_train_epochs': 71,
    'weight_decay': 0.1,
    'per_device_train_batch_size': 32,
}

More information on how the model was trained can be found here: https://github.com/mozilla/smart_autofill

Model Performance

Test Performance:
Precision: 0.9653
Recall: 0.9648
F1: 0.9644

                     precision    recall  f1-score   support

      CC Expiration      1.000     0.625     0.769        16
CC Expiration Month      0.919     0.944     0.932        36
 CC Expiration Year      0.897     0.946     0.921        37
            CC Name      0.938     0.968     0.952        31
          CC Number      0.926     1.000     0.962        50
    CC Payment Type      0.903     0.867     0.884        75
   CC Security Code      0.975     0.951     0.963        41
            CC Type      0.917     0.786     0.846        14
   Confirm Password      0.911     0.895     0.903        57
              Email      0.933     0.959     0.946        73
         First Name      0.833     1.000     0.909         5
               Form      0.974     0.974     0.974        39
          Last Name      0.667     0.800     0.727         5
       New Password      0.929     0.938     0.933        97
              Other      0.985     0.985     0.985      1235
              Phone      1.000     0.667     0.800         3
           Zip Code      0.909     0.938     0.923        32

           accuracy                          0.965      1846
          macro avg      0.919     0.897     0.902      1846
       weighted avg      0.965     0.965     0.964      1846
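The gap between the macro and weighted averages comes from class imbalance: "Other" accounts for 1235 of the 1846 test samples, so it dominates any support-weighted figure. The headline precision, recall, and F1 values appear to correspond to the weighted avg row. As a minimal sketch, the snippet below recomputes both averages from the per-class rows copied from the report. Because the inputs are already rounded to three decimals, the weighted results can differ from the reported ones in the last digit.

```python
# (label, precision, recall, f1, support), copied from the report above.
rows = [
    ("CC Expiration",       1.000, 0.625, 0.769, 16),
    ("CC Expiration Month", 0.919, 0.944, 0.932, 36),
    ("CC Expiration Year",  0.897, 0.946, 0.921, 37),
    ("CC Name",             0.938, 0.968, 0.952, 31),
    ("CC Number",           0.926, 1.000, 0.962, 50),
    ("CC Payment Type",     0.903, 0.867, 0.884, 75),
    ("CC Security Code",    0.975, 0.951, 0.963, 41),
    ("CC Type",             0.917, 0.786, 0.846, 14),
    ("Confirm Password",    0.911, 0.895, 0.903, 57),
    ("Email",               0.933, 0.959, 0.946, 73),
    ("First Name",          0.833, 1.000, 0.909, 5),
    ("Form",                0.974, 0.974, 0.974, 39),
    ("Last Name",           0.667, 0.800, 0.727, 5),
    ("New Password",        0.929, 0.938, 0.933, 97),
    ("Other",               0.985, 0.985, 0.985, 1235),
    ("Phone",               1.000, 0.667, 0.800, 3),
    ("Zip Code",            0.909, 0.938, 0.923, 32),
]

PRECISION, RECALL, F1, SUPPORT = 1, 2, 3, 4
total_support = sum(r[SUPPORT] for r in rows)


def macro_avg(col: int) -> float:
    """Unweighted mean over classes: every class counts equally."""
    return sum(r[col] for r in rows) / len(rows)


def weighted_avg(col: int) -> float:
    """Mean over classes weighted by support: large classes dominate."""
    return sum(r[col] * r[SUPPORT] for r in rows) / total_support


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print("total support:", total_support)
for name, col in (("precision", PRECISION), ("recall", RECALL), ("f1", F1)):
    print(f"{name}: macro={macro_avg(col):.3f} weighted={weighted_avg(col):.3f}")
```

The same f1_score helper reproduces individual rows, for example the CC Expiration entry from its precision of 1.000 and recall of 0.625.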

Model size: 4.39M params (Safetensors, tensor type F32)