metadata

license: mit
tags:
  - generated_from_trainer
datasets:
  - banking77
metrics:
  - accuracy
widget:
  - text: 'Can I track the card you sent to me? '
    example_title: Card Arrival Example
  - text: Can you explain your exchange rate policy to me?
    example_title: Exchange Rate Example
  - text: I can't pay by my credit card
    example_title: Card Not Working Example
base_model: distilbert-base-uncased
model-index:
  - name: distilbert-base-uncased-banking77-classification
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: banking77
          type: banking77
          args: default
        metrics:
          - type: accuracy
            value: 0.924025974025974
            name: Accuracy
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: banking77
          type: banking77
          config: default
          split: test
        metrics:
          - type: accuracy
            value: 0.924025974025974
            name: Accuracy
            verified: true
          - type: precision
            value: 0.9278003086307286
            name: Precision Macro
            verified: true
          - type: precision
            value: 0.924025974025974
            name: Precision Micro
            verified: true
          - type: precision
            value: 0.9278003086307287
            name: Precision Weighted
            verified: true
          - type: recall
            value: 0.9240259740259743
            name: Recall Macro
            verified: true
          - type: recall
            value: 0.924025974025974
            name: Recall Micro
            verified: true
          - type: recall
            value: 0.924025974025974
            name: Recall Weighted
            verified: true
          - type: f1
            value: 0.9243068139192414
            name: F1 Macro
            verified: true
          - type: f1
            value: 0.924025974025974
            name: F1 Micro
            verified: true
          - type: f1
            value: 0.9243068139192416
            name: F1 Weighted
            verified: true
          - type: loss
            value: 0.31516405940055847
            name: loss
            verified: true

distilbert-base-uncased-banking77-classification

This model is a fine-tuned version of distilbert-base-uncased on the banking77 dataset. It achieves the following results on the evaluation set:

Loss: 0.3152
Accuracy: 0.9240
F1 Score: 0.9243
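
The metadata above reports micro-averaged precision, recall and F1 that all equal the accuracy (0.9240). This is expected for single-label, multi-class classification, where every wrong prediction counts as one false positive for the predicted class and one false negative for the true class. A minimal sketch on toy labels (the label ids below are illustrative, not model output, and `micro_scores` is a hypothetical helper):

```python
from collections import Counter


def micro_scores(y_true, y_pred):
    """Return (accuracy, micro precision, micro recall, micro F1)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    precision = total_tp / (total_tp + total_fp)
    recall = total_tp / (total_tp + total_fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = total_tp / len(y_true)
    return accuracy, precision, recall, f1


# Toy example: 6 queries, 4 classified correctly
y_true = [11, 11, 32, 14, 14, 25]
y_pred = [11, 12, 32, 14, 25, 25]
accuracy, precision, recall, f1 = micro_scores(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

Macro and weighted averages are computed per class first, so they can differ from accuracy, as they do slightly in the metadata.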

Model description

This is my first fine-tuning experiment using Hugging Face. Using DistilBERT as the pretrained base model, I trained a classifier for online banking queries. It could be useful for routing customer-support tickets.

Intended uses & limitations

The model is intended for text classification. It is fine-tuned specifically on banking-domain customer queries, so its predictions are limited to the 77 intents listed below.

Training and evaluation data

The dataset used is banking77.

The 77 labels are:

label	intent
0	activate_my_card
1	age_limit
2	apple_pay_or_google_pay
3	atm_support
4	automatic_top_up
5	balance_not_updated_after_bank_transfer
6	balance_not_updated_after_cheque_or_cash_deposit
7	beneficiary_not_allowed
8	cancel_transfer
9	card_about_to_expire
10	card_acceptance
11	card_arrival
12	card_delivery_estimate
13	card_linking
14	card_not_working
15	card_payment_fee_charged
16	card_payment_not_recognised
17	card_payment_wrong_exchange_rate
18	card_swallowed
19	cash_withdrawal_charge
20	cash_withdrawal_not_recognised
21	change_pin
22	compromised_card
23	contactless_not_working
24	country_support
25	declined_card_payment
26	declined_cash_withdrawal
27	declined_transfer
28	direct_debit_payment_not_recognised
29	disposable_card_limits
30	edit_personal_details
31	exchange_charge
32	exchange_rate
33	exchange_via_app
34	extra_charge_on_statement
35	failed_transfer
36	fiat_currency_support
37	get_disposable_virtual_card
38	get_physical_card
39	getting_spare_card
40	getting_virtual_card
41	lost_or_stolen_card
42	lost_or_stolen_phone
43	order_physical_card
44	passcode_forgotten
45	pending_card_payment
46	pending_cash_withdrawal
47	pending_top_up
48	pending_transfer
49	pin_blocked
50	receiving_money
51	Refund_not_showing_up
52	request_refund
53	reverted_card_payment?
54	supported_cards_and_currencies
55	terminate_account
56	top_up_by_bank_transfer_charge
57	top_up_by_card_charge
58	top_up_by_cash_or_cheque
59	top_up_failed
60	top_up_limits
61	top_up_reverted
62	topping_up_by_card
63	transaction_charged_twice
64	transfer_fee_charged
65	transfer_into_account
66	transfer_not_received_by_recipient
67	transfer_timing
68	unable_to_verify_identity
69	verify_my_identity
70	verify_source_of_funds
71	verify_top_up
72	virtual_card_not_working
73	visa_or_mastercard
74	why_verify_identity
75	wrong_amount_of_cash_received
76	wrong_exchange_rate_for_cash_withdrawal
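
The table above can be used to turn a numeric class id into an intent name. The sketch below assumes the classifier may return generic names such as `LABEL_14`; this is an assumption, since the checkpoint may already return intent names, in which case the helper passes them through unchanged. `ID2INTENT` and `intent_from_label` are hypothetical names, and only three rows of the table are copied here.

```python
# Subset of the id -> intent table above; extend with the remaining rows as needed.
ID2INTENT = {
    11: "card_arrival",
    14: "card_not_working",
    32: "exchange_rate",
}


def intent_from_label(label: str) -> str:
    """Map a generic 'LABEL_<id>' string to its intent name.

    Labels that are not of that form, or whose id is not in the
    mapping, are returned unchanged.
    """
    prefix = "LABEL_"
    suffix = label[len(prefix):]
    if label.startswith(prefix) and suffix.isdigit():
        return ID2INTENT.get(int(suffix), label)
    return label


print(intent_from_label("LABEL_14"))
print(intent_from_label("card_arrival"))
```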

Training procedure

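from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub
pipe = pipeline("text-classification", model="nickprock/distilbert-base-uncased-banking77-classification")
# Returns a list of dicts, each with a 'label' and a 'score'
pipe("I can't pay by my credit card")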
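
For ticket handling, one possible way to consume the pipeline output is to accept a prediction only when its score clears a threshold and to send everything else to a person. This is a sketch, not part of the original card: `route`, the `0.8` threshold and the `needs_human_review` fallback are hypothetical, and the scores below are placeholders in the pipeline's output shape rather than real model results.

```python
def route(predictions, threshold=0.8):
    """Return the top label if its score clears the threshold.

    `predictions` is a list of {'label': str, 'score': float} dicts,
    the shape returned by a text-classification pipeline.
    """
    best = max(predictions, key=lambda p: p["score"])
    if best["score"] >= threshold:
        return best["label"]
    return "needs_human_review"  # hypothetical fallback queue


# Placeholder outputs; the scores are made up for illustration
confident = [{"label": "card_not_working", "score": 0.97}]
uncertain = [{"label": "declined_card_payment", "score": 0.41}]

print(route(confident))
print(route(uncertain))
```

The threshold would need to be tuned on held-out data, since softmax scores are not guaranteed to be well calibrated.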

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
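
The step counts in the results table (157 per epoch, 3140 in total) follow from the batch size and the number of epochs. The sketch below reproduces them and shows how a linear schedule would decay the learning rate. It assumes a training split of 10,003 examples, a single device and zero warmup steps; none of these is stated in this card.

```python
import math

TRAIN_EXAMPLES = 10_003  # assumption: size of the banking77 train split
BATCH_SIZE = 64          # train_batch_size
EPOCHS = 20              # num_epochs
BASE_LR = 2e-5           # learning_rate

# The last, partially filled batch still counts as a step
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS


def linear_lr(step, warmup_steps=0):
    """Learning rate at `step` under linear warmup then linear decay to 0."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - warmup_steps)


print(steps_per_epoch, total_steps)
print(linear_lr(0), linear_lr(total_steps // 2), linear_lr(total_steps))
```

Under these assumptions the learning rate is halved by the end of epoch 10 and reaches zero at step 3140.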

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1 Score
3.8732	1.0	157	3.1476	0.5370	0.4881
2.5598	2.0	314	1.9780	0.6916	0.6585
1.5863	3.0	471	1.2239	0.8042	0.7864
0.9829	4.0	628	0.8067	0.8565	0.8487
0.6274	5.0	785	0.5837	0.8799	0.8752
0.4304	6.0	942	0.4630	0.9042	0.9040
0.3106	7.0	1099	0.3982	0.9088	0.9087
0.2238	8.0	1256	0.3587	0.9110	0.9113
0.1708	9.0	1413	0.3351	0.9208	0.9208
0.1256	10.0	1570	0.3242	0.9179	0.9182
0.0981	11.0	1727	0.3136	0.9211	0.9214
0.0745	12.0	1884	0.3151	0.9211	0.9213
0.0601	13.0	2041	0.3089	0.9218	0.9220
0.0482	14.0	2198	0.3158	0.9214	0.9216
0.0402	15.0	2355	0.3126	0.9224	0.9226
0.0344	16.0	2512	0.3143	0.9231	0.9233
0.0298	17.0	2669	0.3156	0.9231	0.9233
0.0272	18.0	2826	0.3134	0.9244	0.9247
0.0237	19.0	2983	0.3156	0.9244	0.9246
0.0229	20.0	3140	0.3152	0.9240	0.9243
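
In the table above, validation loss stops improving around epoch 13 while training loss keeps falling, and accuracy gains after that point are small. A short sketch that finds the best epochs from the logged values (the numbers are copied from the table; the variable names are illustrative):

```python
# (epoch, validation_loss, accuracy) copied from the results table
rows = [
    (1, 3.1476, 0.5370), (2, 1.9780, 0.6916), (3, 1.2239, 0.8042),
    (4, 0.8067, 0.8565), (5, 0.5837, 0.8799), (6, 0.4630, 0.9042),
    (7, 0.3982, 0.9088), (8, 0.3587, 0.9110), (9, 0.3351, 0.9208),
    (10, 0.3242, 0.9179), (11, 0.3136, 0.9211), (12, 0.3151, 0.9211),
    (13, 0.3089, 0.9218), (14, 0.3158, 0.9214), (15, 0.3126, 0.9224),
    (16, 0.3143, 0.9231), (17, 0.3156, 0.9231), (18, 0.3134, 0.9244),
    (19, 0.3156, 0.9244), (20, 0.3152, 0.9240),
]

# Epoch with the lowest validation loss
best_loss_epoch, best_loss, _ = min(rows, key=lambda r: r[1])

# Epoch(s) with the highest accuracy
best_acc = max(r[2] for r in rows)
best_acc_epochs = [r[0] for r in rows if r[2] == best_acc]

print(best_loss_epoch, best_loss)   # 13 0.3089
print(best_acc_epochs, best_acc)    # [18, 19] 0.9244
```

The reported final metrics come from epoch 20, which is close to but not exactly the best checkpoint by either measure.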

Framework versions

Transformers 4.20.1
PyTorch 1.12.0+cu113
Datasets 2.3.2
Tokenizers 0.12.1