---
language:
- en
widget:
- text: uber for today
- text: airtime and data
- text: breakfast meeting with client
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- finance
- text-classification
- business
---
### Model Description
<p>This model is a fine-tuned version of the <a href="https://huggingface.co/distilbert/distilbert-base-uncased">distilbert-base-uncased</a> model on Hugging Face. It is trained to classify payment notes from business owners into one of the following categories:</p>
<ol>
  <li>INVENTORY, SUPPLIES AND EQUIPMENT</li>
  <li>PROFESSIONAL SERVICES</li>
  <li>TRANSPORTATION AND TRAVEL</li>
  <li>UTILITIES</li>
  <li>EMPLOYEE BENEFITS AND COMPENSATION</li>
  <li>MEALS AND ENTERTAINMENT</li>
  <li>TAX PAYMENTS</li>
  <li>LEGAL AND COMPLIANCE FEES</li>
  <li>BUSINESS DEVELOPMENT AND INVESTMENT</li>
</ol>
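
Once the model is on the Hugging Face Hub, it can be queried with the standard `transformers` text-classification pipeline. A minimal sketch follows; the Hub repo id shown is a hypothetical placeholder inferred from the linked GitHub repository name, so substitute the actual model id:

```python
# Hypothetical Hub repo id for illustration -- replace with the real model id.
MODEL_ID = "samanthaKarungi/iotec-pay-model-bert"

# The nine payment-note categories listed above, in the order given.
LABELS = [
    "INVENTORY, SUPPLIES AND EQUIPMENT",
    "PROFESSIONAL SERVICES",
    "TRANSPORTATION AND TRAVEL",
    "UTILITIES",
    "EMPLOYEE BENEFITS AND COMPENSATION",
    "MEALS AND ENTERTAINMENT",
    "TAX PAYMENTS",
    "LEGAL AND COMPLIANCE FEES",
    "BUSINESS DEVELOPMENT AND INVESTMENT",
]

if __name__ == "__main__":
    from transformers import pipeline  # requires `pip install transformers`

    # Build a text-classification pipeline backed by the fine-tuned model.
    classifier = pipeline("text-classification", model=MODEL_ID)

    # Each result is a dict with the predicted label and a confidence score.
    print(classifier("breakfast meeting with client"))
```

The pipeline downloads the model weights on first use; for batch classification, pass a list of strings instead of a single note.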

### Base Model Description
<p>DistilBERT is a transformer model that is smaller and faster than BERT. It was pretrained on the same corpus in a self-supervised fashion, using the BERT base model as a teacher: inputs and labels were generated automatically from the raw texts, with no human labelling (which is why it can make use of large amounts of publicly available data).</p>

### Training results
<table>
  <tr>
    <th>Epoch</th>
    <th>Training Loss</th>
    <th>Validation Loss</th>
    <th>Accuracy</th>
  </tr>
  <tr>
    <td>0</td>
    <td>No log</td>
    <td>0.263793</td>
    <td>0.916230</td>
  </tr>
  <tr>
    <td>1</td>
    <td>No log</td>
    <td>0.185122</td>
    <td>0.937173</td>
  </tr>
  <tr>
    <td>2</td>
    <td>0.318300</td>
    <td>0.191695</td>
    <td>0.937173</td>
  </tr>
</table>

### Training code
<p>The training and evaluation code is available in this <a href="https://github.com/samanthaKarungi/iotec-pay-model-bert/tree/main/model/training_and_evaluation">GitHub repo</a>.</p>

### Framework versions
<ul>
  <li>Transformers 4.37.2</li>
  <li>PyTorch 2.2.0</li>
  <li>Datasets 2.17.1</li>
  <li>Tokenizers 0.15.2</li>
</ul>