rafi138 committed on
Commit
421d7c5
1 Parent(s): 950de1e

Add SetFit model

Files changed (5)
  1. README.md +76 -102
  2. config.json +2 -2
  3. config_setfit.json +2 -2
  4. model.safetensors +1 -1
  5. model_head.pkl +2 -2
README.md CHANGED
@@ -8,11 +8,11 @@ tags:
8
  metrics:
9
  - accuracy
10
  widget:
11
- - text: Nur Digital Studio
12
- - text: Sultanas Makeover And Training Center
13
- - text: Kajol Lota Restaurant
14
- - text: Loveria Cafe & Restaurant
15
- - text: Robiul And Brothers Departmental Store
16
  pipeline_tag: text-classification
17
  inference: true
18
  base_model: sentence-transformers/paraphrase-mpnet-base-v2
@@ -28,7 +28,7 @@ model-index:
28
  split: test
29
  metrics:
30
  - type: accuracy
31
- value: 0.48
32
  name: Accuracy
33
  ---
34
 
@@ -48,7 +48,7 @@ The model has been trained using an efficient few-shot learning technique that i
48
  - **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
49
  - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
50
  - **Maximum Sequence Length:** 512 tokens
51
- - **Number of Classes:** 17 classes
52
  <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
53
  <!-- - **Language:** Unknown -->
54
  <!-- - **License:** Unknown -->
@@ -60,32 +60,43 @@ The model has been trained using an efficient few-shot learning technique that i
60
  - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
61
 
62
  ### Model Labels
63
- | Label | Examples |
64
- |:----------------|:----------------------------------------------------------------------------------------------------------------------------------------------|
65
- | Bank | <ul><li>'Ific Bank Limited Sadar'</li><li>'Uttara Bank Limited Patuakhali Sadar'</li><li>'Eastern Bank Limited Uttara Branch (EBL)'</li></ul> |
66
- | Office | <ul><li>'Technometrics Limited - Banani Office'</li><li>'Land Survieur Vendor Office'</li><li>'Saint Maritn Travels'</li></ul> |
67
- | Religious Place | <ul><li>'Summa Ajmeri Khaja Baba Khanka Sharif'</li><li>'Baytul Mukaddas Jame Masjid'</li><li>'Paharpur Masjid'</li></ul> |
68
- | Education | <ul><li>'Shajalal Model Madrasa'</li><li>'Physics Private Care'</li><li>'Batikadanga Primar School'</li></ul> |
69
- | Recreation | <ul><li>'Surjo Dighol Resort'</li><li>'Bangladesh National Monument (Sriti Soudho)'</li><li>'Eco Park Jamun'</li></ul> |
70
- | Healthcare | <ul><li>'Nagar Shasthyo Bhaban'</li><li>'Laser Smile Dental Clinic'</li><li>'Haque Eye Care Centre'</li></ul> |
71
- | Agricultural | <ul><li>'Fram'</li><li>'Fruit Garden'</li></ul> |
72
- | Food | <ul><li>'Longhorn Steak & Pizza'</li><li>'Ghati Cha'</li><li>'Banaful And Con'</li></ul> |
73
- | Construction | <ul><li>'Shahjalal Sanitary'</li><li>'Modern Hardware And Paint'</li><li>'KLH Hardware'</li></ul> |
74
- | Industry | <ul><li>'Mka Enterprise'</li><li>'Firoz Indoor Fish Firm'</li><li>'Abdullah Industrial Park'</li></ul> |
75
- | Government | <ul><li>'Upazila Ansar And VDP Karjalay'</li><li>'Bof Officers Mess'</li><li>'Saheber Bazar Post Office'</li></ul> |
76
- | Transportation | <ul><li>'Cantonment Railway Station Dhaka'</li><li>'Mosharrof Counter'</li><li>'GR Transport Agency'</li></ul> |
77
- | Shop | <ul><li>'Kajol Watch Service'</li><li>'Glamour Parlour'</li><li>'Ma Baba Workshop'</li></ul> |
78
- | Residential | <ul><li>'Tri Noyon Villa'</li><li>'Mohammad Ali Sawdagar Colony'</li><li>'Afia Cottage'</li></ul> |
79
- | Hotel | <ul><li>'Hotel Bondor Ga'</li><li>'Hotel Moon Moon Abashik'</li><li>'Warisan Residential Hotel'</li></ul> |
80
- | Landmark | <ul><li>'Rampura Bazar Moar'</li><li>'Mohipal Square'</li></ul> |
81
- | Commercial | <ul><li>'Mohammadpur Geneva Camp Kacha Bazar'</li><li>'Mohila College Bhaban'</li><li>'Singer Plus Mohammadpur'</li></ul> |
82
 
83
  ## Evaluation
84
 
85
  ### Metrics
86
  | Label | Accuracy |
87
  |:--------|:---------|
88
- | **all** | 0.48 |
89
 
90
  ## Uses
91
 
@@ -105,7 +116,7 @@ from setfit import SetFitModel
105
  # Download from the 🤗 Hub
106
  model = SetFitModel.from_pretrained("rafi138/setfit-paraphrase-mpnet-base-v2-type")
107
  # Run inference
108
- preds = model("Nur Digital Studio")
109
  ```
110
 
111
  <!--
@@ -137,14 +148,14 @@ preds = model("Nur Digital Studio")
137
  ### Training Set Metrics
138
  | Training set | Min | Median | Max |
139
  |:-------------|:----|:-------|:----|
140
- | Word count | 1 | 3.5254 | 7 |
141
 
142
  | Label | Training Sample Count |
143
  |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------|
144
  | ShopCommercialGovernmentHealthcareEducationFoodOfficeReligious PlaceBankTransportationConstructionIndustryResidentialLandmarkRecreationFuelHotelUtilityAgricultural | 0 |
145
 
146
  ### Training Hyperparameters
147
- - batch_size: (16, 16)
148
  - num_epochs: (4, 4)
149
  - max_steps: -1
150
  - sampling_strategy: oversampling
@@ -163,84 +174,47 @@ preds = model("Nur Digital Studio")
163
  ### Training Results
164
  | Epoch | Step | Training Loss | Validation Loss |
165
  |:-------:|:-------:|:-------------:|:---------------:|
166
- | 0.0012 | 1 | 0.2662 | - |
167
- | 0.0613 | 50 | 0.2335 | - |
168
- | 0.1227 | 100 | 0.1324 | - |
169
- | 0.1840 | 150 | 0.1617 | - |
170
- | 0.2454 | 200 | 0.0733 | - |
171
- | 0.3067 | 250 | 0.0743 | - |
172
- | 0.3681 | 300 | 0.0186 | - |
173
- | 0.4294 | 350 | 0.0103 | - |
174
- | 0.4908 | 400 | 0.0214 | - |
175
- | 0.5521 | 450 | 0.0042 | - |
176
- | 0.6135 | 500 | 0.0062 | - |
177
- | 0.6748 | 550 | 0.0027 | - |
178
- | 0.7362 | 600 | 0.0021 | - |
179
- | 0.7975 | 650 | 0.0014 | - |
180
- | 0.8589 | 700 | 0.0016 | - |
181
- | 0.9202 | 750 | 0.0059 | - |
182
- | 0.9816 | 800 | 0.0009 | - |
183
- | **1.0** | **815** | **-** | **0.2969** |
184
- | 1.0429 | 850 | 0.0008 | - |
185
- | 1.1043 | 900 | 0.0014 | - |
186
- | 1.1656 | 950 | 0.0008 | - |
187
- | 1.2270 | 1000 | 0.001 | - |
188
- | 1.2883 | 1050 | 0.001 | - |
189
- | 1.3497 | 1100 | 0.0017 | - |
190
- | 1.4110 | 1150 | 0.0007 | - |
191
- | 1.4724 | 1200 | 0.0006 | - |
192
- | 1.5337 | 1250 | 0.0008 | - |
193
- | 1.5951 | 1300 | 0.0006 | - |
194
- | 1.6564 | 1350 | 0.0005 | - |
195
- | 1.7178 | 1400 | 0.0005 | - |
196
- | 1.7791 | 1450 | 0.001 | - |
197
- | 1.8405 | 1500 | 0.0005 | - |
198
- | 1.9018 | 1550 | 0.0006 | - |
199
- | 1.9632 | 1600 | 0.0005 | - |
200
- | 2.0 | 1630 | - | 0.3073 |
201
- | 2.0245 | 1650 | 0.0007 | - |
202
- | 2.0859 | 1700 | 0.0016 | - |
203
- | 2.1472 | 1750 | 0.0006 | - |
204
- | 2.2086 | 1800 | 0.0008 | - |
205
- | 2.2699 | 1850 | 0.0006 | - |
206
- | 2.3313 | 1900 | 0.0005 | - |
207
- | 2.3926 | 1950 | 0.0009 | - |
208
- | 2.4540 | 2000 | 0.0008 | - |
209
- | 2.5153 | 2050 | 0.0004 | - |
210
- | 2.5767 | 2100 | 0.0005 | - |
211
- | 2.6380 | 2150 | 0.0005 | - |
212
- | 2.6994 | 2200 | 0.0009 | - |
213
- | 2.7607 | 2250 | 0.0006 | - |
214
- | 2.8221 | 2300 | 0.0008 | - |
215
- | 2.8834 | 2350 | 0.0004 | - |
216
- | 2.9448 | 2400 | 0.0004 | - |
217
- | 3.0 | 2445 | - | 0.3198 |
218
- | 3.0061 | 2450 | 0.0003 | - |
219
- | 3.0675 | 2500 | 0.0004 | - |
220
- | 3.1288 | 2550 | 0.0002 | - |
221
- | 3.1902 | 2600 | 0.0003 | - |
222
- | 3.2515 | 2650 | 0.0004 | - |
223
- | 3.3129 | 2700 | 0.0005 | - |
224
- | 3.3742 | 2750 | 0.0003 | - |
225
- | 3.4356 | 2800 | 0.0003 | - |
226
- | 3.4969 | 2850 | 0.0005 | - |
227
- | 3.5583 | 2900 | 0.0006 | - |
228
- | 3.6196 | 2950 | 0.0005 | - |
229
- | 3.6810 | 3000 | 0.0007 | - |
230
- | 3.7423 | 3050 | 0.0004 | - |
231
- | 3.8037 | 3100 | 0.0003 | - |
232
- | 3.8650 | 3150 | 0.0005 | - |
233
- | 3.9264 | 3200 | 0.0003 | - |
234
- | 3.9877 | 3250 | 0.0007 | - |
235
- | 4.0 | 3260 | - | 0.3176 |
236
 
237
  * The bold row denotes the saved checkpoint.
238
  ### Framework Versions
239
  - Python: 3.10.12
240
  - SetFit: 1.0.3
241
  - Sentence Transformers: 2.2.2
242
- - Transformers: 4.36.2
243
- - PyTorch: 2.1.2+cu121
244
  - Datasets: 2.16.1
245
  - Tokenizers: 0.15.0
246
 
 
8
  metrics:
9
  - accuracy
10
  widget:
11
+ - text: Dadon Hotel
12
+ - text: Joyi Homeo Hall
13
+ - text: Masum Egg Supplier
14
+ - text: Salam Automobiles
15
+ - text: Shoumik Enterprise
16
  pipeline_tag: text-classification
17
  inference: true
18
  base_model: sentence-transformers/paraphrase-mpnet-base-v2
 
28
  split: test
29
  metrics:
30
  - type: accuracy
31
+ value: 0.33
32
  name: Accuracy
33
  ---
34
 
 
48
  - **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
49
  - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
50
  - **Maximum Sequence Length:** 512 tokens
51
+ - **Number of Classes:** 28 classes
52
  <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
53
  <!-- - **Language:** Unknown -->
54
  <!-- - **License:** Unknown -->
 
60
  - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
61
 
62
  ### Model Labels
63
+ | Label | Examples |
64
+ |:-----------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------|
65
+ | Relegious | <ul><li>'Badc Jame Masjid'</li><li>'Modina Masjid'</li><li>'Baitul Ehsan Jame Masjid'</li></ul> |
66
+ | Food | <ul><li>'Bombay Biriyani Restaurant'</li><li>'Sanim Ghorowa Reatora'</li><li>'Attel Mati Restaurant'</li></ul> |
67
+ | Religious PLAce | <ul><li>'Darbar Sharif(Dorbeshe Badsha)'</li><li>'Mazar'</li></ul> |
68
+ | Education | <ul><li>'The English Academy'</li><li>'Economics Batch'</li><li>'Al Manar Model School'</li></ul> |
69
+ | Health Care | <ul><li>'Hope Haspital'</li><li>'North Para Community Clinic'</li><li>'Al Sami Medical Hall'</li></ul> |
70
+ | Office | <ul><li>'Nari Maitri Dholpur Branch'</li><li>'Techsam IT And Computer'</li><li>'Chandpur It'</li></ul> |
71
+ | Landmark | <ul><li>'Godaun Moar'</li><li>'Kuril Flyover U Turn Bridge'</li><li>'Manik Miya Avenue Moar'</li></ul> |
72
+ | Fuel | <ul><li>'Mimi Enterprise'</li><li>'Sariful Filling Station'</li><li>'M/s Aruja Enterprise'</li></ul> |
73
+ | Religious Place | <ul><li>'Kabbir Khan Jame Masjid'</li><li>'Sri Sri Nayanta Babar Mandir'</li><li>'Jordan Church of Christ'</li></ul> |
74
+ | Transportation | <ul><li>'Lala Khal Ferry Terminal'</li><li>'Porshuram Cng Stand'</li><li>'Riad Cycle Garage'</li></ul> |
75
+ | Agricultural | <ul><li>'Catlle Farm'</li><li>'Pushon Narsari'</li><li>'Vegetable garden'</li></ul> |
76
+ | Residential | <ul><li>'Ovinondon Chattrabas'</li><li>'TH Chattrabas'</li><li>'Seven Star Chattrabas'</li></ul> |
77
+ | shop | <ul><li>'Mayer Doya Store'</li></ul> |
78
+ | Bank | <ul><li>'Dutch Bangla Bank Limited Maijde (DBBL)'</li><li>'Jamuna Bank Limited Dholaikhal Branch'</li><li>'Prime Bank Limited Elephant Branch'</li></ul> |
79
+ | Utility | <ul><li>'Shahi Eidgah Water Tank'</li><li>'Pole No 31'</li><li>'Kalmilata Kacha Bazar'</li></ul> |
80
+ | Healthcare | <ul><li>'Oloukik'</li><li>'Burhanuddin Upazila Health Complex'</li><li>'Dr Nazmin Akter Najma'</li></ul> |
81
+ | Government | <ul><li>'Zilla Parishad Karjaloy Bhola'</li><li>"Sub Police Commissioner's Bhaban (Tejgaon Branch)"</li><li>'Family Planning Office Satkhira'</li></ul> |
82
+ | Recreation | <ul><li>'Shaikh Rasel Sriti Shongho'</li><li>'Beraid Camping And Kayaking Zone (BCKZ)'</li><li>'Shohag Palli Picnic Spot & Resort'</li></ul> |
83
+ | Religious | <ul><li>'Baitul Mamur Jame Masjid'</li><li>'Petrol Pump Jame Masjid'</li><li>'Opsonnin Pharma Ltd Jame Masjid'</li></ul> |
84
+ | Religious Place | <ul><li>'Jame Masjid'</li><li>'Hospital Masjid'</li><li>'Badar Mokam Jame Masjid'</li></ul> |
85
+ | Shop | <ul><li>'Nayeem General Store'</li><li>'Bazlu Engineering & Refrigeration'</li><li>'Mukta Dulal'</li></ul> |
86
+ | Commercial | <ul><li>'Mazar Kacha Bazar'</li><li>'Fall Bazar Kola Potti'</li><li>'Venus Autos'</li></ul> |
87
+ | Industry | <ul><li>'Rn Integrated Argo'</li><li>'Fresh Dairy Firm'</li><li>'Hemple Rhee Mfg Limited'</li></ul> |
88
+ | Hotel | <ul><li>'Warisan'</li><li>'Hotel New London Palace Abashik'</li><li>'Sada Vat'</li></ul> |
89
+ | construction | <ul><li>'Fahim Hardware Store'</li><li>'O A Frame Gallery'</li></ul> |
90
+ | Construction | <ul><li>'Khalil Steel'</li><li>'Sanaullah Tiles And Sanitary House'</li><li>'Mukta Glass And Thai Aluminum'</li></ul> |
91
+ | Relegious Place | <ul><li>'Baitul Atiq Jam-E Masjid'</li><li>'Hathazari Bus Stand Baitussalam Jame Masjid'</li><li>'Osman Bin Affan Jame Masjid'</li></ul> |
92
+ | education | <ul><li>'Masum Electronic'</li></ul> |
93
 
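The label table above lists 28 classes, but several look like case or spelling variants of one another (for example `Relegious`, `Religious PLAce` and `Religious Place`, or `shop` and `Shop`). Below is a minimal sketch of how such labels could be normalized before training, assuming the variants are unintended duplicates. The `ALIASES` mapping and the `normalize_label` helper are hypothetical and are not part of SetFit or of this repository.

```python
# Hypothetical alias table: maps lower-cased variants seen in the label table
# to one canonical name. Which variants belong together is an assumption.
ALIASES = {
    "relegious": "Religious Place",
    "relegious place": "Religious Place",
    "religious": "Religious Place",
    "religious place": "Religious Place",
    "health care": "Healthcare",
}


def normalize_label(raw: str) -> str:
    """Return a canonical label for a raw annotation string."""
    # Trim, collapse inner whitespace, and lower-case before the lookup.
    key = " ".join(raw.split()).lower()
    if key in ALIASES:
        return ALIASES[key]
    # Fall back to title case, e.g. "shop" -> "Shop".
    return key.title()


# The 28 labels exactly as they appear in the table above.
raw_labels = [
    "Relegious", "Food", "Religious PLAce", "Education", "Health Care",
    "Office", "Landmark", "Fuel", "Religious Place", "Transportation",
    "Agricultural", "Residential", "shop", "Bank", "Utility", "Healthcare",
    "Government", "Recreation", "Religious", "Religious Place", "Shop",
    "Commercial", "Industry", "Hotel", "construction", "Construction",
    "Relegious Place", "education",
]

canonical = sorted({normalize_label(label) for label in raw_labels})
print(len(raw_labels), "raw labels ->", len(canonical), "canonical labels")
```

Under these assumptions the 28 raw labels collapse to 19, which is the number of names that appear run together in the single `labels` entry of `config_setfit.json`.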
94
  ## Evaluation
95
 
96
  ### Metrics
97
  | Label | Accuracy |
98
  |:--------|:---------|
99
+ | **all** | 0.33 |
100
 
101
  ## Uses
102
 
 
116
  # Download from the 🤗 Hub
117
  model = SetFitModel.from_pretrained("rafi138/setfit-paraphrase-mpnet-base-v2-type")
118
  # Run inference
119
+ preds = model("Dadon Hotel")
120
  ```
121
 
122
  <!--
 
148
  ### Training Set Metrics
149
  | Training set | Min | Median | Max |
150
  |:-------------|:----|:-------|:----|
151
+ | Word count | 1 | 3.5 | 7 |
152
 
153
  | Label | Training Sample Count |
154
  |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------|
155
  | ShopCommercialGovernmentHealthcareEducationFoodOfficeReligious PlaceBankTransportationConstructionIndustryResidentialLandmarkRecreationFuelHotelUtilityAgricultural | 0 |
156
 
157
  ### Training Hyperparameters
158
+ - batch_size: (32, 32)
159
  - num_epochs: (4, 4)
160
  - max_steps: -1
161
  - sampling_strategy: oversampling
 
174
  ### Training Results
175
  | Epoch | Step | Training Loss | Validation Loss |
176
  |:-------:|:-------:|:-------------:|:---------------:|
177
+ | 0.0006 | 1 | 0.1851 | - |
178
+ | 0.0282 | 50 | 0.1697 | - |
179
+ | 0.0564 | 100 | 0.1876 | - |
180
+ | 0.0032 | 1 | 0.169 | - |
181
+ | 0.1597 | 50 | 0.081 | - |
182
+ | 0.3195 | 100 | 0.0641 | - |
183
+ | 0.4792 | 150 | 0.033 | - |
184
+ | 0.6390 | 200 | 0.0128 | - |
185
+ | 0.7987 | 250 | 0.0089 | - |
186
+ | 0.9585 | 300 | 0.0106 | - |
187
+ | **1.0** | **313** | **-** | **0.3235** |
188
+ | 1.1182 | 350 | 0.0215 | - |
189
+ | 1.2780 | 400 | 0.017 | - |
190
+ | 1.4377 | 450 | 0.0057 | - |
191
+ | 1.5974 | 500 | 0.0047 | - |
192
+ | 1.7572 | 550 | 0.0064 | - |
193
+ | 1.9169 | 600 | 0.003 | - |
194
+ | 2.0 | 626 | - | 0.3481 |
195
+ | 2.0767 | 650 | 0.0043 | - |
196
+ | 2.2364 | 700 | 0.0022 | - |
197
+ | 2.3962 | 750 | 0.0014 | - |
198
+ | 2.5559 | 800 | 0.0028 | - |
199
+ | 2.7157 | 850 | 0.0018 | - |
200
+ | 2.8754 | 900 | 0.002 | - |
201
+ | 3.0 | 939 | - | 0.3393 |
202
+ | 3.0351 | 950 | 0.0294 | - |
203
+ | 3.1949 | 1000 | 0.002 | - |
204
+ | 3.3546 | 1050 | 0.0017 | - |
205
+ | 3.5144 | 1100 | 0.0017 | - |
206
+ | 3.6741 | 1150 | 0.0015 | - |
207
+ | 3.8339 | 1200 | 0.0013 | - |
208
+ | 3.9936 | 1250 | 0.0014 | - |
209
+ | 4.0 | 1252 | - | 0.348 |
210
 
211
  * The bold row denotes the saved checkpoint.
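The results table above is consistent with the checkpoint being chosen by lowest validation loss: the bold row (step 313, validation loss 0.3235) has the smallest of the four per-epoch values, and `config.json` in this commit points at `checkpoints/step_313/`. Here is a minimal sketch of that selection, assuming lowest validation loss is the criterion (the diff itself does not state it). `best_checkpoint` and `epoch_of` are hypothetical helpers, and the epoch column is reproduced as the step divided by 313 steps per epoch.

```python
# (step, validation_loss) pairs copied from the epoch-boundary rows above.
VALIDATION = [(313, 0.3235), (626, 0.3481), (939, 0.3393), (1252, 0.348)]

# Steps per epoch, read off the row where the epoch reaches 1.0.
STEPS_PER_EPOCH = 313


def best_checkpoint(rows):
    """Return the (step, loss) pair with the smallest validation loss."""
    return min(rows, key=lambda row: row[1])


def epoch_of(step, steps_per_epoch=STEPS_PER_EPOCH):
    """Fractional epoch for a given step, rounded as in the table."""
    return round(step / steps_per_epoch, 4)


best_step, best_loss = best_checkpoint(VALIDATION)
print(f"checkpoints/step_{best_step}/ (validation loss {best_loss})")
print(epoch_of(50), epoch_of(300), epoch_of(1250))
```

The validation loss rises after the first epoch while the training loss keeps falling, so under this criterion the later epochs do not replace the step 313 checkpoint.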
212
  ### Framework Versions
213
  - Python: 3.10.12
214
  - SetFit: 1.0.3
215
  - Sentence Transformers: 2.2.2
216
+ - Transformers: 4.35.2
217
+ - PyTorch: 2.1.0+cu121
218
  - Datasets: 2.16.1
219
  - Tokenizers: 0.15.0
220
 
config.json CHANGED
@@ -1,5 +1,5 @@
  {
-   "_name_or_path": "checkpoints/step_815/",
+   "_name_or_path": "checkpoints/step_313/",
    "architectures": [
      "MPNetModel"
    ],
@@ -19,6 +19,6 @@
    "pad_token_id": 1,
    "relative_attention_num_buckets": 32,
    "torch_dtype": "float32",
-   "transformers_version": "4.36.2",
+   "transformers_version": "4.35.2",
    "vocab_size": 30527
  }
config_setfit.json CHANGED
@@ -1,6 +1,6 @@
  {
+   "normalize_embeddings": false,
    "labels": [
      "ShopCommercialGovernmentHealthcareEducationFoodOfficeReligious PlaceBankTransportationConstructionIndustryResidentialLandmarkRecreationFuelHotelUtilityAgricultural"
-   ],
-   "normalize_embeddings": false
+   ]
  }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:bd1773f5fd2f194efddbd2719668e66db3d716fbedb9283623c4413a6f18c3de
+ oid sha256:2cc47bdb0d72f19b10e2dc0acfd0a5e21fbca8b9b14563fbad0c2de0eb755962
  size 437967672
model_head.pkl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ea13d8379d11439103aee052643d1978cc5c31b1ab881ae4e47aac094adb6515
- size 106447
+ oid sha256:fc983efe1cf6f01284d6eac615b9741cafe6513db5328cb5552cdebf45f12535
+ size 174871