lgd committed on
Commit
b41e541
1 Parent(s): ee45df2

Add SetFit model

README.md CHANGED
@@ -13,39 +13,45 @@ tags:
  - text-classification
  - generated_from_setfit_trainer
  widget:
- - text: planning development plan environment natural environment land use development
- plan put forward frame reference aimed better knowing protecting promoting heritage
- data available set mainly come mapping section 21 23 31 urban urban plan montreal
- namely adaptation climate change territory ecological interest territory ecological
- interest green blue frame green blue frame well constraint nuisance urban planning
- development plan agglomeration montreal outline main parameter guide montreal
- agglomeration council decision relating land use planning coming year perspective
- sustainable development document guide decision shape territory order promote
- compact greener neighborhood increase public active transportation support economic
- dynamism agglomeration highlight area interest consult interactive map httpssmvtmapsarcgiscomappswebappviewerindexhtmlidd152aaa85b6f4e9086cecdf10c7456db
- planning development plan visualize thematic datathis third party metadata element
- translated using automated translation tool amazon translate formdescriptors natureandenvironment
- scienceandtechnology wood forest corridor green space falaise pente floodplain
- development planning diagram urbanism heat island government information
- - text: senior survey 2017
- - text: list permit exemption force law responsibility agency following merchant must
- licensed agency operate travel agent debt collector itinerant merchant solicit
- consumer order make sale make sale elsewhere business established ie doortodoor
- kiosk street mall etc retailer additional guarantee relating car motorcycle adapted
- transport public road operator health studio fitness center weight loss center
- example road vehicle dealer road vehicle recyclers retailer enter highcost credit
- contract retailer conclude highcost credit contract debt settlement service merchant
- negotiate consumer creditor receive amount distribute lender silver obligation
- trader must comply allow office ensure compliance legislative provision area activity
- risk considered significant license category linked financial protection mechanism
- consumer mechanism allow consumer compensated certain situation merchant valid
- license received authorization president office carry activity renewed permit
- scheduled date applicable certain category trader obtain exemption submitting
- bond effect exempting legal obligation particular depositing trust account money
- collected good whose delivery scheduled delivered two month purchase governmentandpolitics
- law retailer deliverance exemption permit consumer protection
- - text: ambient groundwater geochemistry data southwestern ontario
- - text: neighbourhood
  inference: false
  model-index:
  - name: SetFit with sentence-transformers/paraphrase-mpnet-base-v2
@@ -59,16 +65,16 @@ model-index:
  split: test
  metrics:
  - type: accuracy
- value: 0.21
  name: Accuracy
  - type: precision
- value: 0.4350282485875706
  name: Precision
  - type: recall
- value: 0.652542372881356
  name: Recall
  - type: f1
- value: 0.5220338983050848
  name: F1
  ---

@@ -104,7 +110,7 @@ The model has been trained using an efficient few-shot learning technique that i
  ### Metrics
  | Label | Accuracy | Precision | Recall | F1 |
  |:--------|:---------|:----------|:-------|:-------|
- | **all** | 0.21 | 0.4350 | 0.6525 | 0.5220 |

  ## Uses

@@ -124,7 +130,7 @@ from setfit import SetFitModel
  # Download from the 🤗 Hub
  model = SetFitModel.from_pretrained("lgd/setfit-multilabel")
  # Run inference
- preds = model("neighbourhood")
  ```

  <!--
@@ -156,11 +162,11 @@ preds = model("neighbourhood")
  ### Training Set Metrics
  | Training set | Min | Median | Max |
  |:-------------|:----|:-------|:----|
- | Word count | 1 | 4.35 | 11 |

  ### Training Hyperparameters
  - batch_size: (16, 16)
- - num_epochs: (3, 3)
  - max_steps: -1
  - sampling_strategy: oversampling
  - num_iterations: 20
@@ -179,52 +185,24 @@ preds = model("neighbourhood")
  ### Training Results
  | Epoch | Step | Training Loss | Validation Loss |
  |:-----:|:----:|:-------------:|:---------------:|
- | 0.004 | 1 | 0.3342 | - |
- | 0.2 | 50 | 0.1221 | - |
- | 0.4 | 100 | 0.0837 | - |
- | 0.6 | 150 | 0.0403 | - |
- | 0.8 | 200 | 0.0798 | - |
- | 1.0 | 250 | 0.0282 | - |
- | 0.004 | 1 | 0.0266 | - |
- | 0.2 | 50 | 0.0102 | - |
- | 0.4 | 100 | 0.0501 | - |
- | 0.6 | 150 | 0.0297 | - |
- | 0.8 | 200 | 0.066 | - |
- | 1.0 | 250 | 0.0302 | - |
- | 0.004 | 1 | 0.0151 | - |
- | 0.2 | 50 | 0.0232 | - |
- | 0.4 | 100 | 0.017 | - |
- | 0.6 | 150 | 0.0133 | - |
- | 0.8 | 200 | 0.0629 | - |
- | 1.0 | 250 | 0.0349 | - |
- | 1.2 | 300 | 0.0585 | - |
- | 1.4 | 350 | 0.0658 | - |
- | 1.6 | 400 | 0.0446 | - |
- | 1.8 | 450 | 0.0073 | - |
- | 2.0 | 500 | 0.0326 | - |
- | 0.004 | 1 | 0.017 | - |
- | 0.2 | 50 | 0.0038 | - |
- | 0.4 | 100 | 0.0095 | - |
- | 0.6 | 150 | 0.0154 | - |
- | 0.8 | 200 | 0.0444 | - |
- | 1.0 | 250 | 0.0221 | - |
- | 1.2 | 300 | 0.0362 | - |
- | 1.4 | 350 | 0.0565 | - |
- | 1.6 | 400 | 0.0338 | - |
- | 1.8 | 450 | 0.0081 | - |
- | 2.0 | 500 | 0.0299 | - |
- | 2.2 | 550 | 0.106 | - |
- | 2.4 | 600 | 0.0191 | - |
- | 2.6 | 650 | 0.0104 | - |
- | 2.8 | 700 | 0.0369 | - |
- | 3.0 | 750 | 0.024 | - |

  ### Framework Versions
  - Python: 3.10.12
  - SetFit: 1.0.3
  - Sentence Transformers: 3.0.1
  - Transformers: 4.39.0
- - PyTorch: 2.3.0+cu121
  - Datasets: 2.20.0
  - Tokenizers: 0.15.2

 
  - text-classification
  - generated_from_setfit_trainer
  widget:
+ - text: weather satellite imagery update every 10 minute cloud top temperature colorized
+ reveal area intensity lower level transparent satellite imagery combine data noaa
+ go east west satellite jma himawari satellite providing full coverage weather
+ event world west coast africa west east coast india tile service update recent
+ image every 10 minute 15 km per pixel resolution infrared ir band detects radiation
+ emitted earth???s surface atmosphere cloud ??·infrared window??? portion spectrum
+ radiation wavelength near 103 micrometer term ??·window??? mean pass atmosphere
+ relatively little absorption gas water vapor useful estimating emitting temperature
+ earth???s surface cloud top major advantage ir band sense energy night imagery
+ available 24 hour day advanced baseline imager abi instrument sample radiance
+ earth sixteen spectral band using several array detector instrument???s focal
+ plane single reflective band abi level 1b radiance product channel 1 6 approximate
+ center wavelength 047 064 0865 1378 161 225 micron respectively digital map outgoing
+ radiance value top atmosphere visible nearinfrared ir band single emissive band
+ abi l1b radiance product channel 7 16 approximate center wavelength 39 6185 695
+ 734 85 961 1035 112 123 133 micron respectively digital map outgoing radiance
+ value top atmosphere ir band detector sample compressed packetized downlinked
+ ground station level 0 data conversion calibrated geolocated pixel level 1b radiance
+ data detector sample decompressed radiometrically corrected navigated resampled
+ onto invariant output grid referred abi fixed grid
+ - text: pipeline operator conducting risk assessment use ecological usa conjunction
+ pipeline information data identify area may suffer longterm permanent environmentalresource
+ damage event hazardous liquid pipeline accident user data encouraged read carefully
+ technical report cited cross reference section understand limitation ecological
+ usa data dataset comprises unusually sensitive area usa data ecological resource
+ state wyoming accordance pipeline safety law 49 usc section 60109 phmsa required
+ identify area unusually sensitive environmental damage event hazardous liquid
+ pipeline accident interaction various regulatory agency pipeline operator private
+ contractor nonprofit conservation organization general public process developed
+ adopted phmsa identify usa ecological resource process consists identifying set
+ candidate ecological resource using approved data source subjecting candidate
+ set filter criterion determine usa identification usa conducted using standardized
+ data processing step automated gi model resultant usa data applicable current
+ future regulatory requirement specified phmsa including limited pipeline integrity
+ management spill response planning additional information concerning ecological
+ usa please refer document listed cross reference section report
+ - text: southern ontario land resource information system solris 20
+ - text: toronto employment survey summary table
+ - text: cordon data directional traffic count
  inference: false
  model-index:
  - name: SetFit with sentence-transformers/paraphrase-mpnet-base-v2
 
  split: test
  metrics:
  - type: accuracy
+ value: 0.295
  name: Accuracy
  - type: precision
+ value: 0.41697416974169743
  name: Precision
  - type: recall
+ value: 0.5044642857142857
  name: Recall
  - type: f1
+ value: 0.45656565656565656
  name: F1
  ---

 
  ### Metrics
  | Label | Accuracy | Precision | Recall | F1 |
  |:--------|:---------|:----------|:-------|:-------|
+ | **all** | 0.295 | 0.4170 | 0.5045 | 0.4566 |

  ## Uses

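Review note: the F1 values on both sides of this change agree with the harmonic mean of the reported precision and recall. The sketch below recomputes them from the full-precision values in the YAML metadata. The diff does not say which averaging mode was used, so the helper (`f1_from`, a name made up for this note) only checks internal consistency, not how the card generator computed the numbers.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    if total == 0:
        return 0.0
    return 2 * precision * recall / total


# Values copied from the old and new model-index metadata in this commit.
old_f1 = f1_from(0.4350282485875706, 0.652542372881356)
new_f1 = f1_from(0.41697416974169743, 0.5044642857142857)

print(f"old F1 = {old_f1:.4f}, new F1 = {new_f1:.4f}")
```

Accuracy rises from 0.21 to 0.295 while F1 falls from 0.5220 to 0.4566, so the two metrics move in opposite directions in this commit.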
 
  # Download from the 🤗 Hub
  model = SetFitModel.from_pretrained("lgd/setfit-multilabel")
  # Run inference
+ preds = model("cordon data directional traffic count")
  ```

  <!--
 
  ### Training Set Metrics
  | Training set | Min | Median | Max |
  |:-------------|:----|:-------|:----|
+ | Word count | 1 | 4.55 | 11 |

  ### Training Hyperparameters
  - batch_size: (16, 16)
+ - num_epochs: (1, 1)
  - max_steps: -1
  - sampling_strategy: oversampling
  - num_iterations: 20
 
  ### Training Results
  | Epoch | Step | Training Loss | Validation Loss |
  |:-----:|:----:|:-------------:|:---------------:|
+ | 0.002 | 1 | 0.3892 | - |
+ | 0.1 | 50 | 0.2344 | - |
+ | 0.2 | 100 | 0.2476 | - |
+ | 0.3 | 150 | 0.0538 | - |
+ | 0.4 | 200 | 0.0805 | - |
+ | 0.5 | 250 | 0.0974 | - |
+ | 0.6 | 300 | 0.0238 | - |
+ | 0.7 | 350 | 0.025 | - |
+ | 0.8 | 400 | 0.0497 | - |
+ | 0.9 | 450 | 0.0227 | - |
+ | 1.0 | 500 | 0.1179 | - |

  ### Framework Versions
  - Python: 3.10.12
  - SetFit: 1.0.3
  - Sentence Transformers: 3.0.1
  - Transformers: 4.39.0
+ - PyTorch: 2.3.1+cu121
  - Datasets: 2.20.0
  - Tokenizers: 0.15.2

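Review note: the training logs imply that an epoch grew from 250 steps (old card) to 500 steps (new card) with `batch_size` and `num_iterations` unchanged. The sketch below derives steps per epoch from any logged row. It then estimates the training-set size, assuming that SetFit builds roughly `2 * num_iterations` contrastive pairs per training sample. That pair-count rule is an assumption of this note and is not stated in the diff; both helper names are hypothetical.

```python
def steps_per_epoch(step: int, epoch: float) -> int:
    """Infer steps in one epoch from a logged (epoch, step) row."""
    return round(step / epoch)


def implied_sample_count(steps: int, batch_size: int, num_iterations: int) -> float:
    """Estimate training samples, ASSUMING 2 * num_iterations pairs per sample."""
    total_pairs = steps * batch_size
    return total_pairs / (2 * num_iterations)


old_steps = steps_per_epoch(step=250, epoch=1.0)  # last first-epoch row, old card
new_steps = steps_per_epoch(step=500, epoch=1.0)  # last row, new card

# batch_size=16 and num_iterations=20 are taken from the hyperparameter list.
old_samples = implied_sample_count(old_steps, batch_size=16, num_iterations=20)
new_samples = implied_sample_count(new_steps, batch_size=16, num_iterations=20)

print(old_steps, new_steps, old_samples, new_samples)
```

If the assumption holds, the doubling of steps per epoch points to a training set about twice as large, which would need to be confirmed against the training script.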
config_sentence_transformers.json CHANGED
@@ -2,7 +2,7 @@
  "__version__": {
  "sentence_transformers": "3.0.1",
  "transformers": "4.39.0",
- "pytorch": "2.3.0+cu121"
+ "pytorch": "2.3.1+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
config_setfit.json CHANGED
@@ -1,4 +1,4 @@
  {
- "normalize_embeddings": false,
- "labels": null
+ "labels": null,
+ "normalize_embeddings": false
  }
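Review note: the `config_setfit.json` change only reorders the two keys. A minimal check, using the two versions shown in the diff, confirms that they parse to equal objects, so a JSON consumer should see no difference.

```python
import json

# The two versions of config_setfit.json from the diff.
old_text = '{"normalize_embeddings": false, "labels": null}'
new_text = '{"labels": null, "normalize_embeddings": false}'

old_config = json.loads(old_text)
new_config = json.loads(new_text)

# Python dicts compare by key/value pairs, not by insertion order.
same_content = old_config == new_config
print(same_content)  # prints True
```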
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b4b2e1409e85db112f45aebf4ee858201e09354898f68fd2b35a468a74d97a61
+ oid sha256:e41900dea02c1c3a7334b67354e5fee53c628344ac880ade9dfb61bad97cea0a
  size 437967672
model_head.pkl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b8dfba481443ce98ee926990619e2c7d0a8d3d380c0ecb23b758a28716945513
+ oid sha256:a7977aed629ae22bcdd0b7c348817a6cf8ba615d4b0464851c05e304b392410d
  size 26916
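Review note: `model.safetensors` and `model_head.pkl` are stored as Git LFS pointers, and in both cases only the `oid` changes while `size` stays the same. The sketch below parses a pointer and checks a downloaded file's bytes against it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of any Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if the bytes have the size and hash the pointer records."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# New pointer for model_head.pkl, copied from the diff.
head_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:a7977aed629ae22bcdd0b7c348817a6cf8ba615d4b0464851c05e304b392410d\n"
    "size 26916\n"
)
print(head_pointer["algo"], head_pointer["size"])
```

To verify a local copy, read the file in binary mode and pass its bytes to `matches_pointer` with the parsed pointer.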