Update README.md
README.md
CHANGED
@@ -6,6 +6,18 @@ tags:
|
|
6 |
model-index:
|
7 |
- name: SECTOR-multilabel-climatebert
|
8 |
results: []
|
9 |
---
|
10 |
|
11 |
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
@@ -13,7 +25,9 @@ should probably proofread and complete it, then remove this comment. -->
|
|
13 |
|
14 |
# SECTOR-multilabel-climatebert
|
15 |
|
16 |
-
This model is a fine-tuned version of [climatebert/distilroberta-base-climate-f](https://huggingface.co/climatebert/distilroberta-base-climate-f) on the
|
17 |
It achieves the following results on the evaluation set:
|
18 |
- Loss: 0.6028
|
19 |
- Precision-micro: 0.6395
|
@@ -28,7 +42,9 @@ It achieves the following results on the evaluation set:
|
|
28 |
|
29 |
## Model description
|
30 |
|
31 |
-
|
32 |
|
33 |
## Intended uses & limitations
|
34 |
|
@@ -36,7 +52,21 @@ More information needed
|
|
36 |
|
37 |
## Training and evaluation data
|
38 |
|
39 |
-
|
40 |
|
41 |
## Training procedure
|
42 |
|
@@ -64,10 +94,28 @@ The following hyperparameters were used during training:
|
|
64 |
| 0.1359 | 6.0 | 3798 | 0.5913 | 0.6349 | 0.7506 | 0.6449 | 0.7844 | 0.8676 | 0.7844 | 0.7018 | 0.7667 | 0.7057 |
|
65 |
| 0.1133 | 7.0 | 4431 | 0.6028 | 0.6395 | 0.7543 | 0.6475 | 0.7762 | 0.8583 | 0.7762 | 0.7012 | 0.7655 | 0.7041 |
|
66 |
|
67 |
|
68 |
### Framework versions
|
69 |
|
70 |
- Transformers 4.38.1
|
71 |
- Pytorch 2.1.0+cu121
|
72 |
- Datasets 2.18.0
|
73 |
-
- Tokenizers 0.15.2
|
|
|
6 |
model-index:
|
7 |
- name: SECTOR-multilabel-climatebert
|
8 |
results: []
|
9 |
+
datasets:
|
10 |
+
- GIZ/policy_classification
|
11 |
+
|
12 |
+
co2_eq_emissions:
|
13 |
+
emissions: 23.3572576873636
|
14 |
+
source: codecarbon
|
15 |
+
training_type: fine-tuning
|
16 |
+
on_cloud: true
|
17 |
+
cpu_model: Intel(R) Xeon(R) CPU @ 2.00GHz
|
18 |
+
ram_total_size: 12.6747894287109
|
19 |
+
hours_used: 0.529
|
20 |
+
hardware_used: 1 x Tesla T4
|
21 |
---
|
22 |
|
23 |
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
|
|
25 |
|
26 |
# SECTOR-multilabel-climatebert
|
27 |
|
28 |
+
This model is a fine-tuned version of [climatebert/distilroberta-base-climate-f](https://huggingface.co/climatebert/distilroberta-base-climate-f) on the [Policy-Classification](https://huggingface.co/datasets/GIZ/policy_classification) dataset.
|
29 |
+
|
30 |
+
*The loss function BCEWithLogitsLoss is modified with pos_weight to focus on recall; therefore, the evaluation metrics rather than the loss are used to assess model performance during training.*
|
31 |
It achieves the following results on the evaluation set:
|
32 |
- Loss: 0.6028
|
33 |
- Precision-micro: 0.6395
|
|
|
42 |
|
43 |
## Model description
|
44 |
|
45 |
+
The purpose of this model is to predict multiple labels simultaneously for a given input text. Specifically, the model predicts Sector labels: Agriculture, Buildings,
|
46 |
+
Coastal Zone, Cross-Cutting Area, Disaster Risk Management (DRM), Economy-wide, Education, Energy, Environment, Health, Industries, LULUCF/Forestry, Social Development, Tourism,
|
47 |
+
Transport, Urban, Waste, Water.
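
As a rough illustration of what "predicting multiple labels simultaneously" means, the sketch below applies an independent sigmoid to each logit and keeps every label above a threshold. The label order, the 0.5 threshold, and the example logits are assumptions made for illustration; the real order comes from the model's `id2label` config and real logits from the model itself.

```python
import math

# Sector labels as listed in the model description. The index order is an
# assumption; the actual order is defined by the model's id2label config.
SECTOR_LABELS = [
    "Agriculture", "Buildings", "Coastal Zone", "Cross-Cutting Area",
    "Disaster Risk Management (DRM)", "Economy-wide", "Education", "Energy",
    "Environment", "Health", "Industries", "LULUCF/Forestry",
    "Social Development", "Tourism", "Transport", "Urban", "Waste", "Water",
]


def sigmoid(x: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, labels=SECTOR_LABELS, threshold=0.5):
    """Return every label whose independent sigmoid probability passes the threshold."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]


# Made-up logits for illustration only (not real model output).
example_logits = [-3.0] * len(SECTOR_LABELS)
example_logits[SECTOR_LABELS.index("Energy")] = 2.1
example_logits[SECTOR_LABELS.index("Transport")] = 0.4

predicted = decode_multilabel(example_logits)
print(predicted)
```

Unlike a softmax classifier, nothing forces exactly one label to win: a passage can receive several sectors or none. Lowering the threshold trades precision for recall.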
|
48 |
|
49 |
## Intended uses & limitations
|
50 |
|
|
|
52 |
|
53 |
## Training and evaluation data
|
54 |
|
55 |
+
- Training Dataset: 10031
|
56 |
+
| Class | Positive Count of Class|
|
57 |
+
|:-------------|:--------|
|
58 |
+
| Action | 5416 |
|
59 |
+
| Plans | 2140 |
|
60 |
+
| Policy | 1396 |
|
61 |
+
| Target | 2911 |
|
62 |
+
|
63 |
+
- Validation Dataset: 932
|
64 |
+
| Class | Positive Count of Class|
|
65 |
+
|:-------------|:--------|
|
66 |
+
| Action | 513 |
|
67 |
+
| Plans | 198 |
|
68 |
+
| Policy | 122 |
|
69 |
+
| Target | 256 |
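
The card states that BCEWithLogitsLoss is used with `pos_weight` but does not say how the weights were chosen. The sketch below assumes the common negatives-over-positives heuristic, applied to the training counts from the table above, and re-implements the weighted loss for a single logit in plain Python to show its effect. Both the heuristic and the helper names are illustrative assumptions, not the author's confirmed procedure.

```python
import math

# Counts copied from the training table above.
TRAIN_SIZE = 10031
POSITIVE_COUNTS = {"Action": 5416, "Plans": 2140, "Policy": 1396, "Target": 2911}


def pos_weights(positive_counts, total):
    """Per-class negatives / positives (an assumed heuristic, not confirmed by the card)."""
    return {name: (total - pos) / pos for name, pos in positive_counts.items()}


def weighted_bce_with_logits(logit, target, pos_weight=1.0):
    """Binary cross-entropy on one logit, with the positive term scaled by pos_weight."""
    log_p = -math.log1p(math.exp(-logit))      # log(sigmoid(logit))
    log_not_p = -math.log1p(math.exp(logit))   # log(1 - sigmoid(logit))
    return -(pos_weight * target * log_p + (1 - target) * log_not_p)


weights = pos_weights(POSITIVE_COUNTS, TRAIN_SIZE)
for name, w in weights.items():
    print(f"{name}: pos_weight = {w:.3f}")

# A missed positive (true label 1, confident negative logit) costs more once weighted.
missed_plain = weighted_bce_with_logits(-2.0, 1.0)
missed_weighted = weighted_bce_with_logits(-2.0, 1.0, weights["Policy"])
print(f"missed positive: {missed_plain:.3f} unweighted vs {missed_weighted:.3f} weighted")
```

In PyTorch the same weights would be passed as a tensor to `torch.nn.BCEWithLogitsLoss(pos_weight=...)`. Because rare classes get weights above 1, false negatives are penalized more heavily, which pushes the model toward recall and also makes the raw loss value harder to compare across runs.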
|
70 |
|
71 |
## Training procedure
|
72 |
|
|
|
94 |
| 0.1359 | 6.0 | 3798 | 0.5913 | 0.6349 | 0.7506 | 0.6449 | 0.7844 | 0.8676 | 0.7844 | 0.7018 | 0.7667 | 0.7057 |
|
95 |
| 0.1133 | 7.0 | 4431 | 0.6028 | 0.6395 | 0.7543 | 0.6475 | 0.7762 | 0.8583 | 0.7762 | 0.7012 | 0.7655 | 0.7041 |
|
96 |
|
97 |
+
|label | precision |recall |f1-score| support|
|
98 |
+
|:-------------:|:---------:|:-----:|:------:|:------:|
|
99 |
+
|Action |0.828 |0.807 |0.817 | 513.0 |
|
100 |
+
|Plans |0.560 |0.707 |0.625 | 198.0 |
|
101 |
+
|Policy |0.727 |0.786 |0.756 | 122.0 |
|
102 |
+
|Target |0.741 |0.886 |0.808 | 256.0 |
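
As a quick sanity check on the per-label table, F1 can be recomputed from the reported precision and recall. This is a minimal sketch; because the inputs are already rounded to three decimals, the recomputed values may differ from the reported ones in the last digit. The support-weighted average is shown only as an illustration of how such an aggregate is formed.

```python
# (precision, recall, f1, support) copied from the per-label table above.
REPORT = {
    "Action": (0.828, 0.807, 0.817, 513),
    "Plans": (0.560, 0.707, 0.625, 198),
    "Policy": (0.727, 0.786, 0.756, 122),
    "Target": (0.741, 0.886, 0.808, 256),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


recomputed = {label: f1_score(p, r) for label, (p, r, _, _) in REPORT.items()}
for label, (_, _, reported_f1, _) in REPORT.items():
    print(f"{label}: reported {reported_f1:.3f}, recomputed {recomputed[label]:.3f}")

# Support-weighted average of the reported per-label F1 scores.
total_support = sum(support for *_, support in REPORT.values())
weighted_f1 = sum(f1 * support for _, _, f1, support in REPORT.values()) / total_support
print(f"total support: {total_support}, support-weighted F1: {weighted_f1:.3f}")
```

The total support exceeds the 932 validation examples because an example can carry more than one label.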
|
103 |
+
|
104 |
+
### Environmental Impact
|
105 |
+
Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
|
106 |
+
- **Carbon Emitted**: 0.02336 kg of CO2
|
107 |
+
- **Hours Used**: 0.529 hours
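
The `emissions` value in the card metadata is expressed in grams of CO2-equivalent, while the bullet above reports kilograms. The short sketch below does the conversion and derives an average emission rate; the variable names are illustrative.

```python
# Values copied from the card metadata (co2_eq_emissions block).
EMISSIONS_G = 23.3572576873636   # grams of CO2-equivalent
HOURS_USED = 0.529               # training time in hours

emissions_kg = EMISSIONS_G / 1000.0          # grams -> kilograms
grams_per_hour = EMISSIONS_G / HOURS_USED    # average emission rate during training

print(f"{emissions_kg:.5f} kg CO2eq total, {grams_per_hour:.1f} g CO2eq per hour")
```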
|
108 |
+
|
109 |
+
### Training Hardware
|
110 |
+
- **On Cloud**: yes
|
111 |
+
- **GPU Model**: 1 x Tesla T4
|
112 |
+
- **CPU Model**: Intel(R) Xeon(R) CPU @ 2.00GHz
|
113 |
+
- **RAM Size**: 12.67 GB
|
114 |
+
|
115 |
|
116 |
### Framework versions
|
117 |
|
118 |
- Transformers 4.38.1
|
119 |
- Pytorch 2.1.0+cu121
|
120 |
- Datasets 2.18.0
|
121 |
+
- Tokenizers 0.15.2
|