upload model performance
README.md
CHANGED
@@ -21,28 +21,28 @@ model-index:
       revision: 172e61bb1dd20e43903f4c51e5cbec61ec9ae6e6
     metrics:
     - type: accuracy
-      value: 0.
+      value: 0.6762295081967213
       name: Accuracy 'Bezeichnung'
     - type: precision
-      value: 0.
+      value: 0.5688091249507292
       name: Precision 'Bezeichnung' (macro)
     - type: recall
-      value: 0.
+      value: 0.5981436148510813
       name: Recall 'Bezeichnung' (macro)
     - type: f1
-      value: 0.
+      value: 0.5693466048057273
       name: F1 'Bezeichnung' (macro)
     - type: accuracy
-      value: 0.
+      value: 0.8934426229508197
       name: Accuracy 'Thema'
     - type: precision
-      value: 0.
+      value: 0.9258716898716898
       name: Precision 'Thema' (macro)
     - type: recall
-      value: 0.
+      value: 0.8669105248121641
       name: Recall 'Thema' (macro)
     - type: f1
-      value: 0.
+      value: 0.8632335412054082
       name: F1 'Thema' (macro)
 ---
 
@@ -61,7 +61,7 @@ This model is based on bert-base-german-cased and fine-tuned on and-effect/mdk_g
 - **License:** [More Information Needed]
 - **Finetuned from model:** bert-base-german-cased. For more information on the model, see [this model card](https://huggingface.co/bert-base-german-cased).
 
-## Model Sources
+## Model Sources
 
 <!-- Provide the basic links for the model. -->
 
@@ -166,8 +166,8 @@ The model is fine-tuned with similar and dissimilar pairs. Similar pairs are bui
 
 | pairs | size |
 |-----|-----|
-| train_similar_pairs |
-| train_unsimilar_pairs |
+| train_similar_pairs | 1964 |
+| train_unsimilar_pairs | 982 |
 | test_similar_pairs | 498 |
 | test_unsimilar_pairs | 249 |
 
@@ -179,13 +179,13 @@ The model was trained with the parameters:
 `torch.utils.data.dataloader.DataLoader`
 
 **Loss**:
-`sentence_transformers.losses.CosineSimilarityLoss.CosineSimilarityLoss`
+`sentence_transformers.losses.CosineSimilarityLoss.CosineSimilarityLoss`
 
 Hyperparameter:
 ```
 {
     "epochs": 3,
-    "
+    "warmup_steps": 100,
 }
 ```
 
@@ -198,7 +198,7 @@ Hyperparameter:
 
 # Evaluation
 
-All metrics express the model's ability to classify dataset titles from GOVDATA into the taxonomy described [here](https://huggingface.co/datasets/and-effect/mdk_gov_data_titles_clf). For more information see [link to MDK project].
+All metrics express the model's ability to classify dataset titles from GOVDATA into the taxonomy described [here](https://huggingface.co/datasets/and-effect/mdk_gov_data_titles_clf). For more information see [link to MDK project].
 
 ## Testing Data, Factors & Metrics
 
@@ -214,12 +214,12 @@ The model performance is tested with four metrics. Accuracy, Precision, Recall
 
 | ***task*** | ***accuracy*** | ***precision (macro)*** | ***recall (macro)*** | ***f1 (macro)*** |
 |-----|-----|-----|-----|-----|
-| Test dataset 'Bezeichnung' I | 0.
-| Test dataset 'Thema' I | 0.
-| Test dataset 'Bezeichnung' II | 0.
+| Test dataset 'Bezeichnung' I | 0.6762295081967213 | 0.5688091249507292 | 0.5981436148510813 | 0.5693466048057273 |
+| Test dataset 'Thema' I | 0.8934426229508197 | 0.9258716898716898 | 0.8669105248121641 | 0.8632335412054082 |
+| Test dataset 'Bezeichnung' II | 0.6762295081967213 | 0.5598761408083442 | 0.7875393612235718 | 0.6306226331603018 |
 | Validation dataset 'Bezeichnung' I | 0.5445544554455446 | 0.41787439613526567 | 0.39929183135704877 | 0.4010173484686228 |
 | Validation dataset 'Thema' I | 0.801980198019802 | 0.6433080808080808 | 0.7039711632453568 | 0.6591710279769981 |
-| Validation dataset 'Bezeichnung' II | 0.5445544554455446 | 0.
+| Validation dataset 'Bezeichnung' II | 0.5445544554455446 | 0.6018518518518517 | 0.6278409090909091 | 0.6066776135741653 |
 
 
 ### Summary
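The diff reports accuracy together with macro-averaged precision, recall and F1. As a rough illustration of what "macro" means here, the sketch below computes all four from two label lists. The function name `macro_scores` and the toy labels are hypothetical and not taken from the repository. It assumes the common convention that each macro score is the unweighted mean of the per-class scores over every label seen in either list; the source does not say which implementation produced the reported values.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Assumption: macro scores are unweighted means of per-class scores,
    and a class with an empty denominator contributes 0.0.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(num, den):
        return num / den if den else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    # Per-class F1 first, then averaged: this is not the harmonic mean
    # of the macro precision and macro recall.
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    n = len(labels)
    accuracy = safe_div(sum(tp.values()), len(y_true))
    return (accuracy,
            safe_div(sum(precisions), n),
            safe_div(sum(recalls), n),
            safe_div(sum(f1s), n))


if __name__ == "__main__":
    # Toy labels, purely illustrative.
    acc, prec, rec, f1 = macro_scores(["a", "a", "b", "c"],
                                      ["a", "b", "b", "c"])
    print(f"accuracy={acc:.4f} precision={prec:.4f} "
          f"recall={rec:.4f} f1={f1:.4f}")
```

Because F1 is averaged per class, a macro F1 can fall below both the macro precision and the macro recall, which is the pattern visible in the 'Thema' test row.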