and-effect/musterdatenkatalog_clf

@@ -21,28 +21,28 @@ model-index:
       revision: 172e61bb1dd20e43903f4c51e5cbec61ec9ae6e6
     metrics:
     - type: accuracy
-      value: 0.7004405286343612
       name: Accuracy 'Bezeichnung'
     - type: precision
-      value: 0.5717666948436179
       name: Precision 'Bezeichnung' (macro)
     - type: recall
-      value: 0.6127063220180629
       name: Recall 'Bezeichnung' (macro)
     - type: f1
-      value: 0.5805958812647776
       name: F1 'Bezeichnung' (macro)
     - type: accuracy
-      value: 0.9162995594713657
       name: Accuracy 'Thema'
     - type: precision
-      value: 0.9318954248366014
       name: Precision 'Thema' (macro)
     - type: recall
-      value: 0.9122380952380952
       name: Recall 'Thema' (macro)
     - type: f1
-      value: 0.8984289453766925
       name: F1 'Thema' (macro)
 ---
@@ -61,7 +61,7 @@ This model is based on bert-base-german-cased and fine-tuned on and-effect/mdk_g
 - **License:** [More Information Needed]
 - **Finetuned from model:** bert-base-german-cased. For more information on the model, see [this model card](https://huggingface.co/bert-base-german-cased)
-## Model Sources
 <!-- Provide the basic links for the model. -->
@@ -166,8 +166,8 @@ The model is fine tuned with similar and dissimilar pairs. Similar pairs are bui
 | pairs | size |
 |-----|-----|
-| train_similar_pairs | 2018 |
-| train_unsimilar_pairs | 1009 |
 | test_similar_pairs | 498 |
 | test_unsimilar_pairs | 249 |
@@ -179,13 +179,13 @@ The model was trained with the parameters:
 `torch.utils.data.dataloader.DataLoader`
 **Loss**:
-`sentence_transformers.losses.CosineSimilarityLoss.CosineSimilarityLoss`
 Hyperparameter:
 ```
 {
     "epochs": 3,
-    "warumup_steps": [More Information Needed],
 }
 ```
@@ -198,7 +198,7 @@ Hyperparameter:
 # Evaluation
-All metrices express the models ability to classify dataset titles from GOVDATA into the taxonomy described [here](https://huggingface.co/datasets/and-effect/mdk_gov_data_titles_clf). For more information see VERLINKUNG MDK Projekt.
 ## Testing Data, Factors & Metrics
@@ -214,12 +214,12 @@ The model performance is tested with four metrics: Accuracy, Precision, Recall
 | ***task*** | ***accuracy*** | ***precision (macro)*** | ***recall (macro)*** | ***f1 (macro)*** |
 |-----|-----|-----|-----|-----|
-| Test dataset 'Bezeichnung' I | 0.7004405286343612 | 0.5717666948436179 | 0.6127063220180629 | 0.5805958812647776 |
-| Test dataset 'Thema' I | 0.9162995594713657 | 0.9318954248366014 | 0.9122380952380952 | 0.8984289453766925 |
-| Test dataset 'Bezeichnung' II | 0.7004405286343612 | 0.573015873015873 | 0.8207602339181287 | 0.6515010351966875 |
 | Validation dataset 'Bezeichnung' I | 0.5445544554455446 | 0.41787439613526567 | 0.39929183135704877 | 0.4010173484686228 |
 | Validation dataset 'Thema' I | 0.801980198019802 | 0.6433080808080808 | 0.7039711632453568 | 0.6591710279769981 |
-| Validation dataset 'Bezeichnung' II | 0.5445544554455446 | 0.6018518518518519 | 0.6278409090909091 | 0.6066776135741653 |
 ### Summary

       revision: 172e61bb1dd20e43903f4c51e5cbec61ec9ae6e6
     metrics:
     - type: accuracy
+      value: 0.6762295081967213
       name: Accuracy 'Bezeichnung'
     - type: precision
+      value: 0.5688091249507292
       name: Precision 'Bezeichnung' (macro)
     - type: recall
+      value: 0.5981436148510813
       name: Recall 'Bezeichnung' (macro)
     - type: f1
+      value: 0.5693466048057273
       name: F1 'Bezeichnung' (macro)
     - type: accuracy
+      value: 0.8934426229508197
       name: Accuracy 'Thema'
     - type: precision
+      value: 0.9258716898716898
       name: Precision 'Thema' (macro)
     - type: recall
+      value: 0.8669105248121641
       name: Recall 'Thema' (macro)
     - type: f1
+      value: 0.8632335412054082
       name: F1 'Thema' (macro)
 ---
 - **License:** [More Information Needed]
 - **Finetuned from model:** bert-base-german-cased. For more information on the model, see [this model card](https://huggingface.co/bert-base-german-cased)
+## Model Sources
 <!-- Provide the basic links for the model. -->
 | pairs | size |
 |-----|-----|
+| train_similar_pairs | 1964 |
+| train_unsimilar_pairs | 982 |
 | test_similar_pairs | 498 |
 | test_unsimilar_pairs | 249 |
 `torch.utils.data.dataloader.DataLoader`
 **Loss**:
+`sentence_transformers.losses.CosineSimilarityLoss.CosineSimilarityLoss`
 Hyperparameter:
 ```
 {
     "epochs": 3,
+    "warmup_steps": 100,
 }
 ```
 # Evaluation
+All metrics express the model's ability to classify dataset titles from GOVDATA into the taxonomy described [here](https://huggingface.co/datasets/and-effect/mdk_gov_data_titles_clf). For more information, see the MDK project (link to be added).
 ## Testing Data, Factors & Metrics
 | ***task*** | ***accuracy*** | ***precision (macro)*** | ***recall (macro)*** | ***f1 (macro)*** |
 |-----|-----|-----|-----|-----|
+| Test dataset 'Bezeichnung' I | 0.6762295081967213 | 0.5688091249507292 | 0.5981436148510813 | 0.5693466048057273 |
+| Test dataset 'Thema' I | 0.8934426229508197 | 0.9258716898716898 | 0.8669105248121641 | 0.8632335412054082 |
+| Test dataset 'Bezeichnung' II | 0.6762295081967213 | 0.5598761408083442 | 0.7875393612235718 | 0.6306226331603018 |
 | Validation dataset 'Bezeichnung' I | 0.5445544554455446 | 0.41787439613526567 | 0.39929183135704877 | 0.4010173484686228 |
 | Validation dataset 'Thema' I | 0.801980198019802 | 0.6433080808080808 | 0.7039711632453568 | 0.6591710279769981 |
+| Validation dataset 'Bezeichnung' II | 0.5445544554455446 | 0.6018518518518517 | 0.6278409090909091 | 0.6066776135741653 |
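
As a rough illustration of how accuracy and the macro-averaged scores in the table above can be computed, here is a minimal pure-Python sketch. It is not the evaluation code behind the reported numbers. The function name `macro_scores`, the label strings, and the conventions used are assumptions: every class counts equally in the macro average, the class set is the union of true and predicted labels, a class with no predictions or no true samples scores 0, and macro F1 is the mean of the per-class F1 values.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Hypothetical helper for illustration only; not taken from the model card.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Assumption: the class set is the union of true and predicted labels.
    labels = sorted(set(y_true) | set(y_pred))
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    true_count = Counter(y_true)   # samples per true class
    pred_count = Counter(y_pred)   # samples per predicted class

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = true_pos[label]
        # Assumption: undefined ratios (zero denominator) are scored as 0.
        precision = tp / pred_count[label] if pred_count[label] else 0.0
        recall = tp / true_count[label] if true_count[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n_classes = len(labels)
    return {
        "accuracy": sum(true_pos.values()) / len(y_true),
        "precision_macro": sum(precisions) / n_classes,
        "recall_macro": sum(recalls) / n_classes,
        "f1_macro": sum(f1s) / n_classes,
    }


# Made-up labels, purely for demonstration.
example = macro_scores(["a", "a", "b", "c"], ["a", "b", "b", "c"])
```

Because macro averaging weights rare and frequent classes equally, a few poorly predicted small classes can pull the macro scores well below accuracy, which may be one reason the 'Bezeichnung' rows show a larger gap than the 'Thema' rows.
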
 ### Summary