add results
README.md
@@ -211,25 +211,24 @@ Hyperparameter:
 
 <!-- This should link to a Data Card if possible. -->
 
-[More Information Needed]
-
-### Factors
+The evaluation data can be found [here](https://huggingface.co/datasets/and-effect/mdk_gov_data_titles_clf). The model was trained on revision 172e61bb1dd20e43903f4c51e5cbec61ec9ae6e6 of this dataset, so the evaluation metrics rely on the same revision.
 
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
-[More Information Needed]
 
 ### Metrics
 
 <!-- These are the evaluation metrics being used, ideally with a description of why. -->
 
-
+Model performance is measured with four metrics: accuracy, precision, recall, and F1 score. Many classes were never predicted and therefore contribute a score of zero to the macro-averaged precision, recall, and F1. For these metrics, additional calculations were performed that exclude classes with fewer than two predictions on the level 'Bezeichnung' (see 'Bezeichnung II' in the results table). These results should still be interpreted with caution, because they do not represent all classes.
 
 ## Results
 
-| accuracy | precision_macro | recall_macro | f1_macro |
-
-| 0.7004405286343612 | 0.5717666948436179 | 0.6127063220180629 | 0.5805958812647776 |
+| accuracy | precision_macro | recall_macro | f1_macro | Task |
+|-----|-----|-----|-----|-----|
+| 0.7004405286343612 | 0.5717666948436179 | 0.6127063220180629 | 0.5805958812647776 | Test dataset Bezeichnung I |
+| 0.9162995594713657 | 0.9318954248366014 | 0.9122380952380952 | 0.8984289453766925 | Test dataset Thema I |
+| 0.7004405286343612 | 0.5730158730158731 | 0.8207602339181287 | 0.6515010351966873 | Test dataset Bezeichnung II |
+| 0.5445544554455446 | 0.41787439613526567 | 0.39929183135704877 | 0.4010173484686228 | Validation dataset Bezeichnung I |
+| 0.5445544554455446 | 0.6018518518518517 | 0.6278409090909091 | 0.6066776135741653 | Validation dataset Thema I |
 
 
 ### Summary
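The zero-score handling described in the added Metrics paragraph can be sketched as follows. This is an illustrative example with made-up toy labels, not the model card's actual evaluation code: a class that is never predicted has undefined precision, which is counted as 0 and pulls the macro average down.

```python
def macro_scores(y_true, y_pred, labels):
    """Macro-averaged precision, recall, and F1.

    Classes that are never predicted get a precision (and hence F1)
    of 0.0, mirroring the zero-setting described above.
    """
    per_class = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0  # never predicted -> 0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        per_class.append((prec, rec, f1))
    n = len(labels)
    return tuple(sum(m[i] for m in per_class) / n for i in range(3))

# Toy data: class "c" occurs in the gold labels but is never predicted.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "b", "b", "a", "b"]

prec, rec, f1 = macro_scores(y_true, y_pred, labels=["a", "b", "c"])
```

Restricting `labels` to classes with at least two predictions reproduces the idea behind the 'Bezeichnung II' rows, which is why those macro scores can exceed the 'Bezeichnung I' ones at identical accuracy.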