Model description
This model is a fine-tuned version of coastalcph/danish-legal-longformer-base, trained on the Danish part of the MultiEURLEX dataset.
Training and evaluation data
The model was trained and evaluated on the Danish part of the MultiEURLEX dataset.
Use of Model
As a text classifier:
from transformers import pipeline
import numpy as np
# Init text classification pipeline
text_cls_pipe = pipeline(task="text-classification",
model="coastalcph/danish-legal-longformer-eurlex",
use_auth_token='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')
# Encode and Classify document
predictions = text_cls_pipe("KOMMISSIONENS BESLUTNING\naf 6. marts 2006\nom klassificering af visse byggevarers "
"ydeevne med hensyn til reaktion ved brand for så vidt angår trægulve samt vægpaneler "
"og vægbeklædning i massivt træ\n(meddelt under nummer K(2006) 655")
# Print prediction
print(predictions)
# [{'label': 'building and public works', 'score': 0.9626012444496155}]
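The pipeline returns a list of dictionaries with `label` and `score` keys, as in the output shown above. A minimal sketch of post-processing such a list follows. The helper name `labels_above_threshold` and the 0.5 cut-off are assumptions for illustration and are not part of the model or the transformers API. The second example entry is made up.

```python
def labels_above_threshold(predictions, threshold=0.5):
    """Return labels whose score is at least `threshold`, highest score first."""
    kept = [p for p in predictions if p["score"] >= threshold]
    kept.sort(key=lambda p: p["score"], reverse=True)
    return [p["label"] for p in kept]


# The first entry is copied from the example output above.
# The second entry is hypothetical and only shows the filtering.
example_predictions = [
    {"label": "building and public works", "score": 0.9626012444496155},
    {"label": "hypothetical low-confidence label", "score": 0.12},
]

print(labels_above_threshold(example_predictions))
# ['building and public works']
```

A lower threshold keeps more labels at the cost of precision; a suitable value would need to be chosen on held-out data.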
As a feature extractor (document embedder):
from transformers import pipeline
import numpy as np
# Init feature extraction pipeline
feature_extraction_pipe = pipeline(task="feature-extraction",
model="coastalcph/danish-legal-longformer-eurlex",
use_auth_token='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')
# Encode document
predictions = feature_extraction_pipe("KOMMISSIONENS BESLUTNING\naf 6. marts 2006\nom klassificering af visse byggevarers "
"ydeevne med hensyn til reaktion ved brand for så vidt angår trægulve samt vægpaneler "
"og vægbeklædning i massivt træ\n(meddelt under nummer K(2006) 655")
# Use CLS token representation as document embedding
# (the pipeline returns nested lists, so convert to a NumPy array first)
document_features = np.array(predictions)[0][0]
print(document_features.shape)
# (768,)
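One common use of such document embeddings is comparing two documents by cosine similarity. The sketch below assumes each embedding is a flat sequence of floats, such as the 768-dimensional vector above. The helper `cosine_similarity` and the short toy vectors are illustrative and do not come from the model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional stand-ins for two document embeddings (hypothetical values).
doc_a = [0.2, -0.1, 0.4]
doc_b = [0.1, -0.2, 0.5]

print(round(cosine_similarity(doc_a, doc_b), 3))
```

With real embeddings, `document_features.tolist()` or the array itself can be passed in place of the toy vectors.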
Framework versions
- Transformers 4.18.0
- PyTorch 1.12.0+cu113
- Datasets 2.0.0
- Tokenizers 0.12.1
Evaluation results
- Micro-F1 on the multi_eurlex validation set (self-reported): 0.757
- Macro-F1 on the multi_eurlex validation set (self-reported): 0.529
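Micro-F1 pools true positives, false positives and false negatives over all labels, while macro-F1 averages the per-label F1 scores, so rare labels weigh more heavily in the macro score. The sketch below uses the standard definitions on made-up counts. It is not the evaluation script behind the figures above, and the helper names are hypothetical.

```python
def f1(tp, fp, fn):
    """F1 from true positive, false positive and false negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(counts):
    """`counts` maps each label to a (tp, fp, fn) tuple."""
    total_tp = sum(c[0] for c in counts.values())
    total_fp = sum(c[1] for c in counts.values())
    total_fn = sum(c[2] for c in counts.values())
    micro = f1(total_tp, total_fp, total_fn)
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    return micro, macro


# Made-up counts: one frequent, well-predicted label and one rare, poorly predicted label.
toy_counts = {
    "frequent label": (90, 10, 10),
    "rare label": (1, 1, 3),
}

micro, macro = micro_macro_f1(toy_counts)
print(f"micro-F1={micro:.3f} macro-F1={macro:.3f}")
```

In this toy case the micro score stays high because the frequent label dominates the pooled counts, while the macro score drops because the rare label counts equally.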