DebertaV3ForAIS

Model Description

The model is based on the DeBERTa-v3 architecture, a transformer encoder, with a token classification head on top. It has been fine-tuned on a specific dataset for token classification; the scores it reaches are listed under Evaluation Results.
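
To make the task concrete: a token classification head emits one score vector per input token, and the predicted label is the highest-scoring class for that token. The sketch below shows that decoding step in plain Python. The logits and the `id2label` mapping are made-up placeholders, since this card does not list the model's actual label set.

```python
def decode_token_labels(logits, id2label):
    """Return the highest-scoring label for each token's score vector."""
    labels = []
    for scores in logits:
        # Index of the largest score in this token's vector.
        best_id = max(range(len(scores)), key=scores.__getitem__)
        labels.append(id2label[best_id])
    return labels


# Hypothetical logits for a 3-token input and a hypothetical 3-class label set.
example_logits = [
    [2.1, 0.3, -1.0],
    [0.2, 1.7, 0.4],
    [-0.5, 0.1, 3.2],
]
example_id2label = {0: "LABEL_0", 1: "LABEL_1", 2: "LABEL_2"}

predicted = decode_token_labels(example_logits, example_id2label)
print(predicted)
```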

Model Configuration

Model Name: AlGe AIS
Model Type: DeBERTa-v3
Transformers Version: 4.21.3

Model Parameters

Hidden Size: 1024
Intermediate Size: 4096
Number of Hidden Layers: 24
Number of Attention Heads: 16
Attention Dropout Probability: 0.1
Hidden Dropout Probability: 0.1
Hidden Activation Function: GELU
Pooler Hidden Size: 1024
Pooler Dropout Probability: 0
Layer Normalization Epsilon: 1e-07
Position Biased Input: False
Maximum Position Embeddings: 512
Maximum Relative Positions: -1
Position Attention Types: p2c, c2p
Relative Attention: True
Share Attention Key: True
Normalization of Relative Embeddings: Layer Normalization
Vocabulary Size: 128100
Padding Token ID: 0
Type Vocabulary Size: 0
Torch Data Type: float32
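
As a rough illustration, the parameters above can be written as a `config.json`-style dictionary. The key names are assumed to follow the `DebertaV2Config` fields in Hugging Face transformers (DeBERTa-v3 checkpoints use the `deberta-v2` model type there); the values are copied from the list. The two helper functions are hypothetical and only derive quantities implied by the listed numbers, assuming the embedding size equals the hidden size.

```python
import json

# Key names assumed to match DebertaV2Config; values taken from the list above.
config = {
    "model_type": "deberta-v2",
    "hidden_size": 1024,
    "intermediate_size": 4096,
    "num_hidden_layers": 24,
    "num_attention_heads": 16,
    "attention_probs_dropout_prob": 0.1,
    "hidden_dropout_prob": 0.1,
    "hidden_act": "gelu",
    "pooler_hidden_size": 1024,
    "pooler_dropout": 0,
    "layer_norm_eps": 1e-07,
    "position_biased_input": False,
    "max_position_embeddings": 512,
    "max_relative_positions": -1,
    "pos_att_type": ["p2c", "c2p"],
    "relative_attention": True,
    "share_att_key": True,
    "norm_rel_ebd": "layer_norm",
    "vocab_size": 128100,
    "pad_token_id": 0,
    "type_vocab_size": 0,
    "torch_dtype": "float32",
    "transformers_version": "4.21.3",
}


def attention_head_size(cfg):
    """Per-head dimension: the hidden size split evenly across heads."""
    hidden, heads = cfg["hidden_size"], cfg["num_attention_heads"]
    if hidden % heads != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")
    return hidden // heads


def word_embedding_parameters(cfg):
    """Size of the word embedding matrix (vocabulary x hidden size)."""
    return cfg["vocab_size"] * cfg["hidden_size"]


head_size = attention_head_size(config)
embedding_params = word_embedding_parameters(config)
print(json.dumps(config, indent=2))
print("per-head dimension:", head_size)
print("word embedding parameters:", embedding_params)
```

With 16 heads over a hidden size of 1024, each attention head works on 64 dimensions, and the word embedding matrix alone accounts for roughly 131 million parameters.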

Training Details

The model was trained on a specific dataset with the following settings:

Sequence Length: 512
Label: True
Extended: True
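
The card does not describe the preprocessing pipeline, so the following is only a sketch of what a fixed sequence length of 512 typically implies for token classification: longer inputs are cut, shorter ones are padded with the padding token ID (0, per the parameters above), and padded positions receive a label that the loss ignores. The function name, the example token IDs, and the use of -100 (PyTorch's default `ignore_index` for cross-entropy) are assumptions. A real pipeline would also keep special tokens in place when truncating.

```python
IGNORE_INDEX = -100  # PyTorch's default ignore_index for cross-entropy loss


def pad_or_truncate(input_ids, labels, max_len=512, pad_id=0):
    """Bring one example to exactly max_len tokens.

    Returns (input_ids, labels, attention_mask), each of length max_len.
    """
    if len(input_ids) != len(labels):
        raise ValueError("input_ids and labels must have the same length")
    ids = list(input_ids[:max_len])
    labs = list(labels[:max_len])
    n_pad = max_len - len(ids)
    attention_mask = [1] * len(ids) + [0] * n_pad
    return ids + [pad_id] * n_pad, labs + [IGNORE_INDEX] * n_pad, attention_mask


# Arbitrary token IDs and labels, for illustration only.
short_ids, short_labels, short_mask = pad_or_truncate([1, 5023, 887, 2], [0, 1, 1, 0])
long_ids, long_labels, long_mask = pad_or_truncate(list(range(1, 601)), [0] * 600)

print(len(short_ids), sum(short_mask))
print(len(long_ids), sum(long_mask))
```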

Evaluation Results

Metric	Score
Accuracy	0.7558
F1 Micro	0.5719
F1 Macro	0.5201
F1 Weighted	0.5717
Cohen's Kappa	0.6852
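
For readers unfamiliar with these metrics, the sketch below computes accuracy, macro and weighted F1, and Cohen's kappa from scratch on a tiny made-up label sequence. It illustrates the standard definitions only. How the scores in the table were actually computed (for example, which labels were included or whether padding was masked) is not stated in this card, and the integer labels here are placeholders.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of positions where prediction and reference agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true, y_pred):
    """F1 per class, computed as 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement corrected for the agreement expected by chance."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n**2
    return (observed - expected) / (1 - expected)


# Made-up reference and predicted labels for six tokens.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
print("weighted F1:", weighted_f1(y_true, y_pred))
print("Cohen's kappa:", cohen_kappa(y_true, y_pred))
```

Macro F1 treats every class equally, weighted F1 scales each class by how often it occurs in the reference labels, and Cohen's kappa discounts the agreement that would occur by chance given the two label distributions.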

Acknowledgments

This model was pretrained by the authors of DeBERTa-v3 and adapted for token classification tasks. We thank the authors for their contributions to the field of NLP and the Hugging Face team for providing the base DeBERTa-v3 model.

Disclaimer

The model card provides information about the specific configuration and training of the model. However, please note that the performance of the model may vary depending on the specific use case and input data. It is advisable to evaluate the model's performance in your specific context before deploying it in production.
