completed model card
README.md

**Model Type:** Transformer-based Language Model
**Base Model:** `distilbert-base-multilingual-cased`
**Fine-tuning Framework:** LoRA (Low-Rank Adaptation of Large Language Models)
**Trained By:** ABODO Brice Donald
**License:** Apache 2.0

This model is a fine-tuned version of [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased) on a Russian-language news sentiment dataset.

## Model description

This model is a fine-tuned version of `distilbert-base-multilingual-cased` for text classification. It was adapted using LoRA (Low-Rank Adaptation), which fine-tunes a small set of low-rank update matrices instead of all of the base model's weights, so training needs far fewer trainable parameters and less computational resources.
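
As a rough illustration of the adaptation described above, the sketch below attaches LoRA adapters to the base model with the `peft` library. The rank, alpha, dropout, and target modules shown are illustrative assumptions; the card does not list the values used for this checkpoint.

```python
from transformers import DistilBertForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

# Base model with a 3-class classification head (negative / neutral / positive)
base_model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-multilingual-cased", num_labels=3
)

# Illustrative LoRA settings; not the values used to train this checkpoint
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_lin", "v_lin"],  # DistilBERT attention projections
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapters and the classifier head are trainable
```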

## Intended uses & limitations

The model was trained and evaluated on a Russian-language news dataset consisting of news texts labeled as positive, negative, or neutral. The dataset is divided into training and test sets for evaluation.

### Intended Use

This model is intended for text classification, in particular three-class sentiment analysis of news text (negative, neutral, positive). It can be fine-tuned further for other classification tasks by using an appropriate dataset and adjusting the number of labels, as sketched below.
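
As a hypothetical sketch of that further fine-tuning, the snippet below swaps in a different label count and trains fresh LoRA adapters with `Trainer`. The label count, toy data, and training arguments are placeholders, not values taken from this model's training run.

```python
from datasets import Dataset
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (DistilBertForSequenceClassification, DistilBertTokenizerFast,
                          Trainer, TrainingArguments)

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-multilingual-cased")

# New task with a different (placeholder) number of classes
base = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-multilingual-cased", num_labels=4
)
model = get_peft_model(
    base,
    LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16, target_modules=["q_lin", "v_lin"]),
)

# Toy dataset; replace with your own texts and integer labels
data = Dataset.from_dict({"text": ["example one", "example two"], "label": [0, 1]})
data = data.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length", max_length=512),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-distilbert-new-task",
                           num_train_epochs=3, per_device_train_batch_size=8),
    train_dataset=data,
)
trainer.train()
```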

### Limitations and Risks

- **Bias:** The model may inherit biases present in the training data.
- **Generalization:** Performance may vary on datasets whose distribution differs from the training data.
- **Resource Usage:** Although LoRA makes fine-tuning more parameter-efficient, training and inference still require significant computational resources.

## Training and evaluation data

The model was evaluated using the following metrics:

- **Accuracy:** Measures the fraction of correct predictions.
- **F1 Score:** Harmonic mean of precision and recall.
- **Precision:** Proportion of positive identifications that are actually correct.
- **Recall:** Proportion of actual positives that are correctly identified.
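
These metrics are typically produced by a `compute_metrics` callback passed to `transformers.Trainer`. The sketch below is an assumed implementation using scikit-learn with weighted averaging; the card does not state the exact averaging scheme used.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Assumed Trainer metrics callback; the averaging choice is illustrative."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```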

## Training procedure

### Preprocessing

- Tokenization: The text data was tokenized using the `DistilBertTokenizer` with a maximum length of 512 tokens.
- Padding and Truncation: Applied to ensure uniform input size.
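
The preprocessing described above corresponds roughly to the following tokenizer call; the sample texts are placeholders.

```python
from transformers import DistilBertTokenizer

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-multilingual-cased")

# Pad or truncate every example to exactly 512 tokens so batches have a uniform shape
encoded = tokenizer(
    ["Пример новостного текста.", "Another news snippet."],  # placeholder texts
    padding="max_length",
    truncation=True,
    max_length=512,
    return_tensors="pt",
)
print(encoded["input_ids"].shape)  # torch.Size([2, 512])
```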

### Training hyperparameters

The following hyperparameters were used during training:

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--:|:---------:|:------:|
| 0.3641 | 6.0 | 546 | 0.2385 | 0.9491 | 0.9491 | 0.9495 | 0.9491 |
| 0.3641 | 7.0 | 637 | 0.2560 | 0.9464 | 0.9464 | 0.9465 | 0.9464 |

## How to Use

```python
import torch
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
from peft import PeftConfig, PeftModel

# Load the tokenizer and the LoRA adapter configuration
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
model_id = 'pyteach237/multilabel_lora_distilbert_runews_classifier_tuned'
config = PeftConfig.from_pretrained(model_id)

# Load the base model and attach the LoRA adapter weights
model = DistilBertForSequenceClassification.from_pretrained(
    config.base_model_name_or_path,
    num_labels=3
)
model = PeftModel.from_pretrained(model, model_id, config=config)
model.eval()

text = "Your text here :)"

# Tokenize the input
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding='max_length', max_length=512)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=-1)

# Map the predicted class index to a sentiment label
labels = ['negative', 'neutral', 'positive']
predicted_label = labels[predictions.item()]
print(f'Predicted label: {predicted_label}')
```
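
If class probabilities are needed rather than only the arg-max label, the logits from the snippet above can be passed through a softmax; this fragment continues that snippet.

```python
import torch.nn.functional as F

# Continues the snippet above: turn logits into per-class probabilities
probs = F.softmax(outputs.logits, dim=-1).squeeze(0)
for label, prob in zip(labels, probs.tolist()):
    print(f"{label}: {prob:.3f}")
```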

## Acknowledgements

This model card was inspired by the standard Hugging Face model card template. Special thanks to the contributors of the Hugging Face `transformers` library and the `peft` LoRA implementation.

## Contact Information

For further information, please contact Brice Donald at b.donald.riced@protonmail.com.

### Framework versions