---
license: apache-2.0
library_name: peft
tags:
- generated_from_trainer
base_model: distilbert-base-multilingual-cased
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: multilabel_lora_distilbert_runews_classifier_tuned
  results: []
datasets:
- pyteach237/news_classify
language:
- ru
- fr
- en
---

# Model Card: DistilBERT with LoRA for Text Classification

## Model Details

- **Model Name:** DistilBERT with LoRA for Text Classification
- **Model Type:** Transformer-based Language Model
- **Base Model:** `distilbert-base-multilingual-cased`
- **Fine-tuning Framework:** LoRA (Low-Rank Adaptation of Large Language Models)
- **Trained By:** ABODO Brice Donald
- **License:** Apache 2.0

This model is a fine-tuned version of [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased) on the `pyteach237/news_classify` dataset listed in the card metadata.
It achieves the following results on the evaluation set:

- Loss: 1.0019
- Accuracy: 0.8276
- F1: 0.8284
- Precision: 0.8317
- Recall: 0.8276

## Model description

This model adapts `distilbert-base-multilingual-cased` to text classification using LoRA (Low-Rank Adaptation). LoRA trains a small set of low-rank adapter weights while the base model weights stay frozen, which reduces the number of trainable parameters and the compute needed for fine-tuning.

## Intended uses & limitations

The model was trained and evaluated on a Russian-language news dataset whose texts are labeled positive, negative, or neutral. The dataset is split into training and test sets for evaluation.

### Intended Use

This model is intended for text classification, particularly three-class sentiment analysis of news text, where each input receives exactly one of the labels positive, negative, or neutral. It can be fine-tuned further for other classification tasks by using an appropriate dataset and changing the number of labels.

### Limitations and Risks

- **Bias:** The model may inherit biases present in the training data.
- **Generalization:** Performance may vary on data whose distribution differs from the training data.
- **Resource Usage:** Although DistilBERT is more efficient than larger models, fine-tuning and inference still require non-trivial computational resources.

## Training and evaluation data

The model was evaluated using the following metrics:

- **Accuracy:** Fraction of predictions that are correct.
- **F1 Score:** Harmonic mean of precision and recall.
- **Precision:** Proportion of predictions for a class that are actually correct.
- **Recall:** Proportion of actual members of a class that are correctly identified.

## Training procedure

### Preprocessing

- Tokenization: The text data was tokenized using the `DistilBertTokenizer` with a maximum length of 512 tokens.
- Padding and Truncation: Applied to ensure uniform input size.

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0009143508688456378
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 7

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
| No log        | 1.0   | 91   | 0.5987          | 0.7634   | 0.7621 | 0.7648    | 0.7634 |
| No log        | 2.0   | 182  | 0.3768          | 0.8693   | 0.8698 | 0.8767    | 0.8693 |
| No log        | 3.0   | 273  | 0.2620          | 0.9065   | 0.9063 | 0.9093    | 0.9065 |
| No log        | 4.0   | 364  | 0.2427          | 0.9202   | 0.9203 | 0.9220    | 0.9202 |
| No log        | 5.0   | 455  | 0.2244          | 0.9367   | 0.9369 | 0.9387    | 0.9367 |
| 0.3641        | 6.0   | 546  | 0.2385          | 0.9491   | 0.9491 | 0.9495    | 0.9491 |
| 0.3641        | 7.0   | 637  | 0.2560          | 0.9464   | 0.9464 | 0.9465    | 0.9464 |

## How to Use

```python
import torch
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
from peft import PeftConfig, PeftModel

# Load the base model's tokenizer and the adapter configuration
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-multilingual-cased')
model_id = 'pyteach237/multilabel_lora_distilbert_runews_classifier_tuned'
config = PeftConfig.from_pretrained(model_id)

# Load the base model with a 3-class head, then attach the LoRA adapter
model = DistilBertForSequenceClassification.from_pretrained(
    config.base_model_name_or_path,
    num_labels=3
)
model = PeftModel.from_pretrained(model, model_id, config=config)
model.eval()  # disable dropout for inference

text = "Your text here :)"

# Tokenize input
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding='max_length', max_length=512)

# Make predictions without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=-1)

# Convert predictions to labels
labels = ['negative', 'neutral', 'positive']
predicted_label = labels[predictions.item()]
print(f'Predicted label: {predicted_label}')
```

## Acknowledgements

This model card template was inspired by the Hugging Face model cards. Special thanks to the contributors of the Hugging Face `transformers` library and the LoRA adaptation framework.

## Contact Information

For further information, please contact Brice Donald at b.donald.riced@protonmail.com.

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1
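
## Appendix: metric computation sketch

The metrics listed under "Training and evaluation data" can be illustrated with a small, dependency-free sketch. The card does not state how per-class precision, recall, and F1 were averaged. In the training results table the recall column equals the accuracy column in every row, which is what support-weighted averaging produces, so the sketch below assumes weighted averaging. The function name `weighted_metrics` and the toy labels are hypothetical and are not taken from the training code.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Assumption: per-class scores are averaged with weights proportional
    to the number of true examples of each class (the support).
    """
    total = len(y_true)
    support = Counter(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))

    accuracy = sum(t == p for t, p in pairs) / total

    precision = recall = f1 = 0.0
    for label in labels:
        true_pos = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for p in y_pred)
        actual = support[label]

        p_c = true_pos / predicted if predicted else 0.0
        r_c = true_pos / actual if actual else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0

        weight = actual / total
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data, for illustration only
y_true = ["negative", "neutral", "positive", "positive", "neutral", "negative"]
y_pred = ["negative", "positive", "positive", "positive", "neutral", "neutral"]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

With weighted averaging, each class's recall is multiplied by its share of the examples, so the weighted sum reduces to the number of correct predictions divided by the total, which is the accuracy. Precision and F1 do not simplify this way, which matches the small differences between those columns and the accuracy column in the table.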