---
license: apache-2.0
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
base_model: mistralai/Mistral-7B-v0.1
model-index:
- name: mistral-finetune-long
  results: []
---

# mistral-finetune-long

This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1). It achieves the following results on the evaluation set:
- Loss: 1.8378

## Model description

This model is fine-tuned to specialize in generating content for the environment and sustainability domain. Training combined Supervised Fine-Tuning (SFT), Parameter-Efficient Fine-Tuning (PEFT), and Low-Rank Adaptation (LoRA) to optimize model performance. The motivation behind this research is to explore the feasibility and effectiveness of Semantically Sufficient Private Large Language Models (LLMs) for secure, domain-specific knowledge extraction in the context of environment and sustainability.

## Intended uses

The model is intended for information retrieval and knowledge extraction tasks within the environment and sustainability domain.

## Training and evaluation data

The training data consists of domain-specific text collected from Wikipedia pages on environmental topics. This model was trained on the Long dataset; a companion [model trained on the Short dataset](https://huggingface.co/fionazhang/mistral-finetune-short) is also available.

| **Dataset** | **Number of URLs** | **Number of Rows** | **Number of Words** | **Number of Sentences** |
|-------------|--------------------|--------------------|---------------------|-------------------------|
| Short       | 11                 | 577                | 51,526              | 2,150                   |
| Long        | 23                 | 1,431              | 124,682             | 5,209                   |

**Table 1:** Summary of dataset information

### Environment and Sustainability

This model is tailored to the environment and sustainability domain, with a focus on assisting researchers and enterprises, particularly in alignment with the work of the Commonwealth Scientific and Industrial Research Organisation (CSIRO).

### Data Collection Process

The training data was collected with a Python program that extracted and cleaned the text content of selected Wikipedia pages on environmental topics. The program used libraries such as `requests`, `BeautifulSoup`, and `nltk` for web scraping, HTML parsing, and natural language processing.

## Training procedure

### Fine-tuning

The fine-tuning process combined Supervised Fine-Tuning (SFT), PEFT, and LoRA. SFT trained the model directly on the curated domain text, an approach well suited to generation models. PEFT updated only a small subset of parameters during fine-tuning to mitigate catastrophic forgetting. LoRA, a lightweight adaptation technique, further reduced the number of trainable parameters for faster and more memory-efficient training.
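For illustration, below is a minimal sketch of how a LoRA/SFT setup like this can be assembled with `peft` and `trl`, using the hyperparameters listed in the next section. It is not the exact training script: the dataset file, text column name, sequence length, dtype, and the assumption of a TRL version contemporary with the framework versions listed at the end of this card are all illustrative choices.

```python
# Sketch of a LoRA + SFT fine-tuning setup (not the exact training script).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig
from trl import SFTTrainer

base_model = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# dtype choice is an assumption; the card does not state the training precision.
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# LoRA parameters as listed below.
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)

# Training parameters as listed below (logging_steps=1 matches the loss figure caption).
training_args = TrainingArguments(
    output_dir="mistral-finetune-long",
    num_train_epochs=2,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=3,
    gradient_accumulation_steps=1,
    learning_rate=5e-5,
    weight_decay=0.001,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
    logging_steps=1,
    seed=42,
)

# Hypothetical file name; the Wikipedia-derived training text is not published with this card.
# The evaluation split is omitted from this sketch.
dataset = load_dataset("text", data_files={"train": "long_dataset.txt"})["train"]

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=1024,  # assumed sequence length
    tokenizer=tokenizer,
)
trainer.train()
```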
#### Low-Rank Adaptation (LoRA) Parameters

- lora_alpha: 16
- lora_dropout: 0.1
- r: 8

#### Training Parameters

- num_train_epochs: 2
- per_device_train_batch_size: 3
- per_device_eval_batch_size: 3
- gradient_accumulation_steps: 1
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- learning_rate: 5e-05
- weight_decay: 0.001
- max_grad_norm: 0.3
- max_steps: -1
- warmup_ratio: 0.03
- group_by_length: True
- lr_scheduler_type: constant
- seed: 42

### Training results

#### Training Loss

![Loss](https://huggingface.co/fionazhang/mistral-finetune-long/blob/main/loss-long.png)

*Figure 1: Training loss curve of model fionazhang/mistral-finetune-long (logging step = 1)*

The observed training losses are jittery but show an overall decreasing trend. The final evaluation loss reaches a satisfactory 1.8378, indicating successful learning and adaptation to the nuances of the provided data.

### Framework versions

- PEFT 0.7.1
- Transformers 4.36.2
- Pytorch 2.1.0a0+git7bcf7da
- Datasets 2.16.1
- Tokenizers 0.15.0
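As a usage illustration for the intended information-retrieval and knowledge-extraction tasks, the following is a minimal sketch of loading this LoRA adapter on top of the base model with `peft`. The prompt, dtype, and generation settings are only examples.

```python
# Sketch: load the published LoRA adapter for inference.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "fionazhang/mistral-finetune-long"
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, torch_dtype=torch.bfloat16)

prompt = "Explain the role of carbon sinks in climate regulation."  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```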