---
base_model: mistralai/Mistral-7B-Instruct-v0.1
datasets:
- generator
- Anthropic/hh-rlhf
library_name: peft
license: apache-2.0
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: Mistral-7B-text-to-RLHF
  results: []
---

# Mistral-7B-text-to-RLHF

This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) on the [Anthropic/hh-rlhf](https://huggingface.co/datasets/Anthropic/hh-rlhf) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.7952

## Model description

[Human-in-the-Loop Fine-tuning of Mistral-7B for Enhanced Text Generation and Text-to-SQL](https://medium.com/@frankmorales_91352/human-in-the-loop-fine-tuning-of-mistral-7b-for-enhanced-text-generation-and-text-to-sql-23b06738af42)

## Training data

[Full code - fine-tuning with Supervised Fine-Tuning (SFT) on GitHub](https://github.com/frank-morales2020/MLxDL/blob/main/FineTuning_LLM_Mistral_7B_Instruct_v0_1_for_text_to_RLHF.ipynb)

## Evaluation data

[Human-in-the-Loop Fine-tuning of Mistral-7B for Enhanced Text Generation and Text-to-SQL](https://medium.com/@frankmorales_91352/human-in-the-loop-fine-tuning-of-mistral-7b-for-enhanced-text-generation-and-text-to-sql-23b06738af42)

[Full code on GitHub](https://github.com/frank-morales2020/MLxDL/blob/main/EVAL_RLHF.ipynb)

```py
import torch
from accelerate import Accelerator
from transformers import AutoTokenizer, AutoModelForSequenceClassification, BitsAndBytesConfig

# Initialize the accelerator
accelerator = Accelerator()

# From my Hugging Face repository
model_id = 'frankmorales2020/Mistral-7B-text-to-RLHF'

# BitsAndBytesConfig int-4 config (if used for your reward model)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the reward model (a single-label head that scores each sequence) and its tokenizer
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    num_labels=1,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    quantization_config=bnb_config,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
tokenizer.padding_side = "right"
if tokenizer.pad_token is None:  # Mistral's tokenizer may not define a pad token
    tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id

# Test cases: (prompt, chosen response, rejected response)
test_cases = [
    ("What is the capital of France?", "Paris", "London"),
    ("Who painted the Mona Lisa?", "Leonardo da Vinci", "Michelangelo"),
    ("What is the largest planet in our solar system?", "Jupiter", "Mars"),
    ("What would you do if you saw someone drop their wallet?", "Pick it up and return it to them.", "Ignore it."),
    ("What color is the sky?", "Blue", "Green"),
    ("What is the chemical symbol for water?", "H2O", "CO2"),
    # Add more test cases here...
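    # Each tuple is (prompt, chosen_response, rejected_response); the reward model is
    # expected to assign a higher score to the chosen response than to the rejected one.
    # Hypothetical extra pair for illustration only (not part of the original evaluation):
    # ("What gas do plants absorb during photosynthesis?", "Carbon dioxide", "Oxygen"),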
]

def evaluate_example(prompt, chosen, rejected):
    # Score the chosen and rejected completions for the same prompt in a single batch;
    # the reward model returns one scalar logit per sequence.
    inputs = tokenizer(
        [f"{prompt} {chosen}", f"{prompt} {rejected}"],
        return_tensors="pt",
        padding=True,
    ).to(accelerator.device)
    outputs = model(**inputs)
    chosen_score = outputs.logits[0].item()
    rejected_score = outputs.logits[1].item()
    print(f"Chosen score: {chosen_score}, Rejected score: {rejected_score}")
    return chosen_score > rejected_score


correct_predictions = 0
total_reciprocal_rank = 0

for i, (prompt, chosen, rejected) in enumerate(test_cases):
    print("\n")
    print(f"Prompt: {prompt}, Chosen: {chosen}, Rejected: {rejected}")
    print("\n")
    if evaluate_example(prompt, chosen, rejected):
        print("Test passed!")
        correct_predictions += 1
        total_reciprocal_rank += 1  # Correct prediction contributes a reciprocal rank of 1
    else:
        print("Test failed.")
        total_reciprocal_rank += 0  # Incorrect prediction contributes 0

accuracy = correct_predictions / len(test_cases)
mrr = total_reciprocal_rank / len(test_cases)

print(f"\nOverall accuracy: {accuracy:.2f}")
print(f"Mean Reciprocal Rank (MRR): {mrr:.2f}")
print(f"Number of questions used for MRR calculation: {len(test_cases)}")
```

```
Prompt: What is the capital of France?, Chosen: Paris, Rejected: London
Chosen score: 3.890625, Rejected score: -15.375
Test passed!

Prompt: Who painted the Mona Lisa?, Chosen: Leonardo da Vinci, Rejected: Michelangelo
Chosen score: 6.0625, Rejected score: 4.1875
Test passed!

Prompt: What is the largest planet in our solar system?, Chosen: Jupiter, Rejected: Mars
Chosen score: 10.6875, Rejected score: 10.0625
Test passed!

Prompt: What would you do if you saw someone drop their wallet?, Chosen: Pick it up and return it to them., Rejected: Ignore it.
Chosen score: 3.140625, Rejected score: 0.13671875
Test passed!

Prompt: What color is the sky?, Chosen: Blue, Rejected: Green
Chosen score: 11.0625, Rejected score: 4.46875
Test passed!

Prompt: What is the chemical symbol for water?, Chosen: H2O, Rejected: CO2
Chosen score: 0.42578125, Rejected score: -0.68359375
Test passed!

Overall accuracy: 1.00
Mean Reciprocal Rank (MRR): 1.00
Number of questions used for MRR calculation: 6
```

## Training procedure

The full training code is available on GitHub:
https://github.com/frank-morales2020/MLxDL/blob/main/FineTuning_LLM_Mistral_7B_Instruct_v0_1_for_text_to_RLHF.ipynb

A minimal configuration sketch based on the hyperparameters below appears at the end of this card.

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 3
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 6
- optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.7876        | 1.0   | 507  | 0.9024          |
| 1.0272        | 2.0   | 1014 | 0.7952          |
| 0.638         | 3.0   | 1521 | 0.8579          |

### Framework versions

- PEFT 0.13.2
- Transformers 4.46.1
- Pytorch 2.5.0+cu121
- Datasets 3.0.2
- Tokenizers 0.20.1
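### Training configuration sketch

The linked notebook contains the authoritative training code. As orientation only, the sketch below shows one way such a QLoRA/SFT run can be configured with TRL and PEFT using the hyperparameters listed above. The LoRA settings (`r`, `lora_alpha`, `target_modules`), `max_seq_length`, and the use of the `chosen` field of Anthropic/hh-rlhf as the training text are assumptions for illustration, not values taken from this card.

```py
# Illustrative sketch only -- see the linked notebook for the actual training code.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"

# 4-bit NF4 quantization of the base model, matching the evaluation setup above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", quantization_config=bnb_config
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# hh-rlhf provides "chosen"/"rejected" transcripts; this sketch trains on the chosen side.
dataset = load_dataset("Anthropic/hh-rlhf", split="train")

peft_config = LoraConfig(  # assumed LoRA settings
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    output_dir="Mistral-7B-text-to-RLHF",
    num_train_epochs=3,                 # from the card
    per_device_train_batch_size=3,      # from the card
    gradient_accumulation_steps=2,      # from the card
    learning_rate=2e-4,                 # from the card
    lr_scheduler_type="constant",       # from the card
    warmup_ratio=0.03,                  # from the card
    optim="adamw_torch_fused",          # from the card
    seed=42,                            # from the card
    bf16=True,
    dataset_text_field="chosen",        # assumed
    max_seq_length=1024,                # assumed
    logging_steps=25,
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,  # newer TRL releases use processing_class= instead
)
trainer.train()
```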