---
license: apache-2.0
tags:
- axolotl
- generated_from_trainer
- text-generation-inference
base_model: mistralai/Mistral-7B-Instruct-v0.2
model_type: mistral
pipeline_tag: text-generation
model-index:
- name: Mistral-7B-Mortgage-Loans-v1
  results: []
---

# Mistral-7B-Mortgage-Loans-v1

## Model Description

This model, "Mistral-7B-Mortgage-Loans-v1," is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) developed specifically to address queries related to mortgages and loans. It provides answers that help users navigate complex loan processes and mortgage applications.

## Intended Use

- **Recommended applications**: This model is particularly useful for financial institutions, mortgage brokers, and loan providers. It is designed to integrate into customer support systems to help users understand their loan options, mortgage details, and payment plans.
- **Out-of-scope**: This model is not designed for non-financial inquiries and should not be used to provide legal, medical, or other advice outside its financial domain.

## Usage Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bitext-llm/Mistral-7B-Mortgage-Loans-v1")
tokenizer = AutoTokenizer.from_pretrained("bitext-llm/Mistral-7B-Mortgage-Loans-v1")

inputs = tokenizer("[INST] What are the requirements for a home loan? [/INST]", return_tensors="pt")
# max_new_tokens bounds only the generated continuation, so the prompt
# length does not eat into the response budget
outputs = model.generate(inputs["input_ids"], max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Model Architecture

The model uses the `MistralForCausalLM` architecture together with a `LlamaTokenizer`. It retains the fundamental characteristics of the base model while being optimized to understand and generate responses in the context of mortgages and loans.
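As the usage example shows, the base model expects Mistral's `[INST] … [/INST]` instruction format. For multi-turn conversations, a small helper along the following lines can assemble the prompt string. This is an illustrative sketch only (the function name `build_mistral_prompt` is ours, not part of any library); in production, `tokenizer.apply_chat_template()` is the more robust option since it tracks the official template.

```python
def build_mistral_prompt(turns):
    """Assemble a Mistral-instruct prompt from alternating
    (user, assistant) turns. The final user turn is left open
    (assistant message set to None) for the model to complete.
    Illustrative sketch; prefer tokenizer.apply_chat_template()."""
    prompt = ""
    for user_msg, assistant_msg in turns:
        prompt += f"[INST] {user_msg} [/INST]"
        if assistant_msg is not None:
            # Close each completed assistant turn with the EOS marker
            prompt += f" {assistant_msg}</s>"
    return prompt

# Single-turn query, matching the usage example above:
prompt = build_mistral_prompt(
    [("What are the requirements for a home loan?", None)]
)
```

Note that the tokenizer normally prepends the BOS token itself, so the helper does not emit `<s>`.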
## Training Data

The model was trained on a dataset designed specifically for the mortgage and loan sector, featuring 39 intents, including `apply_for_loan`, `check_loan_terms`, `refinance_loan`, and `customer_service`, each with nearly 1000 examples. This rich dataset ensures the model's proficiency in addressing a broad spectrum of inquiries within this domain. The dataset follows the same structured approach as our dataset published on Hugging Face as [bitext/Bitext-customer-support-llm-chatbot-training-dataset](https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset), but with a focus on mortgages and loans.

## Training Procedure

### Hyperparameters

- **Optimizer**: AdamW
- **Learning Rate**: 0.0002 with a cosine learning rate scheduler
- **Epochs**: 4
- **Batch Size**: 10
- **Gradient Accumulation Steps**: 8
- **Maximum Sequence Length**: 8192 tokens

### Environment

- **Transformers**: 4.40.0.dev0
- **Framework**: PyTorch 2.2.1+cu121
- **Tokenizers**: 0.15.0

## Limitations and Bias

- The model is fine-tuned on a domain-specific dataset and may not perform well outside the scope of financial advice.
- Users should be aware of potential biases in the training data, which the model's responses may inadvertently reflect. Because the training data answers general mortgage and loan questions, biases may surface in more specialized use cases.

## Ethical Considerations

This model should be used responsibly, with attention to the ethical implications of automated financial advice. As a base model for this financial field, its advice should complement human expertise and adhere to relevant financial regulations.

## Acknowledgments

This model was developed by Bitext and trained on infrastructure provided by Bitext.
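With a per-device batch size of 10 and 8 gradient accumulation steps, each optimizer update effectively sees 80 examples (per device). A quick sanity check, using the values listed under Hyperparameters above:

```python
# Values copied from the Hyperparameters list above
batch_size = 10                  # per-device batch size
gradient_accumulation_steps = 8  # micro-batches per optimizer step

# Effective examples consumed per optimizer update (per device);
# multiply further by the number of GPUs in a multi-device run
effective_batch_size = batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 80
```

Gradient accumulation trades memory for wall-clock time: each update behaves like a batch of 80 while only 10 examples reside on the device at once.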
## License

This model, "Mistral-7B-Mortgage-Loans-v1", is licensed under the Apache License 2.0 by Bitext Innovations International, Inc. This open-source license allows free use, modification, and distribution of the model, but requires that proper credit be given to Bitext.

### Key Points of the Apache 2.0 License

- **Permissive use**: Users may use, modify, and distribute this software freely.
- **Attribution**: You must provide proper credit to Bitext Innovations International, Inc. when using this model, in accordance with the original copyright notices and the license.
- **Patent Grant**: The license includes a grant of patent rights from the contributors of the model.
- **No Warranty**: The model is provided "as is" without warranties of any kind.

You may view the full license text at [Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0).

This licensing ensures the model can be used widely and freely while respecting the intellectual contributions of Bitext. For more detailed information or specific legal questions about this license, please refer to the official license documentation linked above.