Commit 96dcfd2 by Bitext (parent: 18ebf54): Create README.md
Update README.md detailing the fine-tuned model's training data, architecture, and intended use.

README.md ADDED
---
license: apache-2.0
tags:
- axolotl
- generated_from_trainer
- text-generation-inference
base_model: mistralai/Mistral-7B-Instruct-v0.2
model_type: mistral
pipeline_tag: text-generation
model-index:
- name: Mistral-7B-Mortgage-Loans-v2
  results: []
---

# Mistral-7B-Mortgage-Loans-v2

## Model Description

This model, "Mistral-7B-Mortgage-Loans-v2", is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) developed specifically to address queries about mortgages and loans. It provides answers that help users understand complex loan processes and mortgage applications.

## Intended Use

- **Recommended applications**: This model is particularly useful for financial institutions, mortgage brokers, and loan providers. It is designed to integrate into customer support systems to help users understand their loan options, mortgage details, and payment plans.
- **Out-of-scope**: This model is not designed for non-financial inquiries and should not be used to provide legal, medical, or any other advice outside its area of financial expertise.
25 |
+
|
26 |
+
## Usage Example
|
27 |
+
|
28 |
+
```python
|
29 |
+
from transformers import AutoModelForCausalLM, AutoTokenizer
|
30 |
+
|
31 |
+
model = AutoModelForCausalLM.from_pretrained("bitext-llm/Mistral-7B-Mortgage-Loans-v2")
|
32 |
+
tokenizer = AutoTokenizer.from_pretrained("bitext-llm/Mistral-7B-Mortgage-Loans-v2")
|
33 |
+
|
34 |
+
inputs = tokenizer("<s>[INST] What are the requirements for a home loan? [/INST]", return_tensors="pt")
|
35 |
+
outputs = model.generate(inputs['input_ids'], max_length=50)
|
36 |
+
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
|
37 |
+
```
|
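The hard-coded prompt above follows Mistral's `[INST] ... [/INST]` instruction format. A small helper for building such prompts might look like the sketch below; the function name is hypothetical and not part of the released model:

```python
def build_mistral_prompt(question: str) -> str:
    """Wrap a user question in Mistral's instruction format.

    The <s> BOS marker and [INST] tags match the prompt shown in the
    usage example above; drop the <s> if your tokenizer already adds
    a BOS token.
    """
    return f"<s>[INST] {question.strip()} [/INST]"

prompt = build_mistral_prompt("What are the requirements for a home loan?")
print(prompt)  # <s>[INST] What are the requirements for a home loan? [/INST]
```

Alternatively, recent `transformers` versions can produce the same format via the tokenizer's chat template (`tokenizer.apply_chat_template`).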

## Model Architecture

The model uses the `MistralForCausalLM` architecture with a `LlamaTokenizer`. It retains the fundamental characteristics of the base model while being optimized to understand and generate responses about mortgages and loans.

## Training Data

The model was trained on a dataset built specifically for the mortgage and loan sector, covering 39 intents including `apply_for_loan`, `check_loan_terms`, `refinance_loan`, `customer_service`, and many others, each with nearly 1000 examples. This coverage equips the model to address a broad spectrum of inquiries within the domain.
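To make the intent-based structure concrete, a single training record might look like the sketch below; the field names and example text are illustrative assumptions, not the actual schema of Bitext's dataset:

```python
# Hypothetical shape of one intent-labeled training record; the real
# dataset's field names and content may differ.
record = {
    "intent": "apply_for_loan",
    "instruction": "I want to apply for a mortgage. What do I need?",
    "response": "To apply for a mortgage you typically need proof of income, "
                "a credit check, and documentation of your assets and debts.",
}

# 39 intents with nearly 1000 examples each implies a corpus on the
# order of 39,000 examples.
approx_examples = 39 * 1000
print(approx_examples)  # 39000
```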

## Training Procedure

### Hyperparameters

- **Optimizer**: AdamW
- **Learning Rate**: 0.0002
- **Epochs**: 1
- **Batch Size**: 8
- **Gradient Accumulation Steps**: 4
- **Maximum Sequence Length**: 1024 tokens
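The batch-size and accumulation settings above determine the effective batch size. The sketch below derives it, and also estimates optimizer steps per epoch under the assumption of roughly 39,000 training examples (39 intents times ~1000 each); that count is an estimate, not a figure stated on this card:

```python
# Per-device batch size and accumulation steps from the list above.
batch_size = 8
grad_accum_steps = 4

# Gradients are accumulated over 4 micro-batches before each optimizer
# step, so the effective batch size per device is 8 * 4 = 32.
effective_batch_size = batch_size * grad_accum_steps
print(effective_batch_size)  # 32

# Assuming ~39,000 examples, one epoch is on the order of
# 39,000 // 32 = 1218 optimizer steps.
approx_steps = 39_000 // effective_batch_size
print(approx_steps)  # 1218
```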

### Environment

- **Transformers Version**: 4.40.0.dev0
- **Framework**: PyTorch 2.2.1+cu121
- **Tokenizers**: 0.15.0

## Limitations and Bias

- The model is fine-tuned on a domain-specific dataset and may not perform well outside the scope of financial advice.
- Users should be aware of potential biases in the training data, as the model's responses may inadvertently reflect them. Because the dataset targets general mortgage and loan questions, additional biases may surface in more specialized use cases.
68 |
+
|
69 |
+
## Ethical Considerations
|
70 |
+
|
71 |
+
This model should be used responsibly, considering ethical implications of automated financial advice. As it is a base model for this financial field, it is crucial to ensure that the model's advice complements human expertise and adheres to relevant financial regulations.
|

## Acknowledgments

This model was developed by Bitext and trained on infrastructure provided by Bitext.

## License

This model, "Mistral-7B-Mortgage-Loans-v2", is licensed under the Apache License 2.0 by Bitext Innovations International, Inc. This open-source license allows free use, modification, and distribution of the model, but requires that proper credit be given to Bitext.

### Key Points of the Apache 2.0 License

- **Permissibility**: Users may use, modify, and distribute this software freely.
- **Attribution**: You must credit Bitext Innovations International, Inc. when using this model, in accordance with the original copyright notices and the license.
- **Patent Grant**: The license includes a grant of patent rights from the contributors of the model.
- **No Warranty**: The model is provided "as is", without warranties of any kind.

You may view the full license text at [Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0).

This licensing ensures the model can be used widely and freely while respecting Bitext's intellectual contributions. For more detailed information or specific legal questions about this license, please refer to the official license text linked above.