
Mistral-7B-Retail-Banking-v1

Model Description

This model, "Mistral-7B-Retail-Banking-v1," is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2, tailored for the retail banking domain. It is optimized to answer customer questions and assist with common banking transactions.

Intended Use

  • Recommended applications: This model is ideal for retail banks to deploy in chatbots, virtual assistants, and copilots, giving customers fast and accurate answers to their banking questions.
  • Out-of-scope: It should not be used for inquiries unrelated to banking, or to provide medical, legal, or safety-critical advice.

Usage Example

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bitext-llm/Mistral-7B-Retail-Banking-v1")
tokenizer = AutoTokenizer.from_pretrained("bitext-llm/Mistral-7B-Retail-Banking-v1")

# Wrap the user request in the Mistral [INST] instruction format
inputs = tokenizer("<s>[INST] How do I open a new bank account? [/INST]", return_tensors="pt")

# max_new_tokens bounds the generated reply; max_length=50 would count the
# prompt tokens as well and truncate the answer
outputs = model.generate(inputs["input_ids"], attention_mask=inputs["attention_mask"], max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
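For multi-turn conversations, the prompt must keep following the Mistral instruct template. A minimal sketch of a helper that builds such a prompt (`build_prompt` is illustrative and not part of this model; `tokenizer.apply_chat_template` is the more robust route, since it applies the exact template shipped with the tokenizer):

```python
def build_prompt(history, user_message):
    """Build a Mistral-instruct prompt.

    history: list of (user, assistant) turn pairs already exchanged
    user_message: the new, unanswered user request

    Note: spacing follows the common "<s>[INST] ... [/INST]" convention;
    the tokenizer's chat template is authoritative for the exact format.
    """
    prompt = "<s>"
    for user, assistant in history:
        prompt += f"[INST] {user} [/INST] {assistant}</s>"
    prompt += f"[INST] {user_message} [/INST]"
    return prompt

print(build_prompt([], "How do I open a new bank account?"))
# <s>[INST] How do I open a new bank account? [/INST]
```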

Model Architecture

The model uses the MistralForCausalLM architecture with a LlamaTokenizer. It retains the core features of the base model but is enhanced to address the specific needs of retail banking.

Training Data

The model was trained using a dataset designed for retail banking interactions, now publicly available on Hugging Face as bitext/Bitext-retail-banking-llm-chatbot-training-dataset. This dataset comprises 26 different intents such as check_balance, transfer_money, open_account, and more, each with around 1000 examples.

The dataset includes:

  • 25,545 question/answer pairs
  • 4.98 million tokens
  • 1224 entity/slot types

Each entry consists of:

  • Instruction: User request
  • Category: High-level semantic category
  • Intent: Specific intent of the user request
  • Response: Example response from a virtual assistant

The dataset covers a wide range of banking-related categories such as ACCOUNT, ATM, CARD, CONTACT, FEES, FIND, LOAN, PASSWORD, and TRANSFER, ensuring comprehensive training for handling diverse retail banking queries.
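Putting the fields together, a single training record can be sketched as follows (the values are illustrative, not copied from the dataset):

```python
example = {
    "instruction": "I'd like to transfer money to my savings account",  # user request
    "category": "TRANSFER",                                             # high-level semantic category
    "intent": "transfer_money",                                         # one of the 26 intents
    "response": "Sure! To make a transfer, log in to your account, "
                "select Transfers, and choose the destination account.",
}

# A supervised fine-tuning pair is simply instruction -> response
print(example["intent"])
```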

Training Procedure

Hyperparameters

  • Optimizer: AdamW
  • Learning Rate: 0.0002
  • Epochs: 1
  • Batch Size: 8
  • Gradient Accumulation Steps: 4
  • Maximum Sequence Length: 1024 tokens
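With gradient accumulation, gradients from several forward/backward passes are summed before each optimizer step, so the effective batch size is the per-device batch size times the accumulation steps. A quick sanity check of the values above:

```python
per_device_batch_size = 8
gradient_accumulation_steps = 4

# An optimizer step is taken once every 4 micro-batches of 8 examples each
effective_batch_size = per_device_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 32
```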

Environment

  • Transformers Version: 4.40.0.dev0
  • Framework: PyTorch 2.2.1+cu121
  • Tokenizers: 0.15.0

Limitations and Bias

  • The model is specifically trained for retail banking and may not yield accurate results outside this field.
  • There may be biases in the training data, which could influence the model's responses, particularly in nuanced financial scenarios.

Ethical Considerations

Care should be taken to ensure that the model's automated responses do not replace professional human advice where necessary, particularly in complex financial situations.

Acknowledgments

This model was developed and trained by Bitext using their proprietary technologies and resources.

License

"Mistral-7B-Retail-Banking-v1" is licensed under the Apache License 2.0 by Bitext Innovations International, Inc. This license permits free use, modification, and distribution, but it requires appropriate attribution to Bitext.

Key Points of the Apache 2.0 License

  • Permissibility: Free use, modification, and distribution of the software.
  • Attribution: Proper credit must be given to Bitext Innovations International, Inc.
  • No Warranty: The model is provided "as is," without any warranties.

For the full terms, see the Apache License 2.0 text at https://www.apache.org/licenses/LICENSE-2.0.
