Bitext committed on
Commit 76221c9
1 Parent(s): 19620f2

Create README.md

Update README.md detailing the fine-tuned model's training data, architecture, and intended use.

Files changed (1)
  1. README.md +89 -0

README.md ADDED
@@ -0,0 +1,89 @@
---
license: apache-2.0
tags:
- axolotl
- generated_from_trainer
- text-generation-inference
base_model: mistralai/Mistral-7B-Instruct-v0.2
model_type: mistral
pipeline_tag: text-generation
model-index:
- name: Mistral-7B-Retail-v2
  results: []
---

# Mistral-7B-Retail-v2

## Model Description

"Mistral-7B-Retail-v2" is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), adapted to answer questions about retail services.

## Intended Use

- **Recommended applications**: The model is well suited to retail environments. It can be integrated into customer-service chatbots or help systems to answer common retail-related inquiries in real time.
- **Out-of-scope**: The model should not be used for medical, legal, or other safety-critical purposes.

## Usage Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bitext-llm/Mistral-7B-Retail-v2")
tokenizer = AutoTokenizer.from_pretrained("bitext-llm/Mistral-7B-Retail-v2")

# The tokenizer prepends the <s> BOS token itself, so the prompt starts at [INST].
inputs = tokenizer("[INST] How can I return a purchased item? [/INST]", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
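
Alternatively, assuming the checkpoint keeps the base model's chat template, `tokenizer.apply_chat_template` can build the `[INST]` formatting automatically. A minimal sketch reusing `model` and `tokenizer` from above:

```python
# Build the Mistral instruction format from a chat-style message list.
messages = [{"role": "user", "content": "How can I return a purchased item?"}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")

outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```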

## Model Architecture

"Mistral-7B-Retail-v2" uses the `MistralForCausalLM` architecture with a `LlamaTokenizer`. It keeps the base model's configuration unchanged; fine-tuning updated only the weights so that the model responds better to retail-related questions.

## Training Data

The model was trained on a dataset built specifically for retail question-and-answer interactions. The dataset spans 46 distinct intents, including `add_product`, `availability_in_store`, `cancel_order`, `pay`, `refund_policy`, `track_order`, and `use_app`, reflecting common retail transactions and customer-service interactions. Each intent contains 1,000 examples, giving the model coverage of a wide range of retail situations.

This breadth lets the model understand and respond to a wide array of retail-related queries in customer-service applications. The dataset follows the same structured approach as other Bitext datasets published on Hugging Face, tailored here to the retail sector.
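
For illustration only, a single training record might pair an intent label with an instruction and a response. The field names and text below are assumptions, since this card does not publish the schema:

```python
# Hypothetical record layout: field names and values are illustrative
# assumptions, not the published dataset schema.
example = {
    "intent": "track_order",
    "instruction": "where is my order right now?",
    "response": "You can check the current status of your order from ...",
}
```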

## Training Procedure

### Hyperparameters

- **Optimizer**: AdamW
- **Learning Rate**: 0.0002
- **Epochs**: 1
- **Batch Size**: 8
- **Gradient Accumulation Steps**: 4
- **Maximum Sequence Length**: 1024 tokens
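
These hyperparameters map roughly onto `transformers.TrainingArguments` as sketched below. This is a sketch under assumptions, not the actual training script: the `axolotl` tag suggests training ran through Axolotl, whose config is not published here, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters; omitted settings are defaults.
args = TrainingArguments(
    output_dir="mistral-7b-retail-v2",  # placeholder, not the real run directory
    optim="adamw_torch",                # AdamW
    learning_rate=2e-4,                 # 0.0002
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
)

# The 1024-token maximum sequence length is enforced when tokenizing the
# dataset (e.g. truncation=True, max_length=1024), not via TrainingArguments.
```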

### Environment

- **Transformers Version**: 4.40.0.dev0
- **Framework**: PyTorch 2.2.1+cu121
- **Tokenizers Version**: 0.15.0
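
To verify that a local setup matches these versions, the installed packages can be printed; a convenience sketch:

```python
import tokenizers
import torch
import transformers

# Compare against the versions listed above.
print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("tokenizers:", tokenizers.__version__)
```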

## Limitations and Bias

- The model is trained specifically for the retail domain and may not give accurate results outside it.
- The training data may contain biases, so review the model's responses carefully before relying on them.

## Ethical Considerations

Use this model responsibly, especially in scenarios involving personal customer interactions, and take care that its use does not replace necessary human judgment in sensitive situations.

## Acknowledgments

This model was developed by Bitext and trained using their infrastructure.

## License

"Mistral-7B-Retail-v2" is licensed under the Apache License 2.0 by Bitext Innovations International, Inc. The license allows anyone to use, modify, and distribute the model freely, provided Bitext is credited.

### Key Points of the Apache 2.0 License

- **Permissive use**: Free use, modification, and distribution.
- **Attribution**: Credit must be given to Bitext, in line with the copyright notices and the license.
- **No Warranty**: The model is provided without any guarantees.

For complete details, see the [Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0).