dataeaze
/

dataeaze-RegLLM-zephyr_7b_beta-dzcompli

+---
+license: cc-by-nc-sa-4.0
+language:
+- en
+library_name: transformers
+pipeline_tag: text-generation
+tags:
+- finance
+- legal
+---
+# Model Card for dataeaze-RegLLM-zephyr_7b_beta-dzcompli
+<!-- Provide a quick summary of what the model is/does. -->
+RegLLM is a large language model (LLM) for regulatory compliance. It has been domain-adapted through unsupervised pretraining and then instruction-finetuned for regulatory compliance.
+This release focuses on Indian banking rules and regulations.
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [dataeaze systems pvt ltd](https://www.dataeaze.io/)
+- **Funded by:** [dataeaze systems pvt ltd](https://www.dataeaze.io/)
+- **Shared by:** [dataeaze systems pvt ltd](https://www.dataeaze.io/)
+- **Model type:** MistralForCausalLM
+- **Language(s) (NLP):** English
+- **License:** [cc-by-nc-sa-4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en). The model is made available for non-commercial research use only. For commercial use, please contact contactus@dataeaze.io.
+- **Finetuned from model:** [zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+The model has been crafted to provide precise and insightful answers to a wide array of queries related to Indian banking regulations.
+### Downstream Use
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+This model can be used as a core component of RegTech applications.
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+The model has been finetuned for the specific task of answering questions related to Indian regulatory compliance.
+Accuracy is not guaranteed for any use beyond this.
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+- **Bias:** Trained for the English language only (as of now).
+- **Risk:** Guardrails rely on those of the base models (Mistral/Zephyr). Finetuning could have affected this behaviour.
+- **Limitations:** Intended to be a small model optimised for Indian regulations (as of now).
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+* This model is intended as an assistive AI technology. Verify its answers against the source documents before making decisions.
+* This model should be used with its responses grounded in a set of regulatory documents.
+## How to Get Started with the Model
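+```python
+import torch
+from transformers import pipeline
+# Load the model as a text-generation pipeline. bfloat16 uses half the memory
+# of float32, and device_map="auto" places the weights on the available device(s).
+pipe = pipeline("text-generation",
+                model="dataeaze/dataeaze-RegLLM-zephyr_7b_beta-dzcompli",
+                torch_dtype=torch.bfloat16,
+                device_map="auto")
+# Chat-style input: a system instruction followed by the user question.
+messages = [
+    {
+        "role": "system",
+        "content": "You are a compliance assistant who answers in a formal manner",
+    },
+    {"role": "user", "content": "How often should IRRBB policies be reviewed?"},
+]
+# Render the messages into a single prompt string using the tokenizer's chat template.
+prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+# A low temperature keeps the sampled answer close to the most likely one.
+outputs = pipe(prompt, max_new_tokens=120, do_sample=True, temperature=0.1, top_k=50, top_p=0.95)
+# generated_text contains the prompt followed by the model's answer.
+print(outputs[0]["generated_text"])
+```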
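As the sample output in this card shows, the printed `generated_text` echoes the prompt before the answer. Below is a minimal sketch of a helper that keeps only the assistant's reply. It assumes the `<|assistant|>` marker seen in the sample output is present in the text. `extract_assistant_reply` is a hypothetical name used for illustration, not part of `transformers`.

```python
def extract_assistant_reply(generated_text: str, marker: str = "<|assistant|>") -> str:
    """Return the text after the last assistant marker, or the whole text if the marker is absent."""
    # rpartition splits on the LAST occurrence; `found` is empty when the marker is missing.
    _, found, reply = generated_text.rpartition(marker)
    return reply.strip() if found else generated_text.strip()


# Illustrative input shaped like the sample output in this card.
sample = (
    "<|system|>\nYou are a compliance assistant who answers in a formal manner</s>\n"
    "<|user|>\nHow often should IRRBB policies be reviewed?</s>\n"
    "<|assistant|>\nAt least annually."
)
print(extract_assistant_reply(sample))
```

Alternatively, the text-generation pipeline accepts `return_full_text=False`, which returns only the newly generated text instead of the prompt plus the answer.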
+## Sample Output
+### Question
+How often should IRRBB policies be reviewed?
+### RegLLM response
+```
+<|system|>
+You are a compliance assistant who answers in a formal manner</s>
+<|user|>
+How often should IRRBB policies be reviewed?</s>
+<|assistant|>
+At least annually.
+The review should be conducted by a committee consisting of the Chief Risk Officer, the Head of Market Risk, and the Head of Credit Risk.
+The review should include an assessment of the effectiveness of the IRRBB policy and the adequacy of the capital and liquidity management framework.
+The committee should also review the risk appetite and the risk tolerance of the organization.
+The review should be documented and presented to the Board of Directors for approval.
+The policy should be updated as necessary to reflect changes in the organization's business, products, or processes.
+```
+### GPT-4 response
+![gpt4 response](./gpt4_reponse.png)
+### Reference
+To evaluate the truthfulness of this response (that is, to check for hallucination), refer to RBI notification
+[RBI/2022-23/180 DOR.MRG.REC.102/00-00-009/2022-23](https://rbidocs.rbi.org.in/rdocs/notification/PDFs/NOTI180CF30A8446A704C11BD8267A8D0BB2AC2.PDF) (page 8).
+A screenshot of the relevant passage is shown below.
+<img src="rbi_reference.png" alt="drawing" width="500"/>
+As you can see, RegLLM has identified the required review frequency for IRRBB policies, while GPT-4 provides a more general response.
+Note that this RegLLM response was produced without any external knowledge source.
+When coupled with a retriever model, RegLLM can provide fairly precise responses to user queries related to regulatory compliance.
+Keep watching this space for more updates on the model and evaluations.
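Coupling the model with a retriever can be sketched as follows. This is an illustration only, under stated assumptions: the keyword-overlap `retrieve` function stands in for a real retriever model, the passages are placeholders rather than actual regulatory text, and `build_grounded_messages` is a hypothetical helper name. The resulting messages have the same shape as those passed to `apply_chat_template` in the getting-started example.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, top_k=2):
    """Stand-in retriever (assumption): rank passages by token overlap with the question."""
    question_tokens = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(question_tokens & tokenize(p)), reverse=True)
    return ranked[:top_k]


def build_grounded_messages(question, passages, top_k=2):
    """Build chat messages that place the retrieved passages in front of the question."""
    selected = retrieve(question, passages, top_k)
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(selected))
    return [
        {
            "role": "system",
            "content": "You are a compliance assistant who answers in a formal manner. "
                       "Answer only from the numbered context passages.",
        },
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]


# Placeholder passages; replace these with text from the actual regulatory documents.
passages = [
    "PLACEHOLDER: text of a clause on IRRBB policies",
    "PLACEHOLDER: text of a clause on KYC norms",
    "PLACEHOLDER: text of a clause on priority sector lending",
]
messages = build_grounded_messages("How often should IRRBB policies be reviewed?", passages, top_k=1)
print(messages[1]["content"])
```

In practice the overlap scoring would be replaced by a proper retriever (for example an embedding-based one), and the system instruction would be tuned to the application.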
+## Model Card Authors
+* Atharva Inamdar
+* Niranjan Kakade
+* Tony Tom
+* Nayan Chheda
+* Sourabh Daptardar
+## Model Card Contact
+"dataeaze systems" <contactus@dataeaze.io>