maitykritadhi committed on
Commit
0f17258
1 Parent(s): 50bc81c

Upload config.py

  1. config.py +106 -0
config.py ADDED
@@ -0,0 +1,106 @@
+ import os
+ from langchain_groq import ChatGroq
+ # from sentence_transformers import SentenceTransformer
+ from langchain_community.embeddings import SentenceTransformerEmbeddings
+
+
+ path = os.path.abspath(__file__)
+ BASE_PATH = os.path.dirname(path)
+
+ GROQ_API_KEY = os.environ["GROQ_API_KEY"]  # set in the shell; never commit a secret to the repository
+
+ # embeddings = SentenceTransformer("all-mpnet-base-v2")
+ embeddings = SentenceTransformerEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
+
+ db_collection_name = "infosys_docs"
+
+ max_chunk_tokens = 1000
+ chunk_overlap_size = 100
+ pdf_download_path = os.path.join(BASE_PATH, "docs")
+
+ template = """
+ You are a chatbot created to answer questions about the internal audit documents of companies.
+ Answer the question in as much detail as possible from the provided context, and make sure to provide all the details.
+ If the answer is not in the provided context, just say "answer is not available in the context"; don't provide a wrong answer.\n\n
+ Context:\n {context}\n
+ Question: \n{question}\n
+
+ Answer:
+ """
+
+ doc_classification_prompt = """
+ You will be given text from an internal audit document. An internal audit document can be one of the 3 types below.
+ 1) compliance_report : A compliance audit report for a company evaluates its adherence to regulatory requirements, internal policies, and industry standards, detailing any deviations and suggesting corrective actions. It typically contains an overview of the audit scope, methodology, findings, non-compliance issues, and recommendations for ensuring compliance.
+ 2) financial_report : A financial audit report provides an independent assessment of a company's financial statements, ensuring their accuracy and fairness. This file includes the auditor's opinion, a summary of the financial statements, an evaluation of internal controls, and notes on any discrepancies or irregularities discovered during the audit process.
+ 3) environment_report : An environmental audit report for a company is a comprehensive document that assesses the company's compliance with environmental regulations, evaluates its environmental performance, and identifies areas for improvement. This file typically includes details on waste management, energy use, emissions, water usage, regulatory compliance, and recommendations for reducing environmental impact. It serves as both a benchmark for current practices and a roadmap for future environmental strategies.
+
+ Given the text below, classify the document as one of the above 3 document types.
+ Additionally, identify the year when the audit report was created.
+
+ Given Text: {text}
+
+ Return the output as the JSON below only:
+ {{
+ "type" : "document type", # Choose one of the following audit document types -> compliance_report, environment_report, financial_report
+ "year" : "audit report year" # year when the audit report was created, in yyyy format
+ }}
+ Don't return anything other than JSON.
+ JSON OUTPUT:
+
+ """
+
+ query_classification = """
+ You will be given a query for which information needs to be extracted from the internal audit reports.
+ Internal audit documents can be of three types as below.
+
+ 1) compliance_report : A compliance audit report for a company evaluates its adherence to regulatory requirements, internal policies, and industry standards, detailing any deviations and suggesting corrective actions. It typically contains an overview of the audit scope, methodology, findings, non-compliance issues, and recommendations for ensuring compliance.
+ 2) financial_report : A financial audit report provides an independent assessment of a company's financial statements, ensuring their accuracy and fairness. This file includes the auditor's opinion, a summary of the financial statements, an evaluation of internal controls, and notes on any discrepancies or irregularities discovered during the audit process.
+ 3) environment_report : An environmental audit report for a company is a comprehensive document that assesses the company's compliance with environmental regulations, evaluates its environmental performance, and identifies areas for improvement. This file typically includes details on waste management, energy use, emissions, water usage, regulatory compliance, and recommendations for reducing environmental impact. It serves as both a benchmark for current practices and a roadmap for future environmental strategies.
+
+ Identify which of the above 3 internal audit document types can answer the question asked below.
+ Also identify the year(s) of the reports that need to be considered to answer the question. Please return the year in
+ yyyy format.
+
+ If the question cannot be classified as one of the above 3 internal audit document types, return an empty Python list in the output JSON "type" key.
+ If the year is not mentioned in the question, return an empty Python list in the output JSON "year" key.
+ Return the output in the JSON format below only. Don't return anything other than JSON.
+
+ sample question: What is the net revenue of Infosys in the year 2024?
+ sample response: {{
+ "type": ["financial_report"], # list of types
+ "year" : ["2024"] # list of years
+ }}
+
+ sample question: How much e-waste was generated by Infosys, and what was the net revenue of Infosys, in FY2020 and FY 2021?
+ sample response: {{
+ "type": ["financial_report", "environment_report"], # list of types
+ "year" : ["2020", "2021"] # list of years
+ }}
+
+ sample question: What are the requirements of the data science profile?
+ sample response: {{
+ "type": [], # empty list in case no type is identified
+ "year" : [] # empty list in case no year is identified
+ }}
+
+ question: {question}
+ response:
+ """
+
+
+ def get_llm_model(model):
+     if model == 'gemma-7b-it':
+         llm = ChatGroq(temperature=0, model_name="gemma-7b-it")
+     elif model == 'mixtral-8x7b-32768':
+         llm = ChatGroq(temperature=0, model_name="mixtral-8x7b-32768")
+     elif model == 'llama3-70b-8192':
+         llm = ChatGroq(temperature=0, model_name="llama3-70b-8192")
+     else:
+         llm = ChatGroq(temperature=0, model_name="llama3-8b-8192")
+     return llm
+
+
+ if __name__ == "__main__":
+     llm = get_llm_model("llama3-8b-8192")
+     print("done")
+
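The classification prompts in config.py write their JSON skeleton with doubled braces (`{{` and `}}`) so that only `{text}` or `{question}` is treated as a placeholder. The sketch below is not part of this commit. It assumes the prompt is filled with plain `str.format`, uses a shortened stand-in for `doc_classification_prompt`, and introduces two hypothetical helpers, `build_prompt` and `parse_classification`, to show how a reply might be checked against the three labels the prompt defines. The reply string is made up for illustration.

```python
import json
import re

# Shortened stand-in for doc_classification_prompt (assumption: the real
# prompt is filled the same way, with {text} as the only placeholder).
PROMPT = """
Given Text: {text}

Return the output as the JSON below only:
{{
"type" : "document type",
"year" : "audit report year"
}}
JSON OUTPUT:
"""

# The three labels defined in the prompt text.
ALLOWED_TYPES = {"compliance_report", "financial_report", "environment_report"}


def build_prompt(text):
    """Fill {text}; str.format turns each doubled brace into a literal brace."""
    return PROMPT.format(text=text)


def parse_classification(reply):
    """Hypothetical helper: extract and validate the JSON object in a model reply."""
    # Take everything from the first "{" to the last "}" in the reply.
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    data = json.loads(match.group(0))
    if data.get("type") not in ALLOWED_TYPES:
        raise ValueError(f"unexpected document type: {data.get('type')!r}")
    if not re.fullmatch(r"\d{4}", str(data.get("year", ""))):
        raise ValueError(f"unexpected year: {data.get('year')!r}")
    return data


if __name__ == "__main__":
    print(build_prompt("Auditor's opinion on the financial statements ..."))
    # A made-up reply, used only to exercise the parser.
    fake_reply = 'Here you go:\n{"type": "financial_report", "year": "2023"}'
    print(parse_classification(fake_reply))
```

Validating against a fixed label set means a reply such as `environmental_report`, which does not match the `environment_report` label used in the type definitions, fails loudly instead of silently missing every stored document.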