QagentS committed a45fc81 (1 parent: f5dd18b)

Update README.md

Files changed (1): README.md (+175, -1)
---
license: apache-2.0
datasets:
- PipableAI/pip-txt-to-sql-spider-bird-dataset
language:
- en
metrics:
- accuracy
tags:
- document
- code
- text2sql
- instruction_tuned
- basemodel
- jax
- pytorch
- tensorflow
- text-generation-inference
library_name: transformers
pipeline_tag: text-generation
widget:
- text: "<schema>CREATE TABLE system(JobID: String,GID: String, UID: String, Start:Time(yyyy/mm/dd), End: Time,ElapsedRaw: Time, CPUTimeRAW: Time,NCPUS: Number,NNodes: Number, NodeList: List, State:String, Timelimit: Time);</schema><question>Get UID and job id for jobs that started on Jan 20, 2023, ended on Feb 14, 2023, and have job id 20</question><sql>"
  example_title: "example"
---
# pip-parse

[pipableAi](https://www.linkedin.com/company/pipable.ai/about/)

[colab_notebook]()

## What have we built?

A 1.3B-parameter code-documentation model that outperforms most models at documenting code and making your in-house libraries ready for LLM and RAG pipelines.
We have also open-sourced a parsing library to go with it; together, the library and the model can turn your codebase into a functional parse tree ready to be consumed by LLMs for complex tasks.
This is a further-trained version of pip-sql-1.3b.

## How we built it?

We used softmax cross-entropy and a modified form of policy gradient along with a Q loss, optimized in an EM setup.
Loss behaviour in the setup mentioned above:

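The exact training objective is not published here; as a loose, hypothetical sketch of how a token-level cross-entropy term, a reward-weighted policy-gradient term, and a Q-value regression loss could be combined (all tensors, weightings, and the equal-weight sum below are made up for illustration, not the actual training code):

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over the last axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def combined_loss(logits, targets, rewards, q_pred, q_target):
    """Illustrative combined objective: cross entropy on target tokens,
    a REINFORCE-style policy-gradient term weighted by rewards,
    and an L2 regression loss on Q-value estimates."""
    probs = softmax(logits)                       # (batch, vocab)
    n = len(targets)
    log_p = np.log(probs[np.arange(n), targets])  # log-prob of each target token
    ce = -log_p.mean()                            # cross-entropy term
    pg = -(rewards * log_p).mean()                # policy-gradient term
    q = ((q_pred - q_target) ** 2).mean()         # Q loss (MSE)
    return ce + pg + q                            # illustrative equal weighting
```

With zero rewards and matching Q estimates, only the cross-entropy term remains, which makes the sketch easy to sanity-check.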
## License

The model is open source under the Apache 2.0 license.

## Usage

### Installation

```bash
pip install transformers
```

### Prompt
```python
# `code` holds the source you want documented
prompt = f"""<code>{code}</code>
<question>Document the code above</question>
<doc>"""
```
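As a quick check of the template (using a hypothetical toy snippet; any source string works), filling it in looks like:

```python
# Hypothetical toy input for illustration
code = '''def add(a, b):
    return a + b'''

# Wrap the source in the model's tag-based prompt template
prompt = f"""<code>{code}</code>
<question>Document the code above</question>
<doc>"""
print(prompt)
```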

### PyTorch
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"
model = AutoModelForCausalLM.from_pretrained("PipableAI/pip-parser").to(device)
tokenizer = AutoTokenizer.from_pretrained("PipableAI/pip-parser")

inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=300)
doc = tokenizer.decode(outputs[0], skip_special_tokens=True).split('<doc>')[-1].split('</doc>')[0]
print(doc)
```

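The decode-and-split step can be isolated into a small helper for testing; a minimal sketch (the tag-splitting logic mirrors the snippet above, and the sample output string is made up):

```python
def extract_doc(generated: str) -> str:
    # Keep only the text between the last <doc> and the next </doc>
    return generated.split('<doc>')[-1].split('</doc>')[0]

# Made-up model output for illustration
sample = ("<code>def f(): pass</code>"
          "<question>Document the code above</question>"
          "<doc>Defines a no-op function f.</doc>")
print(extract_doc(sample))
```

Splitting on the last `<doc>` skips the copy of the tag that the model echoes back from the prompt.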
## Examples

### Code
```python
<code>
###########################
# Generate Analytical Model
###########################
##################################################
# func: get_np_array_transition_probability_matrix
##################################################
def get_np_array_transition_probability_matrix(int_num_states, np_array_A_matrix):
    print('np_array_A_matrix:')
    print(np_array_A_matrix)
    #####################################################
    # Perturb the adjacency matrix to avoid singularities
    #####################################################
    np_array_A_matrix += (np.full((int_num_states, int_num_states), float_eps) - (np.identity(int_num_states) * float_eps))
    print('np_array_A_matrix:')
    print(np_array_A_matrix)
    print('np_array_D_matrix:')
    np_array_D_matrix = np.diag(np.sum(np_array_A_matrix, axis=1))
    print(np_array_D_matrix)
    print('np_array_D_matrix_inv:')
    np_array_D_matrix_inv = np.linalg.inv(np_array_D_matrix)
    print(np_array_D_matrix_inv)
    print('\n\n')
    print('np_array_P_matrix:')
    np_array_P_matrix = np.dot(np_array_D_matrix_inv, np_array_A_matrix)
    print(np_array_P_matrix)
    print('np.sum(np_array_P_matrix, axis=1):')
    print(np.sum(np_array_P_matrix, axis=1))
    print('\n\n')
    return np_array_P_matrix
##################################################
# func: get_np_array_perron_frobenius_eigen_vector
##################################################
def get_np_array_perron_frobenius_matrix(int_num_states, np_array_P_matrix):
    np_array_perron_frobenius_matrix = np.linalg.matrix_power(np_array_P_matrix, 1000)
    np_array_perron_frobenius_vector = np_array_perron_frobenius_matrix[0, :]
    print('np_array_perron_frobenius_matrix:')
    print(np_array_perron_frobenius_matrix)
    print('np.sum(np_array_perron_frobenius_matrix, axis=1):')
    print(np.sum(np_array_perron_frobenius_matrix, axis=1))
    print('np.sum(np_array_perron_frobenius_matrix, axis=0):')
    print(np.sum(np_array_perron_frobenius_matrix, axis=0))
    print('np.sum(np_array_perron_frobenius_matrix, axis=0)/int_num_states:')
    print(np.sum(np_array_perron_frobenius_matrix, axis=0)/int_num_states)
    print('np.dot(np_array_perron_frobenius_vector, np_array_P_matrix):')
    print(np.dot(np_array_perron_frobenius_vector, np_array_P_matrix))
    print('np_array_perron_frobenius_vector:')
    print(np_array_perron_frobenius_vector)
    print('\n\n')
    return np_array_perron_frobenius_vector, np_array_perron_frobenius_matrix
#############################
# func: get_np_array_Z_matrix
#############################
def get_np_array_Z_matrix(int_num_states, np_array_P_matrix, np_array_perron_frobenius_matrix):
    np_array_Z_matrix = np.linalg.inv(np.identity(int_num_states) - np_array_P_matrix + np_array_perron_frobenius_matrix)
    print('np_array_Z_matrix:')
    print(np_array_Z_matrix)
    print('\n\n')
    return np_array_Z_matrix
#############################
# func: get_np_array_H_matrix
#############################
def get_np_array_H_matrix(int_num_states, np_array_Z_matrix, np_array_perron_frobenius_vector):
    np_array_H_matrix = np.zeros([int_num_states, int_num_states])
    for i in range(int_num_states):
        for j in range(int_num_states):
            np_array_H_matrix[i][j] = (np_array_Z_matrix[j][j] - np_array_Z_matrix[i][j])/np_array_perron_frobenius_vector[j]
    print('np_array_H_matrix:')
    print(np_array_H_matrix)
    print('\n\n')
    return np_array_H_matrix
###########
# func: run
###########
def run(np_array_A_matrix):
    int_num_states = len(np_array_A_matrix)
    np_array_P_matrix = get_np_array_transition_probability_matrix(int_num_states, np_array_A_matrix)
    np_array_perron_frobenius_vector, np_array_perron_frobenius_matrix = get_np_array_perron_frobenius_matrix(int_num_states, np_array_P_matrix)
    np_array_Z_matrix = get_np_array_Z_matrix(int_num_states, np_array_P_matrix, np_array_perron_frobenius_matrix)
    np_array_H_matrix = get_np_array_H_matrix(int_num_states, np_array_Z_matrix, np_array_perron_frobenius_vector)
    return np_array_H_matrix
<question>Document the python code above.
</question><doc>
```

### Response
```
The Python code provided is used to generate an analytical model for a Markov chain with a given adjacency matrix.
The model is then used to compute the Perron-Frobenius eigenvector and the corresponding matrix. The resulting matrices are then used to compute the Z-matrix and
the H-matrix. The H-matrix is then returned as the output of the function. The code is designed to handle large matrices and perform computations efficiently.
The matrices are manipulated using numpy's powerful and efficient numerical computation library.
The code also includes comments to explain the functionality of each part of the code.
```

### Team
Avi Kothari, Pratham Gupta, Ritvik Aryan Kalra, Rohan Bhatial, Soham Acharya