HridaAIofficial committed on
Commit
4eae6c8
1 Parent(s): 39c1184

Create README.md

  1. README.md +133 -0
README.md ADDED
@@ -0,0 +1,133 @@
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ library_name: transformers
6
+ pipeline_tag: text2text-generation
7
+ tags:
8
+ - code
9
+ - sql
10
+ - text-to-sql
11
+ - test2sql
12
+ ---
13
+
14
+ Hrida-T2SQL-3B-128k-V0.1 is our latest small language model (SLM) for data scientists and industry professionals. It upgrades our previous release with an expanded 128k-token context window for handling intricate data queries. Built on the Phi-3 architecture, it converts natural language queries into SQL commands, improving data analysis efficiency and supporting decision-making.
15
+
16
+ For full details of this model, please read our [blog post](https://www.hridaai.com/blog/t2sql).
17
+
18
+
19
+ ## Prompt Template
20
+
21
+ ```txt
22
+ ### Instruction:
23
+ Provide the system prompt.
24
+
25
+ ### Dialect:
26
+ Specify the SQL dialect (e.g., MySQL, PostgreSQL, SQL Server, etc.).
27
+
28
+ ### Context:
29
+ Provide the database schema including table names, column names, and data types.
30
+
31
+ ### Input:
32
+ User's query.
33
+
34
+ ### Response:
35
+ Expected SQL query output based on the input and context.
36
+
37
+ ```
38
+
39
+ - **Instruction (System Prompt)**: This guides the model on processing input to generate the SQL query response effectively.
40
+ - **Dialect (Optional)**: Specify the SQL variant the model should use to ensure the generated query conforms to the correct syntax.
41
+ - **Context**: Provide the database schema to the model for generating accurate SQL queries.
42
+ - **Input**: Provide the user query for the model to comprehend and transform into an SQL query.
43
+ - **Response**: Expected output from the model.
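
The section layout described above can be assembled programmatically. The following is a minimal sketch, not part of the model's tooling: the helper name `build_prompt` and its parameters are hypothetical, and the example schema and question are made up for illustration. It omits the optional Dialect section when no dialect is given and leaves the Response section empty for the model to complete.

```python
def build_prompt(instruction, context, user_input, dialect=None):
    """Assemble a prompt using the section headers from the template.

    `build_prompt` is a hypothetical helper, not an API shipped with the model.
    """
    sections = [("Instruction", instruction)]
    if dialect:  # Dialect is optional according to the template description
        sections.append(("Dialect", dialect))
    sections.append(("Context", context))
    sections.append(("Input", user_input))
    body = "\n\n".join(f"### {name}:\n{text.strip()}" for name, text in sections)
    # Finish with an empty Response section for the model to fill in.
    return body + "\n\n### Response:\n"


# Illustrative values only
prompt = build_prompt(
    instruction="Answer to the query will be in the form of an SQL query.",
    context="CREATE TABLE Departments (DepartmentID INT PRIMARY KEY, DepartmentName VARCHAR(100));",
    user_input="List all department names.",
    dialect="PostgreSQL",
)
print(prompt)
```

Keeping the sections in a list makes it easy to drop or reorder optional parts without touching the formatting logic.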
44
+
45
+
46
+ ## Chat Prompt Template
47
+
48
+ ```txt
49
+ <s>
50
+ <|system|>
51
+ { Instruction / System Prompt }
52
+ <|user|>
53
+ { Context / User Query } <|end|>
54
+ <|assistant|>
55
+ ```
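
As an illustration of the chat layout above, here is a minimal sketch that renders the same string by hand. The function name `render_chat_prompt` is hypothetical, and the exact whitespace the tokenizer's own chat template produces may differ; in practice, `tokenizer.apply_chat_template` (used in the Transformers example) is the safer route.

```python
def render_chat_prompt(system_prompt, user_content):
    """Render the chat layout shown in the template, line for line.

    Hypothetical helper for illustration; spacing is copied from the
    template text and is an assumption about what the model expects.
    """
    return (
        "<s>\n"
        "<|system|>\n"
        f"{system_prompt}\n"
        "<|user|>\n"
        f"{user_content} <|end|>\n"
        "<|assistant|>\n"
    )


chat_prompt = render_chat_prompt(
    "Answer to the query will be in the form of an SQL query.",
    "### Context: CREATE TABLE T (id INT); ### Input: Count the rows in T.",
)
print(chat_prompt)
```

If the tokenizer already prepends a beginning-of-sequence token on its own, the leading `<s>` in a hand-built string may end up duplicated, so it is worth checking the tokenized output.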
56
+
57
+ ## Run the Model
58
+
59
+ ### Using Transformers
60
+
61
+ ```python
62
+ import torch
63
+ from transformers import AutoModelForCausalLM, AutoTokenizer
64
+
65
+ # Define the model and tokenizer
66
+ model_id = "HridaAI/Hrida-T2SQL-3B-128k-V0.1"
67
+ tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
68
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, trust_remote_code=True)
69
+
70
+ # Define the context and prompt
71
+ prompt = """
72
+ Answer to the query will be in the form of an SQL query.
73
+ ### Context: CREATE TABLE Employees (
74
+ EmployeeID INT PRIMARY KEY,
75
+ FirstName VARCHAR(50),
76
+ LastName VARCHAR(50),
77
+ Age INT,
78
+ DepartmentID INT,
79
+ Salary DECIMAL(10, 2),
80
+ DateHired DATE,
81
+ Active BOOLEAN,
82
+ FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID)
83
+ );
84
+
85
+ CREATE TABLE Departments (
86
+ DepartmentID INT PRIMARY KEY,
87
+ DepartmentName VARCHAR(100),
88
+ Location VARCHAR(100)
89
+ );
90
+ ### Input: Write a SQL query to select all the employees who are active.
91
+ ### Response:
92
+ """
93
+ # Prepare the input
94
+ messages = [{"role": "user", "content": prompt}]
95
+ inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
96
+
97
+ # Generate the output
98
+ outputs = model.generate(inputs, max_new_tokens=300)  # limit counts generated tokens only, not the prompt
99
+ print(tokenizer.decode(outputs[0]))
100
+
101
+
102
+ ```
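
Because the example above decodes the full sequence, the printed text likely contains the prompt and chat markers along with the generated SQL. The sketch below shows one way to keep only the model's answer, assuming the decoded text uses the `<|assistant|>` and `<|end|>` markers from the Chat Prompt Template. The helper name `extract_response` is hypothetical, and the sample string is hand-written for illustration, not actual model output.

```python
def extract_response(decoded, assistant_tag="<|assistant|>", end_tag="<|end|>"):
    """Return the text after the last assistant marker, up to the end marker.

    Hypothetical helper; marker names are taken from the chat template
    and are assumed to appear verbatim in the decoded text.
    """
    # rpartition returns the whole string as the tail when the tag is absent.
    tail = decoded.rpartition(assistant_tag)[2]
    # Drop the end marker and anything after it.
    return tail.split(end_tag, 1)[0].strip()


# Hand-written sample, not real model output
sample = (
    "<|user|>\nList active employees. <|end|>\n"
    "<|assistant|>\nSELECT * FROM Employees WHERE Active = TRUE;<|end|>"
)
print(extract_response(sample))
```

An alternative is to slice off the prompt tokens before decoding and pass `skip_special_tokens=True` to `tokenizer.decode`, which avoids string matching altogether.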
103
+
104
+ ### Using MLX
105
+
106
+ ```python
107
+ from mlx_lm import generate, load
108
+
109
+ model, tokenizer = load("HridaAI/Hrida-T2SQL-3B-128k-V0.1")
110
+
111
+ prompt = """
112
+ Answer to the query will be in the form of an SQL query.
113
+ ### Context: CREATE TABLE Employees (
114
+ EmployeeID INT PRIMARY KEY,
115
+ FirstName VARCHAR(50),
116
+ LastName VARCHAR(50),
117
+ Age INT,
118
+ DepartmentID INT,
119
+ Salary DECIMAL(10, 2),
120
+ DateHired DATE,
121
+ Active BOOLEAN,
122
+ FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID)
123
+ );
124
+
125
+ CREATE TABLE Departments (
126
+ DepartmentID INT PRIMARY KEY,
127
+ DepartmentName VARCHAR(100),
128
+ Location VARCHAR(100)
129
+ ); ### Input: Write a SQL query to select all the employees who are active. ### Response:"""
130
+
131
+ response = generate(model=model, tokenizer=tokenizer, prompt=prompt, verbose=True)
132
+
133
+ ```