HridaAIofficial committed
Commit ebf5d44
1 Parent(s): 1355acd

Create README.md

Files changed (1)
  1. README.md +134 -0
README.md ADDED
@@ -0,0 +1,134 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ library_name: transformers
+ pipeline_tag: text2text-generation
+ tags:
+ - code
+ - sql
+ - text-to-sql
+ - text2sql
+ - t2sql
+ ---
+
+ Hrida-T2SQL-3B-128k-V0.1 is our latest small language model (SLM), tailored for data scientists and industry professionals. It is a significant upgrade from our previous release: the context window is expanded to 128k tokens, so the model can handle long, intricate data queries. Built on the Phi 3 architecture, it converts natural-language queries into SQL commands to speed up data analysis and support decision-making.
+
+ For full details of this model, please read our [blog post](https://www.hridaai.com/blog/t2sql-128k).
+
+ ## Prompt Template
+
+ ```txt
+ ### Instruction:
+ Provide the system prompt.
+
+ ### Dialect:
+ Specify the SQL dialect (e.g., MySQL, PostgreSQL, SQL Server, etc.).
+
+ ### Context:
+ Provide the database schema including table names, column names, and data types.
+
+ ### Input:
+ User's query.
+
+ ### Response:
+ Expected SQL query output based on the input and context.
+ ```
+
+ - **Instruction (System Prompt)**: Guides the model on how to process the input and generate the SQL query.
+ - **Dialect (Optional)**: The SQL variant the model should target, so that the generated query uses the correct syntax.
+ - **Context**: The database schema the model needs to generate accurate SQL queries.
+ - **Input**: The user's query, which the model translates into an SQL query.
+ - **Response**: The expected output from the model.
+
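The template fields described above can be assembled programmatically. The following is a minimal sketch, not part of the committed README: `build_prompt` is a hypothetical helper name, and the exact blank-line spacing between sections is an assumption taken from the template block as written.

```python
from typing import Optional


def build_prompt(instruction: str, context: str, user_input: str,
                 dialect: Optional[str] = None) -> str:
    """Assemble the template sections into a single prompt string.

    Hypothetical helper: the section names come from the README template,
    the spacing between sections is an assumption.
    """
    sections = [("Instruction", instruction)]
    if dialect:  # Dialect is optional, so skip the section when it is unset
        sections.append(("Dialect", dialect))
    sections.append(("Context", context))
    sections.append(("Input", user_input))
    body = "\n\n".join(f"### {name}:\n{text.strip()}" for name, text in sections)
    # Leave the Response section empty for the model to complete
    return f"{body}\n\n### Response:\n"


prompt = build_prompt(
    instruction="Answer to the query will be in the form of an SQL query.",
    context="CREATE TABLE Employees (EmployeeID INT PRIMARY KEY, Active BOOLEAN);",
    user_input="Write a SQL query to select all the employees who are active.",
    dialect="PostgreSQL",
)
print(prompt)
```

Passing `dialect=None` drops the `### Dialect:` section entirely, which matches its optional status in the field list.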
+ ## Chat Prompt Template
+
+ ```txt
+ <s>
+ <|system|>
+ { Instruction / System Prompt }
+ <|user|>
+ { Context / User Query } <|end|>
+ <|assistant|>
+ ```
+
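For illustration, the chat template can be filled in by hand. This is a sketch under assumptions: `format_chat_prompt` is a hypothetical helper, and it reproduces the template literally as printed above, one marker per line. The tokenizer's own chat template (applied through `tokenizer.apply_chat_template`, as in the Transformers example in this README) may differ in whitespace and may add `<s>` itself, so prefer it when a tokenizer is available.

```python
def format_chat_prompt(system_prompt: str, user_content: str) -> str:
    """Fill the chat template literally, one marker per line.

    Hypothetical helper: the line layout mirrors the README block and is
    an assumption about what the model expects.
    """
    return (
        "<s>\n"
        "<|system|>\n"
        f"{system_prompt.strip()}\n"
        "<|user|>\n"
        f"{user_content.strip()} <|end|>\n"
        "<|assistant|>\n"
    )


chat_prompt = format_chat_prompt(
    system_prompt="Answer to the query will be in the form of an SQL query.",
    user_content="### Context: CREATE TABLE t (id INT);\n### Input: Count the rows in t.",
)
print(chat_prompt)
```

The string ends at the `<|assistant|>` marker so that the model's completion is the assistant turn.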
+ ## Run the Model
+
+ ### Using Transformers
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load the model and tokenizer
+ model_id = "HridaAI/Hrida-T2SQL-3B-128k-V0.1"
+ tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, trust_remote_code=True)
+
+ # Define the prompt: instruction, schema (context), and user query
+ prompt = """
+ Answer to the query will be in the form of an SQL query.
+ ### Context: CREATE TABLE Employees (
+     EmployeeID INT PRIMARY KEY,
+     FirstName VARCHAR(50),
+     LastName VARCHAR(50),
+     Age INT,
+     DepartmentID INT,
+     Salary DECIMAL(10, 2),
+     DateHired DATE,
+     Active BOOLEAN,
+     FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID)
+ );
+
+ CREATE TABLE Departments (
+     DepartmentID INT PRIMARY KEY,
+     DepartmentName VARCHAR(100),
+     Location VARCHAR(100)
+ );
+ ### Input: Write a SQL query to select all the employees who are active.
+ ### Response:
+ """
+ # Wrap the prompt in the chat template and tokenize it
+ messages = [{"role": "user", "content": prompt}]
+ inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
+
+ # Generate the output; max_new_tokens limits only the generated tokens,
+ # whereas max_length would also count the prompt tokens
+ outputs = model.generate(inputs, max_new_tokens=300)
+ print(tokenizer.decode(outputs[0]))
+ ```
+
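In the Transformers example, `tokenizer.decode(outputs[0])` returns the prompt together with the completion, chat markers included. A minimal sketch of isolating the generated SQL follows. `extract_response` is a hypothetical helper; the `<|assistant|>` and `<|end|>` markers come from the chat template in this README, while `<|endoftext|>` is an assumed end-of-text marker. The `decoded` string is an invented sample for illustration, not real model output.

```python
def extract_response(decoded: str) -> str:
    """Return the text after the last assistant marker, without end markers."""
    # Keep only what follows the final <|assistant|> marker; if the marker
    # is absent, rpartition leaves the whole string in the tail
    _, _, tail = decoded.rpartition("<|assistant|>")
    # Cut at the first end-of-turn or (assumed) end-of-text marker
    for marker in ("<|end|>", "<|endoftext|>"):
        tail = tail.split(marker, 1)[0]
    return tail.strip()


# Invented sample of a decoded sequence, for illustration only
decoded = (
    "<s><|user|>\n### Input: Select all active employees.\n### Response:<|end|>\n"
    "<|assistant|>\nSELECT * FROM Employees WHERE Active = TRUE;<|end|>"
)
print(extract_response(decoded))
```

An alternative that avoids string parsing is to decode only the tokens after the prompt, for example by slicing `outputs[0]` from the prompt length onward and passing `skip_special_tokens=True` to `tokenizer.decode`.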
+ ### Using MLX
+
+ ```python
+ from mlx_lm import generate, load
+
+ # Load the model and tokenizer
+ model, tokenizer = load("HridaAI/Hrida-T2SQL-3B-128k-V0.1")
+
+ # Define the prompt: instruction, schema (context), and user query
+ prompt = """
+ Answer to the query will be in the form of an SQL query.
+ ### Context: CREATE TABLE Employees (
+     EmployeeID INT PRIMARY KEY,
+     FirstName VARCHAR(50),
+     LastName VARCHAR(50),
+     Age INT,
+     DepartmentID INT,
+     Salary DECIMAL(10, 2),
+     DateHired DATE,
+     Active BOOLEAN,
+     FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID)
+ );
+
+ CREATE TABLE Departments (
+     DepartmentID INT PRIMARY KEY,
+     DepartmentName VARCHAR(100),
+     Location VARCHAR(100)
+ );
+ ### Input: Write a SQL query to select all the employees who are active.
+ ### Response:"""
+
+ # Generate the SQL query; verbose=True prints the output as it is produced
+ response = generate(model=model, tokenizer=tokenizer, prompt=prompt, verbose=True)
+ ```