sdiehl committed on
Commit
11e653e
1 Parent(s): 839e37d

Update README.md

Files changed (1)
  1. README.md +120 -0
README.md CHANGED
@@ -1,3 +1,123 @@
  ---
+ language:
+ - en
  license: apache-2.0
+ tags:
+ - zsql
+ - chatml
+ - synthetic data
+ - text-to-sql
+ - dpo
+ datasets:
+ - zerolink/zsql-postgres-dpo
+ widget:
+ - text: >-
+     <|im_start|>system
+     Translate English to Postgres SQL.<|im_end|>
+     <|im_start|>user
+     Using the schema:
+     CREATE TABLE Product (
+     product_id INTEGER PRIMARY KEY,
+     name TEXT NOT NULL,
+     price DECIMAL NOT NULL,
+     description TEXT
+     );
+     Generate SQL for the following question:
+     What are all products worth more than $5.10?
+     <|im_end|>
+   example_title: sql
  ---
+
+ zsql-postgres is a text-to-SQL model, instruction-tuned to synthesize Postgres SQL
+ queries from English-language text. The model is trained
+ on the [ZeroLink DPO](https://huggingface.co/datasets/zerolink/zsql-postgres-dpo)
+ dataset.
+
+ This model is only capable of generating SQL queries and is designed to be
+ further fine-tuned on specific database schemas.
+
+ ## Usage
+
+ You can run this model using the following code:
+
+ ```python
+ import transformers
+ from transformers import AutoTokenizer
+
+ model = "zerolink/zsql-en-postgres"
+
+ tokenizer = AutoTokenizer.from_pretrained(model)
+
+ prompt = f"""
+ Using the schema:
+ CREATE TABLE Product (
+ product_id INTEGER PRIMARY KEY,
+ name TEXT NOT NULL,
+ price DECIMAL NOT NULL,
+ description TEXT
+ );
+
+ CREATE TABLE Customer (
+ customer_id INTEGER PRIMARY KEY,
+ name TEXT NOT NULL,
+ email TEXT,
+ phone TEXT
+ );
+ Generate SQL for the following question:
+ What are the prices and descriptions for all products that are greater than $5?
+ """
+
+ system = "Translate English to Postgres SQL."
+ message = [
+     {"role": "system", "content": system},
+     {"role": "user", "content": prompt},
+ ]
+
+ prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)
+
+ # Create pipeline
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model,
+     tokenizer=tokenizer
+ )
+
+ # Generate text
+ sequences = pipeline(
+     prompt,
+     do_sample=True,
+     temperature=0.1,
+     top_p=0.9,
+     num_return_sequences=1,
+     max_length=1024,
+ )
+ print(sequences[0]['generated_text'])
+ ```
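Review note on the usage snippet: the widget entry in the front matter spells out the ChatML markers (`<|im_start|>`, `<|im_end|>`), so the string that `apply_chat_template` returns can be approximated by hand. The sketch below assumes the tokenizer's chat template follows that plain ChatML layout, which this diff does not confirm. Both helper names, `build_chatml_prompt` and `extract_completion`, are hypothetical and not part of the README.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML prompt that ends with an open assistant turn.

    Assumption: the model's chat template matches the layout shown in the
    widget example (system turn, user turn, then the generation prompt).
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def extract_completion(generated_text: str, prompt: str) -> str:
    """Drop the echoed prompt and anything after the first end-of-turn marker."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.split("<|im_end|>")[0].strip()


if __name__ == "__main__":
    prompt = build_chatml_prompt(
        "Translate English to Postgres SQL.",
        "Generate SQL for the following question:\nHow many products are there?",
    )
    # Stand-in for a pipeline result; a real call would return model output here.
    fake_output = prompt + "SELECT COUNT(*) FROM Product;<|im_end|>"
    print(extract_completion(fake_output, prompt))
```

The second helper is useful because a text-generation pipeline typically echoes the prompt in `generated_text`, so the SQL has to be separated from it before execution.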
+
+ ## Training hyperparameters
+
+ **LoRA**:
+
+ * r=16
+ * lora_alpha=16
+ * lora_dropout=0.05
+ * bias="none"
+ * task_type="CAUSAL_LM"
+ * target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']
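Review note on the LoRA settings: with `r=16` and `lora_alpha=16`, the standard LoRA scaling factor `lora_alpha / r` is 1.0, so the low-rank update is added to the frozen weights unscaled. The sketch below works through that arithmetic and the adapter size for one projection. The hidden size of 4096 is a hypothetical placeholder, since the diff does not name the base model, and both function names are illustrative.

```python
def lora_scaling(lora_alpha: int, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return lora_alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for one adapted linear layer.

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * (d_in + d_out)


if __name__ == "__main__":
    HIDDEN = 4096  # hypothetical width; the base model is not stated in the README

    print("scaling:", lora_scaling(lora_alpha=16, r=16))
    adapter = lora_param_count(HIDDEN, HIDDEN, r=16)
    dense = HIDDEN * HIDDEN
    print("adapter params per square projection:", adapter)
    print("fraction of the dense weight:", adapter / dense)
```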
+
+ **Training arguments**:
+
+ * per_device_train_batch_size=4
+ * gradient_accumulation_steps=4
+ * gradient_checkpointing=True
+ * learning_rate=5e-5
+ * lr_scheduler_type="linear"
+ * max_steps=200
+ * optim="paged_adamw_32bit"
+ * warmup_steps=100
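Review note on the training arguments: a batch size of 4 with 4 accumulation steps gives 16 sequences per optimizer step on each device, and `warmup_steps=100` out of `max_steps=200` means half the run is spent ramping the learning rate up. The sketch below reproduces the usual linear-warmup, linear-decay shape as a standalone function. It is an approximation of what `lr_scheduler_type="linear"` does, not the trainer's own code, and the device count of 1 is an assumption.

```python
def linear_warmup_lr(
    step: int,
    base_lr: float = 5e-5,
    warmup_steps: int = 100,
    max_steps: int = 200,
) -> float:
    """Learning rate at a given optimizer step: linear ramp, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)


def effective_batch_size(per_device: int = 4, grad_accum: int = 4, num_devices: int = 1) -> int:
    """Sequences contributing to each optimizer step (num_devices is an assumption)."""
    return per_device * grad_accum * num_devices


if __name__ == "__main__":
    for step in (0, 50, 100, 150, 200):
        print(step, linear_warmup_lr(step))
    print("effective batch size:", effective_batch_size())
    print("sequences seen in 200 steps:", effective_batch_size() * 200)
```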
+
+ **DPOTrainer**:
+
+ * beta=0.1
+ * max_prompt_length=4096
+ * max_length=3516
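Review note on `beta=0.1`: in the standard DPO objective, beta scales how strongly the policy is pushed to prefer the chosen completion over the rejected one relative to the reference model. The sketch below computes that per-pair loss from summed log-probabilities using only the standard library. It illustrates the published formula, not the trainer's internals, and the log-probability values in the example are made up.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    margin = (policy - reference) log-ratio of the chosen completion
             minus the same log-ratio for the rejected completion.
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    z = beta * margin
    # Numerically stable form of log(1 + exp(-z)).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


if __name__ == "__main__":
    # Policy identical to the reference: the margin is 0 and the loss is log(2).
    print(dpo_loss(-10.0, -15.0, -10.0, -15.0))
    # Illustrative values where the policy already favours the chosen completion.
    print(dpo_loss(-10.0, -15.0, -12.0, -13.0))
```

A small beta such as 0.1 keeps the loss gradient gentle for a given margin, which lets the policy drift further from the reference than a larger beta would.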