Commit 8db7d1e (verified), committed by tdoehmen
1 Parent(s): 115a8e0

Upload README.md

Files changed (1): README.md +38 -26
README.md CHANGED
@@ -40,13 +40,24 @@ In contrast to existing text-to-SQL models, the SQL generation is not contrained
 
 ## How to Use
 
+Setup llama.cpp:
+```shell
+CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python
+huggingface-cli download motherduckdb/DuckDB-NSQL-7B-v0.1-GGUF DuckDB-NSQL-7B-v0.1-q8_0.gguf --local-dir . --local-dir-use-symlinks False
+pip install wurlitzer
+```
+
 Example 1:
 
+
 ```python
-import torch
-from transformers import AutoTokenizer, AutoModelForCausalLM
-tokenizer = AutoTokenizer.from_pretrained("motherduckdb/DuckDB-NSQL-7B-v0.1")
-model = AutoModelForCausalLM.from_pretrained("motherduckdb/DuckDB-NSQL-7B-v0.1", torch_dtype=torch.bfloat16)
+## Setup - Llama.cpp
+from llama_cpp import Llama
+with pipes() as (out, err):
+    llama = Llama(
+        model_path="DuckDB-NSQL-7B-v0.1-q8_0.gguf",
+        n_ctx=2048,
+    )
 
 text = """### Instruction:
 Your task is to generate valid duckdb SQL to answer the following question.
@@ -59,20 +70,21 @@ create a new table called tmp from test.csv
 ### Response (use duckdb shorthand if possible):
 """
 
-input_ids = tokenizer(text, return_tensors="pt").input_ids
-
-generated_ids = model.generate(input_ids, max_length=500)
-print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
+with pipes() as (out, err):
+    pred = llama(text, temperature=0.1, max_tokens=500)
+print(pred["choices"][0]["text"])
 ```
 
 Example 2:
 
 ```python
-import torch
-from transformers import AutoTokenizer, AutoModelForCausalLM
-tokenizer = AutoTokenizer.from_pretrained("motherduckdb/DuckDB-NSQL-7B-v0.1")
-model = AutoModelForCausalLM.from_pretrained("motherduckdb/DuckDB-NSQL-7B-v0.1", torch_dtype=torch.bfloat16)
-
+from llama_cpp import Llama
+with pipes() as (out, err):
+    llama = Llama(
+        model_path="DuckDB-NSQL-7B-v0.1-q8_0.gguf",
+        n_ctx=2048,
+    )
+
 text = """### Instruction:
 Your task is to generate valid duckdb SQL to answer the following question, given a duckdb database schema.
 
@@ -97,20 +109,21 @@ get all columns ending with _amount from taxi table
 
 ### Response (use duckdb shorthand if possible):"""
 
-input_ids = tokenizer(text, return_tensors="pt").input_ids
-
-generated_ids = model.generate(input_ids, max_length=500)
-print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
+with pipes() as (out, err):
+    pred = llama(text, temperature=0.1, max_tokens=500)
+print(pred["choices"][0]["text"])
 ```
 
 Example 3:
 
 ```python
-import torch
-from transformers import AutoTokenizer, AutoModelForCausalLM
-tokenizer = AutoTokenizer.from_pretrained("motherduckdb/DuckDB-NSQL-7B-v0.1")
-model = AutoModelForCausalLM.from_pretrained("motherduckdb/DuckDB-NSQL-7B-v0.1", torch_dtype=torch.bfloat16)
-
+from llama_cpp import Llama
+with pipes() as (out, err):
+    llama = Llama(
+        model_path="DuckDB-NSQL-7B-v0.1-q8_0.gguf",
+        n_ctx=2048,
+    )
+
 text = """### Instruction:
 Your task is to generate valid duckdb SQL to answer the following question, given a duckdb database schema.
 
@@ -135,10 +148,9 @@ get longest trip in december 2022
 ### Response (use duckdb shorthand if possible):
 """
 
-input_ids = tokenizer(text, return_tensors="pt").input_ids
-
-generated_ids = model.generate(input_ids, max_length=500)
-print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
+with pipes() as (out, err):
+    pred = llama(text, temperature=0.1, max_tokens=500)
+print(pred["choices"][0]["text"])
 ```
 
 
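Review note: the snippets added in this commit call `pipes()` without importing it. It presumably refers to the `pipes` context manager from `wurlitzer`, which the setup step installs, so each example likely needs `from wurlitzer import pipes` to run as written. The sketch below shows one way the repeated pattern could be wrapped, under that assumption. `generate_sql` and `fake_llm` are hypothetical names used only for illustration; they are not part of the README or of llama-cpp-python.

```python
import io
from contextlib import contextmanager

try:
    # Assumption: the README's `pipes` is wurlitzer's context manager,
    # which captures C-level stdout/stderr (llama.cpp's load and timing logs).
    from wurlitzer import pipes
except ImportError:
    # Fallback so this sketch still runs without wurlitzer; it captures nothing.
    @contextmanager
    def pipes():
        yield io.StringIO(), io.StringIO()


def generate_sql(llm, prompt, temperature=0.1, max_tokens=500):
    """Run one completion quietly and return only the generated text.

    `llm` is any callable that returns a llama-cpp-python style completion
    dict: {"choices": [{"text": ...}]}.
    """
    with pipes() as (out, err):
        pred = llm(prompt, temperature=temperature, max_tokens=max_tokens)
    # Return the text after leaving the capture block, so the caller's
    # print() is not swallowed along with the library's log output.
    return pred["choices"][0]["text"]


def fake_llm(prompt, temperature, max_tokens):
    # Stand-in for a loaded model so the sketch runs without the GGUF file.
    # The returned SQL is a placeholder, not real model output.
    return {"choices": [{"text": "SELECT 1;"}]}


result = generate_sql(fake_llm, "### Instruction:\n...")
print(result)
```

With the real model, `fake_llm` would be replaced by the `llama` object built in the examples (`Llama(model_path="DuckDB-NSQL-7B-v0.1-q8_0.gguf", n_ctx=2048)`), and `text` passed as the prompt. Loading the model once and reusing it across the three examples would also avoid repeating the setup block.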