Duplicated SQL Statements when running endpoint

#8
by pcapazzi - opened

I created an endpoint to test with and each response includes an attempt to duplicate what was written before.

Example 1:

               SELECT date_trunc('month', transaction_dt) AS MONTH, SUM(transaction_amt) AS total_sales FROM sales WHERE date_part('year', transaction_dt) = date_part('year', CURRENT_DATE) - 1 GROUP BY MONTH ORDER BY MONTH DESC LIMIT 6;
               
               SELECT date_trunc('month', transaction_dt) AS MONTH, SUM(

Example 2:

               SELECT date_trunc('month', transaction_dt) AS MONTH, SUM(transaction_amt) AS total_sales FROM sales GROUP BY MONTH ORDER BY MONTH DESC LIMIT 6;
               
               SELECT date_trunc('month', transaction_dt) AS MONTH, SUM(transaction_amt) AS total_sales FROM sales GROUP BY MONTH ORDER BY MONTH DESC LIMIT 6;

The endpoint seems to cut off abruptly in it's response (see 1st example). Most likely a token limitation. I tried doubling the tokens but it failed to make the changes. Still... I don't think that's a 1000 token response. The prompt wasn't very long either:

You are a Redshift SQL Developer for a clothing retailer. Given the following schema provide the SQL that will answer the question:
create table sales (
transaction_id bigint,
transaction_dt date,
transaction_amt float
store_id int,
store_name varchar(50))

               Question: How did your sales last month compare against sales from the previous month for each store?  Indicate if the difference was significant. Significance determine by  +/- 10%.

The limit of the returned response was indeed tokens. This is my first try with an inference endpoint. I increased the parameter for Max New Tokens and received complete SQL. Then switched over to API and was able to take the parameters and update the Python code.

I switched to 7b in case the multiple statements was something only experienced in 34b alpha. But I still continue to get multiple SQL statements:
inputs:

Given the following schmea provide the SQL that will answer the question:

               create table sales (
                   transaction_id bigint,
                   transaction_dt date,
                   transaction_amt float,
                   store_id int,
                   store_name varchar(50))
               
               Question: How did our sales last month compare against sales from the previous month for each store?

generated_text:

               SELECT date_trunc('month', transaction_dt) AS MONTH, COUNT(*) AS sales_count FROM sales WHERE date_part('year', transaction_dt) = date_part('year', CURRENT_DATE) AND date_part('month', transaction_dt) IN (CURRENT_DATE - interval '1 month', CURRENT_DATE - interval '2 month') GROUP BY MONTH ORDER BY MONTH;
               
               """
    ),

    """ SELECT date_trunc('month', transaction_dt) AS MONTH, COUNT(*) AS sales_count FROM sales WHERE date_part('year', transaction_dt) = date_part('year', CURRENT_DATE) AND date_part('month', transaction_dt) IN (CURRENT_DATE - interval '1 month', CURRENT_DATE - interval '2 month') GROUP BY MONTH ORDER BY MONTH; """


    """ SELECT date_trunc('month', transaction_dt) AS MONTH, COUNT(*) AS sales_count FROM sales WHERE date_part('year', transaction_dt) = date_part('year', CURRENT_DATE) AND date_part('month', transaction_dt) IN (CURRENT_DATE - interval '1 month', CURRENT_DATE - interval '2 month') GROUP BY MONTH ORDER BY MONTH; """

    """ SELECT date_trunc('month', transaction_dt) AS MONTH, COUNT(*) AS sales_count FROM sales WHERE date_part('year', transaction_dt) = date_part('year', CURRENT_DATE) AND date_part('month', transaction_dt) IN (CURRENT_DATE - interval '1 month', CURRENT_DATE - interval '2 month') GROUP BY MONTH ORDER BY MONTH; """

    """ SELECT date_trunc('month', transaction_dt) AS MONTH, COUNT(*) AS sales_count FROM sales WHERE date_part('year', transaction_dt) = date_part('year', CURRENT_DATE) AND date_part('month', transaction_dt) IN (CURRENT_DATE - interval '1 month', CURRENT_DATE - interval '2 month') GROUP BY MONTH ORDER BY MONTH; """

    """ SELECT date_trunc('month', transaction_dt) AS MONTH, COUNT(*) AS sales_count FROM sales WHERE date_part('year', transaction_dt) = date_part('year', CURRENT_DATE) AND date_part('month', transaction_dt) IN (CURRENT_DATE - interval '1 month', CURRENT_DATE - interval '2 month') GROUP BY MONTH ORDER BY MONTH; """

    """ SELECT date_trunc('month', transaction_dt) AS MONTH, COUNT(*) AS sales_count FROM sales WHERE date_part('year', transaction_dt) = date_part('year', CURRENT_DATE) AND date_part('month', transaction_dt) IN (CURRENT_DATE - interval '1 month', CURRENT_DATE - interval '2 month') GROUP BY MONTH ORDER BY MONTH; """

    """ SELECT date_trunc('month', transaction_dt) AS MONTH, COUNT(*) AS sales_count FROM sales WHERE date_part('year', transaction_dt) = date_part('year', CURRENT_DATE) AND date_part('month', transaction_dt) IN (CURRENT_DATE - interval '1 month', CURRENT_DATE - interval '2 month') GROUP BY MONTH ORDER BY MONTH; """

    """ SELECT date_trunc('month', transaction_dt) AS MONTH, COUNT(*) AS sales_count FROM sales WHERE date_part('year', transaction_dt) = date_part('year', CURRENT_DATE) AND date_part('month', transaction_dt) IN (CURRENT_DATE - interval '1 month', CURRENT_DATE - interval '2 month') GROUP BY MONTH ORDER BY MONTH; """

    """ SELECT date_trunc('month', transaction_dt) AS MONTH, COUNT(*) AS sales_count FROM sales WHERE date_part('year', transaction_dt) = date_part('year', CURRENT_DATE) AND date_part('month', transaction_dt) IN (CURRENT_DATE -
Defog.ai org

Hi @pcapazzi , you can try formatting your prompt like our test prompt here: https://github.com/defog-ai/sql-eval/blob/main/prompts/prompt.md
Hopefully that'll give you better results!

Sign up or log in to comment