Redshift: Executing a dynamic query from a string

21,627

Solution 1

This is possible now that we have added support for Stored Procedures. "Overview of Stored Procedures in Amazon Redshift"

For example, this stored procedure counts the rows in a table and inserts the table name and row count into another table. Both table names are provided as input.

CREATE PROCEDURE get_tbl_count(IN source_tbl VARCHAR, IN count_tbl VARCHAR) AS $$
BEGIN
EXECUTE 'INSERT INTO ' || quote_ident(count_tbl) 
        || ' SELECT ''' || source_tbl ||''', COUNT(*) FROM ' 
        || quote_ident(source_tbl) || ';' 
RETURN;
END;
$$ LANGUAGE plpgsql;

In your example the query to executed could be passed in as a string.

Solution 2

No. There is not a straightforward way to run dynamic built SQL code in Redshift.

You can't define SQL variables, or create stored procedures, as you would have in MS SQL Server.

You can create Python Functions in Redshift, but you would be coding in Python vs. SQL.

You can use the "PREPARE" and "EXECUTE" statements to run "pre-defined" SQL queries, but you would have to create the statements outside of the database, before passing them to the execute command. By creating the statement outside of the database, in a way defeats the purpose.... You can create any statement in your "favorite" programming language.

As I said, this SQL based, in-database dynamic SQL does not exist.

Basically, you need to run this logic in your application or using something such as AWS Data Pipeline.

Solution 3

I am using Postgre on Redshift, and I ran into this issue and found a solution.

I was trying to create a dynamic query, putting in my own date.

date = dt.date(2018, 10, 30)

query = ''' select * from table where date >= ''' + str(my_date) + ''' order by date '''

But, the query entirely ignores the condition when typing it this way.

However, if you use the percent sign (%), you can insert the date correctly.

The correct way to write the above statement is:

query = ''' select * from table where date >= ''' + ''' '%s' ''' % my_date + ''' order by date '''

So, maybe this is helpful, or maybe it is not. I hope it helps at least one person in my situation!

Best wishes.

Share:
21,627
anahnarciso
Author by

anahnarciso

Computer engineer working in Business Intelligence.

Updated on July 09, 2022

Comments

  • anahnarciso
    anahnarciso almost 2 years

    I would like to execute a dynamic SQL query stored in a string field on Amazon Redshift.

    My background is mostly T-SQL relational databases. I used to build SQL statements dynamically, store them into variables and them execute them. I know Redshift can prepare and execute statements, but I wonder if it is possible to execute a query stored in a string field.

    I have a piece of code that dynamically builds the code below with stats on several tables using pg_* system tables. Every column/table name is dynamically calculated. Here's an example of the query output:

    SELECT h_article_id AS key, 'transport_parameters_weight_in_grams' AS col_name, COUNT(DISTINCT transport_parameters_weight_in_grams) AS count_value FROM dv.s_products GROUP BY h_article_id UNION ALL
    SELECT h_article_id AS key, 'transport_parameters_width_in_mm' AS col_name, COUNT(DISTINCT transport_parameters_width_in_mm) AS count_value FROM dv.s_products GROUP BY h_article_id UNION ALL
    SELECT h_article_id AS key, 'label_owner_info_communication_address' AS col_name, COUNT(DISTINCT label_owner_info_communication_address) AS count_value FROM dv.s_products GROUP BY h_article_id
    

    I would like to input this dynamic piece of code within another query, so I can make some statistics, like so:

    SELECT col_name, AVG(count_value*1.00) AS avg_count
    FROM (
      'QUERY ABOVE'
    ) A
    GROUP BY col_name;
    

    This would ouput something like:

    col_name                                avg_count
    transport_parameters_weight_in_grams    1.00
    transport_parameters_width_in_mm        1.00
    label_owner_info_communication_address  0.60
    

    The natural way for me to do this would be to store everything as a string in a variable and execute it. But I'm afraid Redshift does not support this.

    Is there an alternative way to really build dynamic SQL code?

  • anahnarciso
    anahnarciso almost 8 years
    Thanks for your response. You say "you would have to create the statements outside of the database, before passing them to the execute". I know I can pass statements, but they are almost hard coded (aside from the $ variable). Imagine I have a table with a column containing several SQL queries. Would it be possible to pass those queries to the EXECUTE command?
  • BigDataKid
    BigDataKid almost 8 years
    Yes, it is possible, but youhave to "PREPARE" them first.
  • anahnarciso
    anahnarciso almost 8 years
    Ok, but I still don't know how would that be possible. Consider a table A with a varchar column "query" containing select queries. This is the example of a prepare statement, but it seems too hardcoded. PREPARE prep_select_plan (int) AS select * from prep1 where c1 = $1; How would I reference a query that is contained in a varchar field of another table?
  • Angus
    Angus over 4 years
    How about if you needed to add an IF statement? For example if you needed to add a WHERE or an AND to the SELECT statement based on the vale of one of the parameters. Would you declare a string, set that up and then EXECUTE it?
  • Joe Harris
    Joe Harris over 4 years
    Several example stored procedure are provided on GitHub. They have examples of using conditional dynamic SQL creation. github.com/awslabs/amazon-redshift-utils/tree/master/src/…