PostgreSQL raw query vs "Function returns TABLE" - insane difference in performance. Why?

6,820

OK, that was easy. Turns out the database has to prepare the query plan before it knows about the parameters, which leads to bad results. The solution was to use plpgsql and to return QUERY EXECUTE. Now the performance is the same, as expected.

CREATE OR REPLACE FUNCTION GetMyReport(IN fromdate timestamp without time zone, IN todate timestamp without time zone)
  RETURNS TABLE(Name1 character varying, Name2 character varying) AS
$BODY$

BEGIN
RETURN QUERY EXECUTE'
select Column1 as Name1, Column2 as Name2
from sometable tbl
inner join ...
where ...
and ...
and $1 <= somedate
and $2 >= somedate
group by ...
order by ...;' USING $1, $2
END

$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 10
  ROWS 1000;
Share:
6,820

Related videos on Youtube

Evgeny
Author by

Evgeny

Updated on September 18, 2022

Comments

  • Evgeny
    Evgeny over 1 year

    I work with the PostgreSQL and it is used for reporting. The way it is configured currently is as follows:

    There is a complex query which returns report data, like this:

    select Column1 as Name1, Column2 as Name2
    from sometable tbl
    inner join ...
    where ...
    and ...
    and $1 <= somedate
    and $2 >= somedate
    group by ...
    order by ...;
    

    There is a function that utilises this query and is defined like that

    CREATE OR REPLACE FUNCTION GetMyReport(IN fromdate timestamp without time zone, IN todate timestamp without time zone)
      RETURNS TABLE(Name1 character varying, Name2 character varying) AS
    $BODY$
    
    --query start
    select Column1 as Name1, Column2 as Name2
    from sometable tbl
    inner join ...
    where ...
    and ...
    and $1 <= somedate
    and $2 >= somedate
    group by ...
    order by ...;
    --query end
    
    $BODY$
      LANGUAGE sql VOLATILE
      COST 10
      ROWS 1000;
    

    Finally, when the reporting application calls the function, it sends the following SQL:

    select null::text as Name1, Name2 from GetMyReport ('2012-05-28T12:19:39.0000000+11:00'::timestamp, '2012-05-28T12:19:44.0000000+11:00'::timestamp);
    

    And my problem is:

    • When I run just the "query" against the database, it runs quite fast. In fact, in a matter of seconds if the data returned is reasonably small
    • When I run the sql that is passed from the reporting application, it takes insane time to run - each time. In fact, over 10 minutes for the same data that is returned by the query in seconds.
    • In fact, I can run the raw query, takes milliseconds, run the functions - takes ~10 minutes, run query again - milliseconds, run function - again 10 min, all with exactly the same parameters.

    What could be the reason for that?