Why is hash join slow in postgresql 9.1?

5,520

Because it thinks that this will be a faster plan.

You have very complicated join of c and u tables. It's so complicated that Postgres can't predict how many rows this join would return — it thinks it will return over 1 600 000 rows when it actually returns only 4.

Try to simplify your query — don't use coalesce or case in joins, maybe separate c.user_id=? and c.expert_id=? into 2 queries and union all them. If there are any red cells in rows x column on explain.depesz.com, then bad and good performance of the query will be very random.

Share:
5,520

Related videos on Youtube

Dmitro
Author by

Dmitro

Updated on September 18, 2022

Comments

  • Dmitro
    Dmitro over 1 year

    I transfered my DB from Postgresql 8.4 to new server with Postgresql 9.1. Size of DB is 9.9GB Data directory is situated on ssd 60GB disk. And server has 16GB RAM and 16 proccessor cores.

    But today load average is grew up to 70.

    I figured out that queries use hash join in plan and one of my query executed in 16m but when i set enable_hashjoin = off it executed in 5m when i set enable_mergejoin = off it become to use nested loop and executed in 12ms.

    Why does postgresql not use optimal query plan?

    EXPLAIN ANALYZE results i pasted to http://explain.depesz.com/s/764 (with enable_hashjoin = on) http://explain.depesz.com/s/weY (with nested loop)

  • Dmitro
    Dmitro about 11 years
    I agree with you that query is bad but Postgresql 8.4 chose optimal query for this bad query and Postgresql 9.1 doesn't do this. And I can't understand why?
  • Tometzky
    Tometzky about 11 years
    As I said — when predicted result counts are so much off then whether server choses bad or good plan is fairy random (but mostly persistent). Not much you can do on server side — you'd need to rewrite this query.
  • voretaq7
    voretaq7 over 8 years
    @Dmitro You could try running a manual VACUUM ANALYZE on the table to update the query planner's statistics, but like Tometzky said if the underlying query is confusing the planner this may not help (and if it does the "help" may not be persistent: the next time the planner statistics refresh it could change its mind again). Rewriting the query to make it easier for the planner to understand is The Right Fix (same as for any database system).