How to understand an EXPLAIN ANALYZE


While not as necessary for a simple plan like this one, http://explain.depesz.com is a really useful tool for reading plans. See http://explain.depesz.com/s/t4fi. Note the "stats" tab and the "options" pulldown.

Things to note about this plan:

  • The estimated row count (183) is reasonably comparable to the actual row count (25). It's not hundreds of times higher, nor is it 1. With row-count estimates you mostly care about orders of magnitude, or "1 vs not 1" problems. (You don't even need "close enough for government work" accuracy - "close enough for military contracting accounting" will do.) The selectivity estimate and statistics seem reasonable.

  • It's using the two-column partial index provided (Index Scan using index_cars_onsale_on_brand_and_model_name), which means the planner has proved the query matches the partial index's WHERE clause (sales_state = 'onsale'). The remaining, non-indexed condition appears as Filter: has_auto_gear, and the index search condition is also shown.

  • The query performance looks reasonable given that the table's row count will mean the index is fairly big, especially as it's over two columns. Matching rows will be scattered, so it's likely each row will require a separate page read too.

I see nothing wrong here. This query will likely benefit greatly from PostgreSQL 9.2's index-only scans, though.
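To illustrate why 9.2's index-only scans help: if an index contains every column the query reads, the scattered heap fetches can be skipped entirely (subject to the visibility map being current). A sketch under assumptions - the column names `id` and `price` are made up for illustration, since a `SELECT *` over 170 columns can't realistically be covered by an index:

```sql
-- Hypothetical covering partial index (id/price are assumed names, not
-- from the question). On 9.2+, a query selecting only these columns
-- could be answered by an index-only scan, avoiding the scattered heap
-- page reads noted above.
CREATE INDEX index_cars_onsale_covering
  ON cars (brand, model_name, id, price)
  WHERE sales_state = 'onsale';
```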

It's possible there's some table bloat here, but given the 2-column index and the sheer number of rows the response time isn't entirely unreasonable, especially for a table with 170 (!!) columns that's likely to fit relatively few tuples into each page. If you can afford some downtime try VACUUM FULL to reorganize the table and rebuild the index. This will exclusively lock the table for some time while it rebuilds it. If you can't afford the downtime, see pg_reorg and/or CREATE INDEX CONCURRENTLY and ALTER INDEX ... RENAME TO.
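The lower-downtime rebuild path might look like the sketch below, with an illustrative temporary index name; CREATE INDEX CONCURRENTLY avoids blocking writes during the (slower) build, and only the swap at the end takes a short lock:

```sql
-- Build a replacement index without blocking writes (slower than a
-- plain CREATE INDEX, but the table stays fully usable meanwhile):
CREATE INDEX CONCURRENTLY index_cars_onsale_new
  ON cars (brand COLLATE pg_catalog."default",
           model_name COLLATE pg_catalog."default")
  WHERE sales_state::text = 'onsale'::text;

-- Swap it in. DROP INDEX takes a brief exclusive lock on 9.1;
-- DROP INDEX CONCURRENTLY only arrives in 9.2.
DROP INDEX index_cars_onsale_on_brand_and_model_name;
ALTER INDEX index_cars_onsale_new
  RENAME TO index_cars_onsale_on_brand_and_model_name;
```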

You might find EXPLAIN (ANALYZE, BUFFERS, VERBOSE) more informative sometimes, as it can show buffer accesses, etc.
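For example, the query from the question rerun with buffer reporting; the extra "Buffers: shared hit=... read=..." lines in the output distinguish pages found in shared_buffers from pages read from the operating system:

```sql
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT "cars".*
FROM "cars"
WHERE "cars"."sales_state" = 'onsale'
  AND "cars"."brand" = 'BMW'
  AND "cars"."model_name" = '318i'
  AND "cars"."has_auto_gear" = TRUE
LIMIT 25 OFFSET 0;
```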

One option that may make this query faster (though it risks slowing other queries somewhat) is to partition the table on brand and enable constraint_exclusion. See the PostgreSQL partitioning documentation.
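On 9.1 that means inheritance-based partitioning; a rough sketch with an illustrative child table (declarative partitioning doesn't exist until much later):

```sql
-- Illustrative child table for one brand; inserts would need to be
-- routed here by a trigger or by application logic (not shown).
CREATE TABLE cars_bmw (
    CHECK (brand = 'BMW')
) INHERITS (cars);

-- With constraint_exclusion set to 'partition' (the default since 8.4),
-- the planner skips children whose CHECK constraints rule them out:
SET constraint_exclusion = partition;
```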



Author: Niels Kristian (Rails entrepreneur based in Denmark)

Updated on July 09, 2022

Comments

  • Niels Kristian
    Niels Kristian almost 2 years

    I am not very familiar with reading EXPLAIN ANALYZE results, and I have a huge problem with my queries being too slow. I have tried to read up on how to interpret the results of an EXPLAIN query, but I still don't know what I should be looking for or what might be wrong. I have a feeling there's some big red light flashing somewhere; I just don't see it.

    So the query is pretty simple, it looks like this:

    EXPLAIN ANALYZE SELECT "cars".* FROM "cars" WHERE "cars"."sales_state" = 'onsale' AND "cars"."brand" = 'BMW' AND "cars"."model_name" = '318i' AND "cars"."has_auto_gear" = TRUE  LIMIT 25 OFFSET 0
    

    And the result like this:

    Limit  (cost=0.00..161.07 rows=25 width=1245) (actual time=35.232..38.694 rows=25 loops=1)
      ->  Index Scan using index_cars_onsale_on_brand_and_model_name on cars  (cost=0.00..1179.06 rows=183 width=1245) (actual time=35.228..38.652 rows=25 loops=1)
            Index Cond: (((brand)::text = 'BMW'::text) AND ((model_name)::text = '318i'::text))
        Filter: has_auto_gear
    Total runtime: 38.845 ms
    

    A little background: I'm on PostgreSQL 9.1.6, running on Heroku's dedicated databases. My DB has approx. 7.5 GB of RAM; the cars table contains 3.1M rows, and approx. 2.0M of those rows have sales_state = 'onsale'. The table has 170 columns. The index it uses looks something like this:

    CREATE INDEX index_cars_onsale_on_brand_and_model_name
      ON cars
      USING btree
      (brand COLLATE pg_catalog."default" , model_name COLLATE pg_catalog."default" )
      WHERE sales_state::text = 'onsale'::text;
    

    Anyone seeing some big obvious issue?

    EDIT:

    SELECT pg_relation_size('cars'), pg_total_relation_size('cars');
    

    pg_relation_size: 2058444800 pg_total_relation_size: 4900126720

    SELECT pg_relation_size('index_cars_onsale_on_brand_and_model_name');
    

    pg_relation_size: 46301184

    SELECT avg(pg_column_size(cars)) FROM cars limit 5000;
    

    avg: 636.9732567210792995

    WITHOUT THE LIMIT:

    EXPLAIN ANALYZE SELECT "cars".* FROM "cars" WHERE "cars"."sales_state" = 'onsale' AND "cars"."brand" = 'BMW' AND "cars"."model_name" = '318i' AND "cars"."has_auto_gear" = TRUE
    
    Bitmap Heap Scan on cars  (cost=12.54..1156.95 rows=183 width=4) (actual time=17.067..55.198 rows=2096 loops=1)
      Recheck Cond: (((brand)::text = 'BMW'::text) AND ((model_name)::text = '318i'::text) AND ((sales_state)::text = 'onsale'::text))
      Filter: has_auto_gear
  ->  Bitmap Index Scan on index_cars_onsale_on_brand_and_model_name  (cost=0.00..12.54 rows=585 width=0) (actual time=15.211..15.211 rows=7411 loops=1)
            Index Cond: (((brand)::text = 'BMW'::text) AND ((model_name)::text = '318i'::text))
    Total runtime: 56.851 ms
    
    • Neville Kuyt
      Neville Kuyt over 11 years
      Try vacuuming - postgresql.org/docs/8.1/static/maintenance.html. The query plan looks reasonable, but the time certainly doesn't!
    • Niels Kristian
      Niels Kristian over 11 years
      Just before I ran the query, I ran a full vacuum and an analyze as well... My website is a search engine for used cars, so the timing is far from acceptable. My goal is to get the total time down to less than 1 sec. Do you think that is at all possible, or will I have to look for a different technology than a relational database?
    • Craig Ringer
      Craig Ringer over 11 years
      @NielsKristian I think a big part of the problem might be the "170 columns" part. How big is the table? SELECT pg_relation_size('cars'), pg_total_relation_size('cars');. Also SELECT pg_relation_size('index_cars_onsale_on_brand_and_model_name'); to get the index size. What's the average row width? SELECT avg(pg_column_size(cars)) FROM cars limit 5000;
    • jcern
      jcern over 11 years
      I don't see it posted above, but I was wondering if you also have an index on has_auto_gear?
    • Craig Ringer
      Craig Ringer over 11 years
      @jcern There's a partial index with a filter condition on has_auto_gear. Same thing but faster for this purpose.
    • Neville Kuyt
      Neville Kuyt over 11 years
      Can you tell us what happens when you remove the LIMIT clause?
    • Niels Kristian
      Niels Kristian over 11 years
      @CraigRinger I updated the stats in the EDIT, above... :-)
    • Niels Kristian
      Niels Kristian over 11 years
      @NevilleK I updated with the stats above
    • Craig Ringer
      Craig Ringer over 11 years
      That's a 4.5GB table including TOAST tables and indexes; 2GB for the raw table without external storage. The index is tiny, though, 44MB. Each row is 600 bytes wide on average, which is pretty big but not insanely huge. I'd expect better performance than this. I'd be curious to know how it performed if you dumped the table, loaded it onto a local PostgreSQL instance on a half-decent computer and tested it there.
    • Niels Kristian
      Niels Kristian about 8 years
      Another great tool to look at is: tatiyants.com/pev/#/plans/new
  • Niels Kristian
    Niels Kristian over 11 years
    Hi, thanks for your explanations. Just before I ran the query, I did a full vacuum.
  • Craig Ringer
    Craig Ringer over 11 years
    @NielsKristian Looks like milliseconds to me. At a guess you're reading it in DE style notation, eg "123.456.789,50" instead of US/AU/UK style "123,456,789.50". AFAIK Pg doesn't localize times in EXPLAIN ANALYZE, so it should be 35 milliseconds.
  • Craig Ringer
    Craig Ringer over 11 years
    @NielsKristian Consider partitioning if all else fails; see edit to answer. Also, consider editing your question to show EXPLAIN (ANALYZE, BUFFERS, VERBOSE) results.
  • Niels Kristian
    Niels Kristian over 11 years
    Thanks for all the great help! I have certainly gained a whole lot more knowledge about debugging SQL. The solution turned out to be reducing the number of columns in the table and being more precise about what I select. Secondly, I made some better-fitting indexes.
  • B Seven
    B Seven almost 6 years
    I'll keep this important info in mind. "close enough for military contracting accounting".
  • Todd
    Todd over 5 years
    @CraigRinger thanks for your detailed explanation. Would you mind also adding in a bit about the meanings behind the cost values? They seem arbitrary to me, just that "high is bad, low is good".
  • Craig Ringer
    Craig Ringer over 5 years
    @Todd Pretty much. The estimates are somewhat arbitrary costs in abstract units, based on number of expected random page fetches, sequential I/O page fetches, tuples processed in memory, etc.