why not take a look at postgresql ? (9): parsing, rewriting, planning and explaining

August 2, 2012 — Leave a comment

postgresql, as well as oracle, heavily depends on accurate statistics for being able to provide the best execution plan for a given query ( both use a cost based optimizer ).

in postgresql the components involved for creating and executing the best execution plan are:

a, from an oracle perspective, special feature of the rewriter is, that you can create custom rules.
a simple example for a table containing three rows:

sysdba@[local]:1540/dbs200# create table t1 ( a integer, b char(20) );
CREATE TABLE
sysdba@[local]:1540/dbs200*# insert into t1 values ( 1, 'text1' );
INSERT 0 1
sysdba@[local]:1540/dbs200*# insert into t1 values ( 2, 'text2' );
INSERT 0 1
sysdba@[local]:1540/dbs200*# insert into t1 values ( 3, 'text3' );
INSERT 0 1
sysdba@[local]:1540/dbs200*# COMMIT;
COMMIT

let’s say the second row is such important that you do not want to allow changes to it. to achieve this you could create a trigger or you may create a rule:

sysdba@[local]:1540/dbs200# create rule myrule as on update to t1 where old.a = 2 do instead nothing;
CREATE RULE
sysdba@[local]:1540/dbs200# update t1 set b = 'blabla' where a=2;
UPDATE 0
sysdba@[local]:1540/dbs200*# select * from t1 where a=2;
 a |          b           
---+----------------------
 2 | text2               
(1 row)

such rules can be created for insert, update and delete statements and add additional conditions to statements on the tables in question. check the documentation for a complete description on this.

the optimizer in general does the same job as the optimizer in oracle does, except that you can not give hints to the optimizer in postgresql.

of course you can create indexes in postgresql for optimizing access to the data of interest. postgres provides four types of indexes:

  • B-tree: same as in oracle
  • hash: uses hash tables and is only available for “=” operations
  • Generalzied Search Tree ( GiST ): a special type often used for geometric or full text search purposes
  • Generalized Inverted Index ( GIN ): an index used for lists and arrays
    • a more detailed view on indexes in another post.

      as mentioned above the planner/optimizer is responsible for creating the best possible execution plan. a plan internally is a tree which consists of several sub plans. as with oracle there are different choices the planner/optimizer has available:

      • sequential scan: read the whole table and check each row for a match, same as in oracle
      • indexscan: read the index for matches and go to the table for reading the row. it is important to know that postgres always has to visit the table to check if a row is allowed to see in the current transaction ( no index only access possible ). this is because postgres saves the “undo” data in the table itself but not in the index
      • bitmap index scan: scans the index for matches, save the results in a bitmap in memory, sorts the bitmap in the order of the table ( that is sort by block numbers ) and than reads the table. this suppresses jumping between the index and the table all the time.
      • nested loop join: scan the outer table and then go the inner table, same as in oracle
      • hash join: creates a hash table from one the of the joined tables and then uses the hash values for searching the other table(s), same as in oracle
      • merge join: first sorts the joined tables ( depending on the join ) and then reads the tables in parallel

      in oracle you can either use the explain command, the dbms_xplan package, autotrace or give hints to the optimizer for displaying explain plans. in postgresql it’s the explain command. using the same table from above to display the explain plan for a simple select you would do:

      sysdba@[local]:1540/dbs200*# explain select * from t1;
                            QUERY PLAN                      
      ------------------------------------------------------
       Seq Scan on t1  (cost=0.00..16.90 rows=690 width=88)
      (1 row)
      sysdba@[local]:1540/dbs200*# 
      

      as I know the table only contains three rows, the 690 rows reported above a far from being correct. same issue as in oracle: if the statistics are not good, the plans will not be good, too.
      the cost reporting is special in postgres:
      “cost=0.00..16.90” means: 0 for the start costs ( as this is a sequential scan the results can be send immediately as results are retrieved ) and 16.9 for the end costs ( that is the whole cost for executing the plan ).
      the width column reports the size of one result, so in total there would be 690 rows * 88 bytes = 59840 bytes ( depending on the current statistics ).

      lets check the statistics of the table:

      sysdba@[local]:1540/dbs200*# select relpages,reltuples from pg_class where relname = 't1';
       relpages | reltuples 
      ----------+-----------
              0 |         0
      (1 row)

      obviously wrong, that’s why the statistics reported by explain above are wrong, too. these numbers should change if statistics are generated:

      sysdba@[local]:1540/dbs200*# analyze verbose t1;
      INFO:  analyzing "public.t1"
      INFO:  "t1": scanned 1 of 1 pages, containing 3 live rows and 0 dead rows; 3 rows in sample, 3 estimated total rows
      ANALYZE
      sysdba@[local]:1540/dbs200*# select relpages,reltuples from pg_class where relname = 't1';
       relpages | reltuples 
      ----------+-----------
              1 |         3
      (1 row)
      

      much better. what does explain report now ?:

      sysdba@[local]:1540/dbs200*#  explain select * from t1;
                          QUERY PLAN                     
      ---------------------------------------------------
       Seq Scan on t1  (cost=0.00..1.03 rows=3 width=25)
      (1 row)
      

      much better, too. wonder how the costs get calculated ? as in oracle oracle the costs are depended on the cost of a disk read on the cost for the cpu to process the rows. in postgresql there are two parameters which specify this:

      sysdba@[local]:1540/dbs200*# show seq_page_cost;
       seq_page_cost 
      ---------------
       1
      (1 row)
      sysdba@[local]:1540/dbs200*# show cpu_tuple_cost;
       cpu_tuple_cost 
      ----------------
       0.01
      (1 row)
      

      the number reported here are the default values. of course you can tweak these by specifying the parameters in the server parameter file.

      so the costs are: ( 1 * 1 ) + ( 3 * 0.01 ) = 1.03 ( one page read + three times 0.01 for processing the three rows ).

      as plans might lie because they are based on assumptions and statistics you can use “explain analyze” ( which really executes the plan ) for comparing the calculated costs against the real costs ( in oracle you can do this by passing the gather_plan_statistics hint and calling the dbms_xplan.display function with the ‘ADVANCED’ parameter ):

      sysdba@[local]:1540/dbs200*# explain analyze select * from t1;
                                               QUERY PLAN                                          
      ---------------------------------------------------------------------------------------------
       Seq Scan on t1  (cost=0.00..1.03 rows=3 width=88) (actual time=0.011..0.026 rows=3 loops=1)
       Total runtime: 0.150 ms
      (2 rows)
      

      the value reported for loops is only interesting for joins as this reports how often a particular step was executed.

      what about indexes ? let’s generate some more data for the t1 table and create an index:

      sysdba@[local]:1540/dbs200# insert into t1 select * from t1 where a = 1;
      INSERT 0 1
      sysdba@[local]:1540/dbs200*# insert into t1 select * from t1 where a = 1;
      INSERT 0 2
      sysdba@[local]:1540/dbs200*# insert into t1 select * from t1 where a = 1;
      INSERT 0 4
      sysdba@[local]:1540/dbs200*# insert into t1 select * from t1 where a = 1;
      INSERT 0 8
      sysdba@[local]:1540/dbs200*# insert into t1 select * from t1 where a = 1;
      INSERT 0 16
      sysdba@[local]:1540/dbs200*# insert into t1 select * from t1 where a = 1;
      INSERT 0 32
      sysdba@[local]:1540/dbs200*# insert into t1 select * from t1 where a = 1;
      INSERT 0 64
      sysdba@[local]:1540/dbs200*# insert into t1 select * from t1 where a = 1;
      INSERT 0 128
      sysdba@[local]:1540/dbs200*# insert into t1 select * from t1 where a = 1;
      INSERT 0 256
      sysdba@[local]:1540/dbs200*# insert into t1 select * from t1 where a = 1;
      INSERT 0 512
      sysdba@[local]:1540/dbs200*# insert into t1 select * from t1 where a = 1;
      INSERT 0 1024
      sysdba@[local]:1540/dbs200*# COMMIT;
      COMMIT
      sysdba@[local]:1540/dbs200# CREATE INDEX I1 ON T1(A);
      CREATE INDEX
      sysdba@[local]:1540/dbs200*# COMMIT;
      COMMIT
      sysdba@[local]:1540/dbs200# \d t1
               Table "public.t1"
       Column |     Type      | Modifiers 
      --------+---------------+-----------
       a      | integer       | 
       b      | character(20) | 
      Indexes:
          "i1" btree (a)
      Rules:
          myrule AS
          ON UPDATE TO t1
         WHERE old.a = 2 DO INSTEAD NOTHING
      

      let’s see what happens if we query for a=1 now:

      sysdba@[local]:1540/dbs200*# explain analyze select * from t1 where a = 1;
                                                   QUERY PLAN                                             
      ----------------------------------------------------------------------------------------------------
       Seq Scan on t1  (cost=0.00..40.62 rows=2048 width=25) (actual time=0.030..6.873 rows=2048 loops=1)
         Filter: (a = 1)
       Total runtime: 12.110 ms
      (3 rows)
      

      as expected the index is not used ( too many rows will match the criteria a=1 ). let’s check the table statistics and see if the cost are as expected:

      sysdba@[local]:1540/dbs200*# select relpages,reltuples from pg_class where relname = 't1';
       relpages | reltuples 
      ----------+-----------
             15 |      2050
      

      so now there are 15 pages with 2050 rows: ( 1 * 15 ) + ( 2050 * 0.01 ) = 35.5
      not really what is expected, the cost is reported higher by explain. that’s because another parameter comes into the game when there is a where clause:

      sysdba@[local]:1540/dbs200*# show cpu_operator_cost;
       cpu_operator_cost 
      -------------------
       0.0025
      (1 row)
      

      again, the whole table will be scanned, each result will be processed and additionally each result will be checked against the condition ( this is the cpu_operator_cost ), so:
      ( 1 * 15 ) + ( 2050 * 0.01 ) + ( 2050 * 0.0025 ) = 40.6250 which is almost the cost reported above.

      what happens if the index will get used ?:

      sysdba@[local]:1540/dbs200# explain analyze verbose select * from t1 where a = 3;
                                                        QUERY PLAN                                                   
      ---------------------------------------------------------------------------------------------------------------
       Index Scan using i1 on public.t1  (cost=0.00..8.27 rows=1 width=25) (actual time=0.020..0.023 rows=1 loops=1)
         Output: a, b
         Index Cond: (t1.a = 3)
       Total runtime: 0.060 ms
      (4 rows)
      

      this are two random reads ( one for the index one for the table ) which is 8, which comes from the random_page_cost parameter:

      sysdba@[local]:1540/dbs200*# show random_page_cost;
       random_page_cost 
      ------------------
       4
      (1 row)
      

      so in total it is 8 plus the overhead for the cpu.
      when it comes to joins the procedure is the same as for every other cost based rdbms: make sure the statements perform well ( and produce the right plan ) for every single table joined in the statement. if this is fine, the join will be fine, too.

      for playing with different settings postgresql provides some parameters which can be set dynamically. but be careful: as with the hints you can give to the oracle optimizer these parameters should not be used to permanently fix your plans. if the plans are wrong, check your statistics, and even more important: know your data. know your data. know your data.

No Comments

Be the first to start the conversation!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.