updates in postgresql

March 15, 2013 — Leave a comment

if you read the title you might think there is not much to say about. but if you come from the oracle world there is actually a huge difference in how postgresql handles updates in contrast to oracle.

let’s start with a simple table:

create table t1 ( a int );
insert into t1 (a) values (1);
insert into t1 (a) values (2);
insert into t1 (a) values (3);

for pointing out the differences I’ll use an extension called pageinspect. for making use of this extension you’ll have to create it:

create extension pageinspect;

this will create a couple of functions which assist in inspecting database pages:

\dx+ *inspect*
    Objects in extension "pageinspect"
            Object Description            
------------------------------------------
 function bt_metap(text)
 function bt_page_items(text,integer)
 function bt_page_stats(text,integer)
 function fsm_page_contents(bytea)
 function get_raw_page(text,integer)
 function get_raw_page(text,text,integer)
 function heap_page_items(bytea)
 function page_header(bytea)

coming back to the test table which contains three rows ( or tuples in postgresql wording ) right now. making use of the heap_page_items and get_raw_page functions of the pageinspect extension we may display some internals of the table’s page:

select lp, lp_len, t_xmin
     , t_xmax, lp_off, t_ctid 
  from heap_page_items(get_raw_page('t1', 0));
 lp | lp_len | t_xmin | t_xmax | lp_off | t_ctid 
----+--------+--------+--------+--------+--------
  1 |     28 |   9271 |      0 |   8160 | (0,1)
  2 |     28 |   9272 |      0 |   8128 | (0,2)
  3 |     28 |   9273 |      0 |   8096 | (0,3)

not going too much into the details right now this tells us that page zero of table t1 contains three rows. this is consistent with the inserts from above. what happens if we update one of the rows ?

update t1 set a=4 where a=1;

from an oracle perspective the answer will be clear: the row in question will be read, updated and written back once committed ( although it is the whole block or page that is read and written rather than the individual row ). oracle uses the undo data to create the old row version for others if they request the row before the commit.

let’s see what postgresql did:

SELECT lp, lp_len, t_xmin
     , t_xmax, lp_off, t_ctid 
  from heap_page_items(get_raw_page('t1', 0));
 lp | lp_len | t_xmin | t_xmax | lp_off | t_ctid 
----+--------+--------+--------+--------+--------
  1 |     28 |   9271 |   9274 |   8160 | (0,4)
  2 |     28 |   9272 |      0 |   8128 | (0,2)
  3 |     28 |   9273 |      0 |   8096 | (0,3)
  4 |     28 |   9274 |      0 |   8064 | (0,4)

ups, four rows/tuples now. what happened ? the first point you might have noticed is the t_xmax that got set for the first row. t_xmin is the transaction id that did the insert of the rows/tuples. t_xmax is the transaction that will get set if a tuple/row either gets updated or deleted. this actually means that transactions after t_xmax will not see this row/tuple. this is how postgresql implements multiversion concurrency control ( mvcc ). in other words: transactions below 9274 will see the first row, transaction greater 9274 will not see the first row anymore.

what actually happened is a delete of the first row and an insert of the fourth row and that’s the difference between oracle and postgresql. in postgresql an update is a delete followed by an insert. you might wonder what happens to the deleted row as it is still present in the page. that’s were vacuum kicks in:

vacuum verbose t1;
INFO:  vacuuming "public.t1"
INFO:  "t1": found 1 removable, 3 nonremovable row versions in 1 out of 1 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
(sysdba@[local]:1540)[ppas92] > SELECT lp, lp_len, t_xmin
                                        , t_xmax, lp_off, t_ctid 
                                     from heap_page_items(get_raw_page('t1', 0));
 lp | lp_len | t_xmin | t_xmax | lp_off | t_ctid 
----+--------+--------+--------+--------+--------
  1 |      0 |        |        |      4 | 
  2 |     28 |   9272 |      0 |   8160 | (0,2)
  3 |     28 |   9273 |      0 |   8128 | (0,3)
  4 |     28 |   9274 |      0 |   8096 | (0,4)

vacuum takes care of the expired rows and deletes the pointers. if you did not disable the autovacuum process the database takes care of this on its own and you do not need to worry about it.
one point to remember is that the space for the deleted row only will get released if you do a vacuum full:

vacuum full verbose t1;
INFO:  vacuuming "public.t1"
INFO:  "t1": found 0 removable, 3 nonremovable row versions in 1 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
VACUUM
(sysdba@[local]:1540)[ppas92] > SELECT lp, lp_len, t_xmin
                                     , t_xmax, lp_off, t_ctid 
                                  from heap_page_items(get_raw_page('t1', 0));
 lp | lp_len | t_xmin | t_xmax | lp_off | t_ctid 
----+--------+--------+--------+--------+--------
  1 |     28 |   9290 |      0 |   8160 | (0,1)
  2 |     28 |   9291 |      0 |   8128 | (0,2)
  3 |     28 |   9292 |      0 |   8096 | (0,3)

otherwise the space will just get reused when appropriate.

Advertisements

No Comments

Be the first to start the conversation!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s