Archives For November 30, 1999

the previous post linked to the first sql implementation called system r. this system was based on a paper called A Relational Model of Data for Large Shared Data Banks by E.F.Codd which introduced the main concepts for relational database systems.

this is definitely worth reading as it introduced a lot of terms and basics still valid today.

system r

February 3, 2013 — Leave a comment

if you are interested in where todays databases are based on you might interested in the following link:

A History and Evaluation of System R

for example these requirements ( taken from the link above ) have been defined back in the 1970ies and are still valied today:

  1. To provide a high-level, nonnavigational user interface for maximum user productivity and data independence
  2. To support different types of database use including programmed transactions, ad hoc queries, and report genereration
  3. To support a rapidly changing database environment, in which tables, indexes, views, transactions, and other objects could easily be added to and removed from the database without stopping the system.
  4. To support a population of many concurrent users, with mechanisms to protext the integrity of the database in a concurrent-update environment
  5. To provide a means of recovering the contents of the database to a consistent state after a failure of hardware or software
  6. To provide a flexible mechanism whereby diffent views of stored data can be defined and various users can be authorized to query and update these views
  7. To support all of the above functions with a level of performance comparable to exsiting lower-function database systems.

as it is with oracle postgresql relies on various statistics to produce the explain plan for a query. the catalog table which stores the statistical information is pg_statistic.

for the purpose of this post let’s play with a simple table and see what gets stored in the pg_statistic table.

drop table if exists t1;
create table t1 ( a int );

I will always use the same query for producing the output from the pg_statistics table:

select pa.attname
     , ps.stawidth
     , ps.stadistinct
     , ps.stanullfrac
     , ps.stavalues1
     , ps.stavalues2
     , ps.stavalues3
     , ps.stavalues4
     , ps.stavalues5
  from pg_class pc
     , pg_statistic ps
     , pg_attribute pa
 where pc.relname = 't1'
   and pc.relowner = ( select nspowner 
                         from pg_namespace
                        where nspname = 'public' )
   and pc.oid = ps.starelid
   and pa.attnum = ps.staattnum
   and pa.attrelid = pc.oid;

If you run the query right now it will not return any rows as pg_statistics gets populated once you run analyze and at least one row is inserted into the table ( but not before ).

--exec statistics-query from above:
 attname | stawidth | stadistinct | stanullfrac | stavalues1 | stavalues2 | stavalues3 | stavalues4 | stavalues5 
---------+----------+-------------+-------------+------------+------------+------------+------------+------------
(0 rows)

So let’s create a row and see what’s happening:

insert into t1 (a) values (1);
analyze t1;
--exec statistics-query from above:
 attname | stawidth | stadistinct | stanullfrac | stavalues1 | stavalues2 | stavalues3 | stavalues4 | stavalues5 
---------+----------+-------------+-------------+------------+------------+------------+------------+------------
 a       |        4 |          -1 |           0 |      |      |      |      | 
(1 row)

what does this tell ?

  • the column “a” is reported to have a width of 4 which corresponds to the definition of the integer data-type.
  • distinct values is reported as “-1” which seems a little but surprising for the moment as it should be 1, shouldn’t it ? in fact this is equal to one as it tells that there is statistically one occurrence for the value
  • the number of null values is zero, of course

let’s create another row and have a look at the results:

insert into t1 (a) values (2);
analyze t1;
--exec statistics-query from above:
 attname | stawidth | stadistinct | stanullfrac | stavalues1 | stavalues2 | stavalues3 | stavalues4 | stavalues5 
---------+----------+-------------+-------------+------------+------------+------------+------------+------------
 a       |        4 |          -1 |           0 | {1,2}      |      |      |      | 

the only column that changed is the stavalues1 column. as soon as there is more than one row in the table this column gets populated. distinct values is still reported as “-1” as each row in the table has a different value. this will not change as long as you insert unique values and not too much of them ( more on his later ):

insert into t1 (a) values (generate_series(3,10));
analyze t1;
--exec statistics-query from above:
 attname | stawidth | stadistinct | stanullfrac |       stavalues1       | stavalues2 | stavalues3 | stavalues4 | stavalues5 
---------+----------+-------------+-------------+------------------------+------------+------------+------------+------------
 a       |        4 |          -1 |           0 | {1,2,3,4,5,6,7,8,9,10} |      |      |      | 
(1 row)

once you insert the same values again the stadistinct column gets updated to reflect the change in the data distribution:

insert into t1 (a) values (generate_series(1,10));
analyze t1;
--exec statistics-query from above:
 attname | stawidth | stadistinct | stanullfrac |       stavalues1       | stavalues2 | stavalues3 | stavalues4 | stavalues5 
---------+----------+-------------+-------------+------------------------+------------+------------+------------+------------
 a       |        4 |        -0.5 |           0 | {1,2,3,4,5,6,7,8,9,10} |      |      |      | 

now there is a value of “-0.5” which essentially tells that there are about 2 entries for each distinct value. we can do the same again and the stadistinct value decreases again to match data ( around three occurrences for each distinct value ):

insert into t1 (a) values (generate_series(1,10));
analyze t1;
--exec statistics-query from above:
 attname | stawidth | stadistinct | stanullfrac |       stavalues1       | stavalues2 | stavalues3 | stavalues4 | stavalues5 
---------+----------+-------------+-------------+------------------------+------------+------------+------------+------------
 a       |        4 |   -0.333333 |           0 | {1,2,3,4,5,6,7,8,9,10} |      |      |      | 
(1 row)

as it looks right now postgresql stores all the values of a column another time in the pg_statistic catalog table. as this would be bad practice let’s if this changes once we insert more data:

insert into t1 (a) values (generate_series(11,10000));
analyze t1;
--exec statistics-query from above:
 attname | stawidth | stadistinct | stanullfrac |       stavalues1       |        stavalues2    | stavalues3 | stavalues4 | stavalues5 
---------+----------+-------------+-------------+------------------------+----------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------+---
---------+------------+------------
 a       |        4 |   -0.998004 |           0 | {1,2,3,4,5,6,7,8,9,10} | {11,110,210,310,410,510,610,710,810,910,1009,1109,1209,1309,1409,1509
,1609,1709,1809,1908,2008,2108,2208,2308,2408,2508,2608,2708,2807,2907,3007,3107,3207,3307,3407,3507,3607,3706,3806,3906,4006,4106,4206,4306,440
6,4506,4605,4705,4805,4905,5005,5105,5205,5305,5405,5504,5604,5704,5804,5904,6004,6104,6204,6304,6403,6503,6603,6703,6803,6903,7003,7103,7203,73
02,7402,7502,7602,7702,7802,7902,8002,8102,8201,8301,8401,8501,8601,8701,8801,8901,9001,9100,9200,9300,9400,9500,9600,9700,9800,9900,10000} |      |      | 
(1 row)

now we get the next stavalues column populated in steps of a hundred. the first stavalues columns remains unchanged in this example.

in oracle words the stavaluesN columns are similar to a histogram. it’s a statistical representation of the data distribution which is then used to generate execution plans. more to come …

just noticed another difference between postgres and oracle:

oracle:

SQL> create table t1 ( a number(10,2 ) );
Table created.
SQL> create view v1 as select a from t1;
View created.
SQL> alter table t1 modify ( a number(11,2));
Table altered.

postgresql:

[postgres] > create table t1 ( a numeric(10,2) );
CREATE TABLE
[postgres] > create view v1 as select a from t1;
CREATE VIEW
[postgres] > alter table t1 alter column a type numeric(11,2);
ERROR:  cannot alter type of a column used by a view or rule
DETAIL:  rule _RETURN on view v1 depends on column "a"

the same is true if you want to drop a table which has a view defined on it:
oracle:

SQL> drop table t1;
Table dropped.

postgres:

[postgres] > drop table t1;
ERROR:  cannot drop table t1 because other objects depend on it
DETAIL:  view v1 depends on table t1
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

just noticed that you can disable table locks and thus prevent ddl on a table in oracle.

create table t1 ( a number );
alter table t1 disable table lock;

according to the documentation this should prevent all ddls on this table. let’s see if it works:

alter table t1 modify ( a number(1,0 ));
alter table t1 modify ( a number(1,0 ))
*
ERROR at line 1:
ORA-00069: cannot acquire lock -- table locks disabled for T1

works if I try to modify a column definition. can you drop the table ?

drop table t1;
drop table t1
           *
ERROR at line 1:
ORA-00069: cannot acquire lock -- table locks disabled for T1

ok, for dropping, too. truncating ?

truncate table t1;
truncate table t1
               *
ERROR at line 1:
ORA-00069: cannot acquire lock -- table locks disabled for T1

what about adding a column ?

alter table t1 add ( b number );
Table altered.

hm…not sure if this is a bug, but think so ( this is 11.2.0.3 ). and an alter statement for sure is ddl.

on a 10.2.0.4 this does not work:

select version from v$instance;
VERSION
-----------------
10.2.0.4.0
alter table t1 add ( b number );
alter table t1 add ( b number )
*
ERROR at line 1:
ORA-00069: cannot acquire lock -- table locks disabled for T1

when working with oracle there is exactly one numeric data type one may use for storing numbers ( and that’s number ). in postgresql there are ten. what happened when we migrated an oracle database to postgresql was, that a source table in oracle took much less space than the same table in postgresql. thanks to some people on the pgsql-performance mailing list one of the reasons for this was, that the wrong numeric data type was chosen for storing the integers in postgresql.

a simple test-case:

drop table datatype1;
drop table datatype2;
create table datatype1
( a numeric ( 8,0 )
, b numeric ( 8,0 )
);
insert into datatype1 ( a, b )
       values ( generate_series ( 1, 10000000 )
              , generate_series ( 1, 10000000 )
              );
create index idatatype1_1 on datatype1 ( a );
create index idatatype1_2 on datatype1 ( b );
create table datatype2
( a int
, b int
);
insert into datatype2 ( a, b )
       values ( generate_series ( 1, 10000000 )
              , generate_series ( 1, 10000000 )
             );
create index idatatype2_1 on datatype2 ( a );
create index idatatype2_2 on datatype2 ( b );
analyze verbose datatype1;
analyze verbose datatype2;
select pg_size_pretty ( pg_relation_size ( 'datatype1' ) );
 pg_size_pretty 
----------------
 422 MB
(1 row)
select pg_size_pretty ( pg_relation_size ( 'datatype2' ) );
 pg_size_pretty 
----------------
 346 MB
(1 row)

for these two little tables the difference is about 76mb. depending on the statements this is 76mb more that needs to be scanned and this can have great impacts on performance. surprisingly, at least for me, this is not true for the indexes:

select pg_size_pretty ( pg_relation_size ( 'idatatype1_1' ) );
 pg_size_pretty 
----------------
 214 MB
(1 row)
select pg_size_pretty ( pg_relation_size ( 'idatatype2_1' ) );
 pg_size_pretty 
----------------
 214 MB
(1 row)
select pg_size_pretty ( pg_relation_size ( 'idatatype2_1' ) );
 pg_size_pretty 
----------------
 214 MB
(1 row)
select pg_size_pretty ( pg_relation_size ( 'idatatype2_2' ) );
 pg_size_pretty 
----------------
 214 MB
(1 row)

so, it’s worth to keep an eye on which data types to use for numbers …

tricky situation: planned to apply a patch to an oracle home and opatch failes with:

OPatch failed to locate Central Inventory.
Possible causes are: 
    The Central Inventory is corrupted
    The oraInst.loc file specified is not valid.
LsInventorySession failed: OPatch failed to locate Central Inventory.
Possible causes are: 
    The Central Inventory is corrupted
    The oraInst.loc file specified is not valid.

obviously the first thing to check is if the pointer ( oraInst.loc ) points to the correct location and the inventory group is listed correctly:

cat /etc/oraInst.loc 
inventory_loc=/opt/oracle/oraInventory
inst_group=oinstall

this is what it should look like assuming oinstall is really the inventory group and your inventory location is really /opt/oracle/oraInventory. so this part is fine but somehow the oraInventory got lost:

ls -la /opt/oracle/
total 312
drwxrwx--- 4 grid   oinstall   4096 Sep 19 21:54 .
drwxr-xr-x 4 root   root       4096 Mar 23 08:52 ..
drwxrwx--- 4 grid   oinstall   4096 Mar 23 08:43 product
drwxrwx--- 3 grid   oinstall   4096 May  3 23:52 stage

what to do ? restore it from backup would be one solution ( you do regular backups of your inventory, don’t you ? ). let’s assume no backup is available. you still may restore the inventory using the oracle installer available in the oracle home you want to register:

ls -la $ORACLE_HOME/oui
total 60
drwxr-xr-x  8 oracle oinstall 4096 Mar 26 14:23 .
drwxrwx--- 76 oracle oinstall 4096 May  3 23:17 ..
-rwxr-xr-x  1 oracle oinstall  323 Feb 17  2007 admin_langs.xml
drwxr-xr-x  3 oracle oinstall 4096 Mar 26 14:23 bin
-rw-r-----  1 oracle oinstall 6680 Mar 26 14:20 clusterparam.ini
drwxr-xr-x  2 oracle oinstall 4096 Mar 26 14:20 instImages
drwxr-xr-x  4 oracle oinstall 4096 Mar 26 14:20 jlib
drwxr-xr-x  3 oracle oinstall 4096 Mar 26 14:23 lib
-rwxr-xr-x  1 oracle oinstall 2110 Feb 17  2007 nlsrtlmap.xml
-rw-r-----  1 oracle oinstall 6782 Mar 26 14:20 oraparam.ini
drwxr-xr-x  3 oracle oinstall 4096 Mar 26 14:16 prov
-rwxr-xr-x  1 oracle oinstall  748 Feb 17  2007 runtime_langs.xml
drwxr-xr-x  2 oracle oinstall 4096 Mar 26 14:20 schema

even if the inventory is completely lost for recreating it a simple call to the oracle installer will recreate it and register the ORACLE_HOME:

$ORACLE_HOME/oui/bin/runInstaller -silent -attachHome ORACLE_HOME=$ORACLE_HOME ORACLE_HOME_NAME=H11203

if successful you should see the following output:

Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB.   Actual 2000 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /opt/oracle/oraInventory
'AttachHome' was successful.

… and opatch will succeed from now on.

another ouch with GI 11.2.0.3 on solaris 10 sparc 64bit: if one of your cluster nodes restarts and you can not find any evident reason for it despite some of these entries in the logs:

[cssd(1084)]CRS-1612:Network communication with node1 node (1) missing for 50% of timeout interval. Removal of this node from cluster in 14
.258 seconds
[cssd(1084)]CRS-1625:Node node1, number 1, was manually shut down
[cssd(1084)]CRS-1601:CSSD Reconfiguration complete. Active nodes are node2 .
[ctssd(1117)]CRS-2407:The new Cluster Time Synchronization Service reference node is host node2.
[crsd(1522)]CRS-5504:Node down event reported for node 'node1'.
[cssd(1084)]CRS-1601:CSSD Reconfiguration complete. Active nodes are node1 node2 .

… and:

[ CSSD][20](:CSSNM00018:)clssnmvDiskCheck: Aborting, 0 of 1 configured voting disks available, need 1
[ CSSD][20]###################################
[ CSSD][20]clssscExit: CSSD aborting from thread clssnmvDiskPingMonitorThread
[ CSSD][20]###################################
[ CSSD][20](:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
[ CSSD][20]

you probably hit bug 13869978. this seems only to happen if you are on external redundancy for the cluster diskgroup and therefore only one voting disk was created.

two solutions are available:

  • migrate the votings disk to an asm mirrored diskgroup ( normal or high redundancy )
  • or apply PSU4 on top of 11.2.0.3

there seems to be the same issue on linux.

this is work in progress, but shall show the similarities and differences between postgresql and oracle in regards to implementing schemas and code. I will try to add more and more things in the future and update this post and the samples accordingly. documentation in the scripts is not very well at the moment but it should be enough to start.

for now the samples include:

  • tables: standard columns and arrays
  • constraints: primary keys, foreign keys, check constraints
  • triggers
  • sequences
  • indexes
  • views
  • loading blobs / clobs
  • plsql packages -> pgsql
  • materialzed views
  • partitioning
  • anonymous plsql / pgplsql blocks

after all these postgresql posts I thought it’s time to look at same really cool features postgresql offers but oracle lacks. of course oracle has plenty of features other databases don’t provide, but this is true the other way round, too.

psql – sqlplus

the more I use the psql utility ( postgresql’s equivalent to sqlplus ) the more I love it. this tiny little tool has so many wonderful features that it is hard to give a complete overview. so, here are my favorites:

one of the best features psql offers are the various shortcuts one can use to query the catalog ( data dictionary ), control the output, display help for the various commands and move data in and out of the database.

first example: to list the available views in oracle you have to query the data dictionary ( either dict or v$fixed_view ). in psql, it’s as easy like this:

postgres=# \dvS
                       List of relations
   Schema   |              Name               | Type |  Owner   
------------+---------------------------------+------+----------
 pg_catalog | pg_available_extension_versions | view | postgres
 pg_catalog | pg_available_extensions         | view | postgres
 pg_catalog | pg_cursors                      | view | postgres
...

if you want even more information ( size and description in this example ) a “+” can always be appended:

postgres=# \dvS+
                                   List of relations
   Schema   |              Name               | Type |  Owner   |  Size   | Description 
------------+---------------------------------+------+----------+---------+-------------
 pg_catalog | pg_available_extension_versions | view | postgres | 0 bytes | 
 pg_catalog | pg_available_extensions         | view | postgres | 0 bytes | 
 pg_catalog | pg_cursors                      | view | postgres | 0 bytes | 

you can even use wildcards if you know parts of an object name but are not sure about the exact name:

postgres=# \dvS *index* 
                   List of relations
   Schema   |          Name          | Type |  Owner   
------------+------------------------+------+----------
 pg_catalog | pg_indexes             | view | postgres
 pg_catalog | pg_stat_all_indexes    | view | postgres
 pg_catalog | pg_stat_sys_indexes    | view | postgres
 pg_catalog | pg_stat_user_indexes   | view | postgres
 pg_catalog | pg_statio_all_indexes  | view | postgres
 pg_catalog | pg_statio_sys_indexes  | view | postgres
 pg_catalog | pg_statio_user_indexes | view | postgres
(7 rows)

the same is true for tables (\dt), functions (\df), tablespaces (\db) and all the other objects available. no need to create scripts for querying frequent used information.

another big plus is the integrated help. let’s assume you are not sure about how to exactly create an index. perhaps you do not need that so often that you remember the syntax. no need to search the documentation:

postgres-# \h CREATE INDEX        
Command:     CREATE INDEX
Description: define a new index
Syntax:
CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ name ] ON table [ USING method ]
    ( { column | ( expression ) } [ COLLATE collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
    [ WITH ( storage_parameter = value [, ... ] ) ]
    [ TABLESPACE tablespace ]
    [ WHERE predicate ]

that’s really cool.

editing functions directly ? not a problem with psql. let’s create a simple function ( this one is from the documentation ):

CREATE FUNCTION add(integer, integer) RETURNS integer
    AS 'select $1 + $2;'
    LANGUAGE SQL
    IMMUTABLE
    RETURNS NULL ON NULL INPUT;

if you know want to directly edit this function, just do:

postgres=# \ef add(integer, integer)

… change it, save it, execute it and you’re done.

what about getting data out of postgresql ? maybe there’s a requirement to load data to a data warehouse which is a database from another vendor. flat files almost always provide a robust way for transporting data. and this is pretty easy and very well integrated with psql. the command in question is the “copy” command.
exporting a table to a file is not a big deal:

postgres=# copy myschema.customers to '/tmp/customers.log';                 
COPY 20000
postgres=# \! head /tmp/customers.log
1	VKUUXF	ITHOMQJNYX	4608499546 ABC Way	\N	QSDPAGD	SD	24101	US	1	ITHOMQJNYX@abc.com	4608499546	11979279217775911	2012/03	user1	password	55	100000	M

that’s it ( in the simplest way ). there are some more switches ( csv, headers, delimiter, etc. ) to fine tune your export, just use the integrated help to see what’s around:

\h copy

loading data back into postgresql ? same command, the other way around:

postgres=# create table myschema.customers2 ( like myschema.customers );  
CREATE TABLE
postgres=# copy myschema.customers2 from  '/tmp/customers.log';
COPY 20000
postgres=# select * from myschema.customers2 limit 1;
 customerid | firstname |  lastname  |      address1       | address2 |  city   | state |  zip  | country | region |        email        |  
 phone    | creditcardtype |    creditcard    | creditcardexpiration | username | password | age | income | gender 
------------+-----------+------------+---------------------+----------+---------+-------+-------+---------+--------+---------------------+--
----------+----------------+------------------+----------------------+----------+----------+-----+--------+--------
          1 | VKUUXF    | ITHOMQJNYX | 4608499546 ABC Way |          | QSDPAGD | SD    | 24101 | US      |      1 | ITHOMQJNYX@abc.com | 4
608499546 |              1 | 1979279217775911 | 2012/03              | user1    | password |  55 | 100000 | M
(1 row)

easy, isn’t it?

and by the way: tired of writing “select * from some_table” all the time ? use the “table” command to query a table:

postgres=# table myschema.customers;
 customerid | firstname |  lastname  |      address1       | address2 |  city   | state |  zip  |   country    | region |        email      
  |   phone    | creditcardtype |    creditcard    | creditcardexpiration | username  | password | age | income | gender 
------------+-----------+------------+---------------------+----------+---------+-------+-------+--------------+--------+-------------------
--+------------+----------------+------------------+----------------------+-----------+----------+-----+--------+--------

if you are used to bash or some other shells which provide similar functionality you for sure use the command history ( arrow up and down ). it’s integrated with psql, too, out of the box ( yes, I know you may use rlwrap with sqlplus, but you still have to do some extra work for getting this to work ). and as the various shells have their startup control file, there is one for psql, too, which usually is located in the home directory of the os user and is called “.psqlrc”. like the login.sql and glogin.sql files in oracle you can define your setup here. but you can do even more. psql provides the ability to define variables, e.g. :

\set waits 'SELECT pg_stat_activity.procpid, pg_stat_activity.current_query, pg_stat_activity.waiting, now() - pg_stat_activity.query_start  as "totaltime", pg_stat_activity.backend_start FROM pg_stat_activity WHERE pg_stat_activity.current_query !~ \'%IDLE%\'::text AND pg_stat_activity.waiting = true;;'

…defines a variable which contains a sql statement for displaying current waits in the database. once defined you can easily reference it:

postgres=# :waits
 procpid | current_query | waiting | totaltime | backend_start 
---------+---------------+---------+-----------+---------------
(0 rows)

put this in your “.psqlrc” file and you’ll have your variable available all the time. really cool.

indexing

in postgresql there is the concept of a partial index. that means you can create an index on a subset of a table’s data. this is not possible in oracle. let’s do an example:

assume we a have a table which contains an ever increasing number, an entry for each hour of the year and a true/false flag for each row ( postgresql allows columns to be defined as boolean, cool ):

create table t1 ( a integer, b timestamp with time zone, c boolean ); 

before creating a partial index let’s populate the table with some test-data. this also introduces the generate_series function which is a very easy and effective way to generate some data:

insert into t1 ( a, b, c )
       values ( generate_series ( 1, 8761 )
              , generate_series ( timestamptz ( to_date('01.01.2012','DD.MM.YYYY') ) 
                                , timestamptz ( to_date('31.12.2012','DD.MM.YYYY') )
                                , interval '1h' 
                                )
              , 'T'
              );
update t1 set c = 'F' where mod(a,111) = 0;

now, assume there is a report which runs at the end of every month which is only interested in data which has the false flag set on column c ( maybe to to get all customers who did not pay their receipt :) ). you could create a normal index on column c, but you could also create a partial index for this:

create index i1 on t1 ( c ) where not c;

this will:
a) greatly reduce the size of the index
b) only index the data which fulfills the expression
c) exactly provides the data the report ask for

let’s see what explain tells about the statement the report uses:

indx=# analyze verbose t1;
INFO:  analyzing "public.t1"
INFO:  "t1": scanned 57 of 57 pages, containing 8761 live rows and 0 dead rows; 8761 rows in sample, 8761 estimated total rows
ANALYZE
indx=# explain analyze select * from t1 where not c;
                                                QUERY PLAN                                                 
-----------------------------------------------------------------------------------------------------------
 Index Scan using i1 on t1  (cost=0.00..12.85 rows=78 width=13) (actual time=0.016..0.056 rows=78 loops=1)
   Index Cond: (c = false)
 Total runtime: 0.098 ms
(3 rows)

exactly what I wanted. so, if you know the statements in your database and you know your data ( and you probably should :) ) partial indexes may provide a great opportunity.

granting and revoking

for sure you had the situation where you needed to grant select on all tables in a schema to another user. in oracle one would create a role, grant select for every single table in the source schema to that role and then grant the role to the target user ( if you do not want to grant to public, which is a bad idea anyway ).

in postgresql this is much easier. let’s setup a simple test case:

postgres=# create role usr1 login password 'usr1';
CREATE ROLE
postgres=# create role usr2 login password 'usr2';
CREATE ROLE
postgres=# create database beer owner=usr1;
CREATE DATABASE

in oracle a schema is almost the same thing as a user. in postgresql you’ll explicitly have to create a schema. otherwise the objects will get created in the public schema:

postgres=# \c beer usr1
Password for user usr1: 
You are now connected to database "beer" as user "usr1".
beer=> create schema myschema;
CREATE SCHEMA
beer=> \dn
   List of schemas
   Name   |  Owner   
----------+----------
 myschema | usr1
 public   | postgres
(2 rows)

as the schema is available now, tables can be created in the new schema:

beer=> create table myschema.t1 ( a int );
CREATE TABLE
beer=> create table myschema.t2 ( a int );
CREATE TABLE
beer=> create table myschema.t3 ( a int );
CREATE TABLE

granting select on all the tables in the schema is as easy as:

beer=> grant usage on schema myschema to usr2;
GRANT
beer=> grant select on all tables in schema myschema to usr2;
GRANT

without the “usage” grant the user will not be able to do anything in the schema. so be sure, to grant it before granting any other privileges.

to verify it:

beer=> \c beer usr2
Password for user usr2: 
You are now connected to database "beer" as user "usr2".
beer=> select * from myschema.t1;
 a 
---
(0 rows)
beer=> select * from myschema.t2;
 a 
---
(0 rows)
beer=> select * from myschema.t3;
 a 
---
(0 rows)

you can even grant select on specific columns ( which is not possible in oracle, too ):

beer=>\c postgres postgres
postgres=# create user usr3 login password 'usr3';
CREATE ROLE
postgres=# \c beer usr1
Password for user usr1: 
You are now connected to database "beer" as user "usr1".
beer=> grant usage on schema myschema to usr3;
beer=> grant select (a) on table myschema.t1 to usr3;
GRANT
beer=> \c beer usr3
Password for user usr3: 
You are now connected to database "beer" as user "usr3".
beer=> select a from myschema.t1;
 a 
---
(0 rows)

this can be very handy if you want to hide some columns or give select to just a few and do not want to create views on top of the table.

creating objects

creating objects ? you might think there is nothing special here. but wait :)
in oracle, if you’d like to create a table which has exactly the same columns as another table you’d do something like this:

create table new_table as select * from source_table where 1 = 2;

in postgresql you you can do the same:

d1=# create table t1 ( a int, b int );
CREATE TABLE
d1=#  create table t2 as select * from t1 where 1=2;
SELECT 0
d1=# \d t2
      Table "public.t2"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | integer | 
 b      | integer | 

but postgresql goes some steps further. for creating a similar table there is the “like” keyword. in the easiest way you would do:

d1=# create table t3 ( like t1 );
CREATE TABLE
d1=# \d t3
      Table "public.t3"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | integer | 
 b      | integer | 

nothing special so far. just an easier to understand and shorter syntax ( in my opinion ). but wait, check this out:

d1=# alter table t1 add constraint uk1 unique (a);
NOTICE:  ALTER TABLE / ADD UNIQUE will create implicit index "uk1" for table "t1"
ALTER TABLE
d1=# \d t1
      Table "public.t1"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | integer | 
 b      | integer | 
Indexes:
    "uk1" UNIQUE CONSTRAINT, btree (a)
d1=# create table t4 ( like t1 INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING COMMENTS );
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "t4_a_key" for table "t4"
CREATE TABLE
d1=# \d t4
      Table "public.t4"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | integer | 
 b      | integer | 
Indexes:
    "t4_a_key" UNIQUE CONSTRAINT, btree (a)

that’s cool. get an exact copy of the table’s definition with one command. you think there can’t be more goodies ? what about this? :

d1=# create table t5 ( a int, b int );
CREATE TABLE
d1=# create table t6 ( c date, d date );
CREATE TABLE
d1=# create table t7 ( a int, b int, c date, d date ) inherits ( t5, t6 );
NOTICE:  merging column "a" with inherited definition
NOTICE:  merging column "b" with inherited definition
NOTICE:  merging column "c" with inherited definition
NOTICE:  merging column "d" with inherited definition
CREATE TABLE
d1=# \d t7
      Table "public.t7"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | integer | 
 b      | integer | 
 c      | date    | 
 d      | date    | 
Inherits: t5,
          t6

what the hell is this ? some sort of object oriented mechanism ? let’s add some data and check the results:

d1=# insert into t7 values ( 1,1,current_date,current_date);
INSERT 0 1
d1=# select * from t5;
 a | b 
---+---
 1 | 1
(1 row)
d1=# select * from t6;
     c      |     d      
------------+------------
 2012-08-22 | 2012-08-22
(1 row)
d1=# select * from t7;
 a | b |     c      |     d      
---+---+------------+------------
 1 | 1 | 2012-08-22 | 2012-08-22
(1 row)

as you can see the tables are now dependent on each other. if data is added into table t7, the data is automatically present in the underlying t5 and t6 tables. but this does not work the other way around:

d1=# insert into t5 values (2,2);
INSERT 0 1
d1=# insert into t6 values ( current_date,current_date);
INSERT 0 1
d1=# select * from t7;
 a | b |     c      |     d      
---+---+------------+------------
 1 | 1 | 2012-08-22 | 2012-08-22
(1 row)

what happens if you change the column definitions ?

d1=# alter table t5 alter column a type bigint;
ALTER TABLE
d1=# \d t5
      Table "public.t5"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | bigint  | 
 b      | integer | 
Number of child tables: 1 (Use \d+ to list them.)
d1=# \d t7
      Table "public.t7"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | bigint  | 
 b      | integer | 
 c      | date    | 
 d      | date    | 
Inherits: t5,
          t6

automatically gets propagated. really nice … I don’t have a use case for this currently, but maybe this can be useful to reduce a table’s size into smaller tables depending on the columns. would like to hear if someone has used this feature and for what purpose. feel free to post comments.

time for the last one ( for now :) ). I am pretty sure plenty of databases use sequences to generate primary keys. in postgresql you can set the default for a table’s column to get populated from a specific sequence ( not possible in oracle ):

d1=# create sequence s1;
CREATE SEQUENCE
d1=# create table t8 ( a int primary key default nextval('s1'), b int );
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t8_pkey" for table "t8"
CREATE TABLE
d1=# insert into t8 (b) values (1);
INSERT 0 1
d1=# select * from t8;
 a | b 
---+---
 1 | 1
(1 row)

for sure there are lots and lots of more cool features, but for now this shall be enough …

UPDATE: Check this post for a use case for table inheritance in the sample scripts.