Archives For July 2012

while reading more about postgresql I came across a discussion which is relevant for oracle, too: implementing hints in the postgresql optimizer. actually it is more about why the postgresql people do not want to provide such a feature.

for development and testing purposes there might be situations where hints are useful. but hints do not solve the issue; they are always a workaround. I like the way the postgresql people think about this: fix the problem ( either in the application or the postgresql source code ).

check out the pros, cons and discussions:
postgresql optimizer hint discussion
Hinting at PostgreSQL

as databases tend to hold sensitive information this information should be protected as much as possible. oracle provides various tools for securing and auditing the database: database firewall, audit vault, enterprise security, database vault to name a few of them ( and for most of them you’ll need a separate license ) and of course all the privileges you can assign and revoke inside the database.

Roles, Groups and Passwords

as oracle does, postgres bases its internal security mechanisms on users and roles. users created in postgres are valid globally, that is: not specific to a single database. this means the set of users is the same for all databases. privileges can also be assigned to groups, which can be granted to users. if you are used to the oracle terms be aware that:

  • users in oracle are called roles in postgres
  • roles in oracle are called groups in postgres
  • sometimes the word role is used for both users and groups in postgres ( that is: login roles, which are users, and nologin roles, which are groups )

to create a new login role in the database the “create role” command is used:

CREATE ROLE "user1" LOGIN;
CREATE ROLE "user2" LOGIN PASSWORD 'user2';
# create a superuser
CREATE ROLE "user3" LOGIN PASSWORD 'user3' SUPERUSER;
# create a user and grant the privilege to create roles
CREATE ROLE "user4" LOGIN PASSWORD 'user4' CREATEROLE;
# create a user allowed to create databases
CREATE ROLE "user5" LOGIN PASSWORD 'user5' CREATEDB;
# create a user allowed to create databases and password validity
CREATE ROLE "user6" LOGIN PASSWORD 'user6' CREATEDB VALID UNTIL '2012-10-01';
# create a user and limit her amount of connections
CREATE ROLE "user7" LOGIN PASSWORD 'user7' CONNECTION LIMIT 2;

be careful if you create users like above and provide the password as a plain string. depending on your server configuration the passwords will be visible in the server’s logfile and the psql history:

LOG:  statement: CREATE ROLE "user1" LOGIN;
LOG:  statement: CREATE ROLE "user2" LOGIN PASSWORD 'user2';
LOG:  statement: CREATE ROLE "user3" LOGIN PASSWORD 'user2' SUPERUSER;
LOG:  statement: CREATE ROLE "role";
LOG:  statement: CREATE ROLE "user4" LOGIN PASSWORD 'user4' CREATEROLE;

as postgres internally stores the passwords as md5 hashes you can prevent this by providing the already hashed password when creating users:

CREATE USER someuser LOGIN PASSWORD 'md572947234907hfasf3';
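
the md5 string postgres expects is “md5” followed by the md5 hash of the password concatenated with the user name. a quick sketch of how to compute it on the shell, for example for user “user2” with password “user2”:

echo -n "user2user2" | md5sum
# prefix the result with "md5" and use it as the password string in the create role statement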

to get the encrypted password out of the database use the pg_authid view:

SELECT rolname, rolpassword FROM pg_authid;
 rolname  |             rolpassword             
----------+-------------------------------------
 sysdba   | md5448a3ec0e7a2689f0866afca52f91e13
 user1    | 
 user2    | md572881e285cdb0f9370dcdf1db0d9a869
 user3    | md53b24544e8f4b2a20f4bcca02a35df8fb
 user4    | md547e1c205dd73d4c06405bd08d255e320
 user5    | md51dc34834df4da4804236eb250118fb41
 user6    | md5bdf2912fce3ee3f6657bacc65527c7bd
 user7    | md5c5068c076d70d192c7f205a9bba4c469
 role1    | 

to create a group ( or role in oracle terms ) just skip the login attribute:

CREATE ROLE "role1";

granting groups to users:

GRANT ROLE1 TO USER1;

or

GRANT ROLE1 TO USER1 WITH ADMIN OPTION;

you can either use the psql shortcut to list the roles in the database server:

\du
                             List of roles
 Role name |                   Attributes                   | Member of 
-----------+------------------------------------------------+-----------
 role1     | Cannot login                                   | {}
 sysdba    | Superuser, Create role, Create DB, Replication | {}
 user1     |                                                | {role1}
 user2     |                                                | {}
 user3     | Superuser, Replication                         | {}
 user4     | Create role                                    | {}
 user5     | Create DB                                      | {}
 user6     | Create DB                                      | {}
 user7     | 2 connections                                  | {}

… or you may use the pg_roles view:

SELECT rolname,rolsuper,rolcreatedb,rolconnlimit,rolvaliduntil FROM pg_roles;
 rolname  | rolsuper | rolcreatedb | rolconnlimit |     rolvaliduntil      
----------+----------+-------------+--------------+------------------------
 sysdba   | t        | t           |           -1 | 
 user1    | f        | f           |           -1 | 
 user2    | f        | f           |           -1 | 
 user3    | t        | f           |           -1 | 
 user4    | f        | f           |           -1 | 
 user5    | f        | t           |           -1 | 
 user6    | f        | t           |           -1 | 2012-10-01 00:00:00+02
 user7    | f        | f           |            2 | 
 role1    | f        | f           |           -1 | 

to delete a role, just drop it:

DROP ROLE ROLE1;
# or to suppress error messages in case the role does not exist 
DROP ROLE IF EXISTS ROLE1;

to delete everything owned by a specific role:

DROP OWNED BY USER1;

you can even re-assign all objects from one role to another:

REASSIGN OWNED BY USER1 TO USER2;

granting / revoking privileges on objects is similar to oracle with a few exceptions. if you want to grant execute on a function you’ll have to specify the parameters, too:

GRANT EXECUTE ON FUNCTION FUNCTION1 ( int, int ) TO USER1;

you can grant a privilege on a whole schema ( tables, sequences and functions ) :

GRANT SELECT ON ALL TABLES IN SCHEMA A TO USER2;

you can grant privileges on a whole database:

GRANT ALL PRIVILEGES ON DATABASE DBS200 TO USER2;

you can change the owner of objects:

ALTER TABLE TEST1 OWNER TO USER2;

if you want to create objects in a separate schema ( public is the default ) you’ll have to create it first:

CREATE SCHEMA SCHEMA1;
CREATE TABLE SCHEMA1.TABLE1 ( A INTEGER );

set the search_path to avoid having to qualify objects with the schema in your commands:

SHOW search_path;
SET search_path TO schema1,public;

to display privileges either use the psql shortcut:

\z
                                    Access privileges
 Schema  |        Name        | Type  |   Access privileges   | Column access privileges 
---------+--------------------+-------+-----------------------+--------------------------
 public  | pg_stat_statements | view  | sysdba=arwdDxt/sysdba+| 
         |                    |       | =r/sysdba             | 
 schema1 | table1             | table |                       | 
(2 rows)

or query the information schema for a specific object:

SELECT * FROM information_schema.table_privileges WHERE table_name = 'table1';

Client Connections

in postgres there is one file which controls if and how clients connect to the database server. the file is called “pg_hba.conf” and is located in the data area of the database server. initdb automatically creates this file when the cluster is initialized.

in my case the file looks like this:

# TYPE  DATABASE        USER            ADDRESS                 METHOD
local   all             all                                     md5
host    all             all             ::1/128                 md5
local   replication     sysdba                                md5
host    replication     sysdba        127.0.0.1/32            md5

the first column is the type, which can be one of:

  • local: this is for unix domain sockets
  • host: this is for tcp/ip
  • hostssl: this is for ssl over tcp/ip
  • hostnossl: this is for tcp/ip connections which do not use ssl

the second and third columns specify the database name and the user the configuration is valid for. by specifying addresses you can enable individual hosts or networks to connect to the database server. the last column specifies the authentication method, which can be one of:

  • trust: this effectively disables authentication and should not be used
  • reject: rejects all connections which are valid for the entry
  • md5: password authentication using md5
  • password: password authentication ( clear text )
  • ident: use the os to authenticate the user

in addition to the methods above postgres provides support for pam, kerberos, gssapi, sspi for windows, radius and ldap. all for free, in contrast to oracle.

in general one should at least use md5 to provide minimum security. trust and ident should not be used in production environments.
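
as an example ( the network address below is made up ), an entry which allows md5 authenticated connections from a whole subnet to a single database could look like this:

# TYPE  DATABASE        USER            ADDRESS                 METHOD
host    db1             user1           192.168.56.0/24         md5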

check the documentation for a more detailed description ….

Auditing

for auditing in postgresql you’ll need to create triggers and/or functions. there is no out of the box module which you can use. but you can use several server parameters to log information to the server’s logfile.
a quick check on pgfoundry listed a project called “audittrail” which is still in beta status and whose last update was in 2007.
in the end, you’ll have to put more work into auditing in postgresql than in oracle. this may be a drawback for enterprise installations …
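
to give an idea of the do-it-yourself approach, here is a minimal sketch of a trigger based audit trail ( the audit table, function and trigger names are made up; the target table is the schema1.table1 example from above ):

CREATE TABLE audit_log (
  changed_at timestamptz NOT NULL DEFAULT now(),
  changed_by text        NOT NULL DEFAULT current_user,
  table_name text        NOT NULL,
  operation  text        NOT NULL,
  row_data   text
);
CREATE OR REPLACE FUNCTION audit_trigger() RETURNS trigger AS $$
BEGIN
  IF TG_OP = 'DELETE' THEN
    -- for deletes keep the old row, for inserts/updates keep the new one
    INSERT INTO audit_log ( table_name, operation, row_data )
    VALUES ( TG_TABLE_NAME, TG_OP, OLD::text );
    RETURN OLD;
  ELSE
    INSERT INTO audit_log ( table_name, operation, row_data )
    VALUES ( TG_TABLE_NAME, TG_OP, NEW::text );
    RETURN NEW;
  END IF;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER table1_audit
  AFTER INSERT OR UPDATE OR DELETE ON SCHEMA1.TABLE1
  FOR EACH ROW EXECUTE PROCEDURE audit_trigger();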

as oracle does, postgresql is controlled by plenty of parameters. not all of them need to be adjusted but some of them are important to understand. so I will set up a new postgresql database with a more detailed view on the parameters one should consider when going live ( I will not go into detail on how to lay out the filesystems, the focus is on the parameters ).

initially I will use the same setup as in the first post but adjust the most important parameters.

initial setup:

pg_ctl stop -D /opt/postgres/mydb -m fast
rm -rf /opt/postgres/mydb
initdb -D /opt/postgres/mydb -U sysdba -W
rm -f /opt/postgres/mydb/postgresql.conf

I deleted the sample configuration as I want to specify the most important parameters to fit my needs.

log messages are essential for the dba so one of the first things to do is to define where and how much the database server should log. there are several parameters which control this in postgresql:

  • log_destination: tells the server where to write logs to, can be one of: stderr, syslog, eventlog, csvlog
  • logging_collector: if on, the server will start its own logging process for catching logs from stderr and writing them to a log file
  • log_directory: the directory where the log files should go to
  • log_filename: the filename to use for the server log ( several place holders may be used to specify the format )
  • log_rotation_age: specifies the amount of time before rotating the log file
  • log_rotation_size: specifies the size the log file can reach before rotating the log file
  • log_truncate_on_rotation: if on, rotated log files will be overwritten
  • client_min_messages: controls how many and what messages are returned to the client ( DEBUG5-DEBUG1, LOG, NOTICE, WARNING, ERROR, FATAL, PANIC )
  • log_min_messages: controls how many and what messages are written to the log ( DEBUG5-DEBUG1, LOG, NOTICE, WARNING, ERROR, FATAL, PANIC )
  • log_autovacuum_min_duration: the time a vacuum operation may consume until it is reported in the logfile
  • log_error_verbosity: controls how detailed the output to the log file will be ( terse, default, verbose )
  • log_min_error_statement: additionally reports the statement that produced an error ( DEBUG5-DEBUG1, LOG, NOTICE, WARNING, ERROR, FATAL, PANIC )
  • log_min_duration_statement: additionally reports statements which took longer than specified
  • log_checkpoints: if on, logs checkpoints to the server’s log file
  • log_connections: logs each new database connection to the log file
  • log_disconnections: logs each disconnection to the log file
  • log_duration: logs the duration of every sql statement
  • log_hostname: converts ip addresses to hostnames in the log file
  • log_line_prefix: specifies the prefix for each line reported to the log ( various place holders available )
  • log_lock_waits: if on, every process waiting longer than deadlock_timeout for a lock will be reported
  • log_statement: specifies if and which sql statements will be written to the log file ( none, ddl, mod, all )
  • log_temp_files: specifies if a log entry will be written each time a temporary file gets deleted
  • log_timezone: specifies the timezone for the log entries

as you can see, the dba is given much more control over logging than in oracle. it clearly depends on the database and application what should be logged. to start, this set should be appropriate:

export PARAMFILE=/opt/postgres/mydb/postgresql.conf
echo "###### logging settings" >> $PARAMFILE
echo "logging_collector=on" >> $PARAMFILE
echo "log_truncate_on_rotation=on" >> $PARAMFILE
echo "log_filename='postgresql-%a.log'" >> $PARAMFILE
echo "log_rotation_age='8d'" >> $PARAMFILE
echo "log_line_prefix='%m - %l - %p - %u@%d '" >> $PARAMFILE
echo "log_directory='/var/log/'" >> $PARAMFILE
echo "log_min_messages='WARNING'" >> $PARAMFILE
echo "log_autovacuum_min_duration=360s" >> $PARAMFILE
echo "log_error_verbosity=default" >> $PARAMFILE
echo "log_min_error_statement=ERROR" >> $PARAMFILE
echo "log_duration_statement=5min" >> $PARAMFILE
echo "log_checkpoints=on" >> $PARAMFILE
echo "log_statement=ddl" >> $PARAMFILE
echo "client_min_messages='WARNING'" >> $PARAMFILE

once having specified the log settings it is time to think about the memory requirements. compared to the oracle settings there are not too many parameters to specify here:

  • shared_buffers: controls the amount of shared memory available to the whole database cluster. the initial size on my box is 32MB which is rather small.
  • temp_buffers: controls the amount of buffers used for temporary tables _per_ session.
  • work_mem: the amount of memory used for sort and hash operations per operation
  • maintenance_work_mem: the amount of memory used for maintenance operations such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY

although these settings strongly depend on the database and application requirements and the server’s hardware this could be a good start:

echo "###### memory settings" >> $PARAMFILE
echo "shared_buffers=256MB" >> $PARAMFILE
echo "temp_buffers=16MB" >> $PARAMFILE
echo "work_mem=4MB" >> $PARAMFILE
echo "maintenance_work_mem=16MB" >> $PARAMFILE

the next point to think about is the wal ( write ahead log ). as the wal files are essential for consistency and a production system should never go without archived logs these settings are critical. postgresql offers various parameters for controlling this ( only the most important here ):

  • fsync: should always be on ( default ) as this guarantees that committed transactions are written to disk
  • wal_buffers: size of the wal buffers inside the database’s shared memory ( comparable to the log_buffer in oracle )
  • synchronous_commit: if off, asynchronous writes to the wal files are enabled ( loss of transactions may occur, but no data inconsistency )
  • wal_writer_delay: the interval at which the wal writer process writes blocks to the wal files ( 200ms by default )
  • checkpoint_segments: the amount of checkpoint segments ( typically 16MB each ) available: comparable to oracle’s amount of redo logs
  • checkpoint_timeout: controls the frequency of checkpoints ( 5 minutes by default )
  • checkpoint_warning: controls how frequent checkpoints may occur until a warning is written to the log
  • checkpoint_completion_target: controls how fast checkpoints should complete ( 0.0 => fastest, 1.0 => slowest, which means the whole period between two checkpoints )
  • full_page_writes: should be on to ensure that whole pages are written to the wal after the first change following a checkpoint
  • wal_level: controls how much information is written to the wal files: minimal ( crash recovery ), archive ( wal based recovery ), hot_standby ( read only standby )
  • archive_mode: archiving of the wal files: on/off
  • archive_command: any command used to archive the wal files
  • archive_timeout: forces a wal switch after the given time so wal files get archived regularly
  • hot_standby: enables read only standby ( active dataguard in oracle terms )
  • max_wal_senders: controls the amount of standby databases this master can serve
  • wal_sender_delay: controls how often data gets replicated ( default is 200ms )

a reasonable configuration to start with ( standby databases are not in scope here ) could be:

echo "###### wal settings" >> $PARAMFILE
echo "fsync=on" >> $PARAMFILE
echo "wal_buffers=16MB" >> $PARAMFILE
echo "synchronous_commit=on" >> $PARAMFILE
echo "wal_writer_delay=200ms" >> $PARAMFILE
echo "checkpoint_segments=16" >> $PARAMFILE
echo "checkpoint_timeout=300s" >> $PARAMFILE
echo "checkpoint_warning=30s" >> $PARAMFILE
echo "checkpoint_completion_target=0.9" >> $PARAMFILE
echo "full_page_writes=on" >> $PARAMFILE
echo "wal_level=archive" >> $PARAMFILE
echo "archive_mode=on" >> $PARAMFILE
echo "archive_command='test ! -f /opt/postgres/arch/%f && cp %p /opt/postgres/arch/%f'" >> $PARAMFILE
echo "archive_timeout=10min" >> $PARAMFILE

as the vacuum and analyze processes are so important there are parameters to control this ( the most important here ):

  • autovacuum: enables the autovacuum launcher process
  • autovacuum_max_workers: controls how many autovacuum worker processes will be started
  • autovacuum_naptime: controls the minimum delay between autovacuum runs ( defaults to 1 minute )

adding them to the server’s parameter file:

echo "###### autovaccum settings" >> $PARAMFILE
echo "autovacuum=on" >> $PARAMFILE
echo "autovacuum_max_workers=3" >> $PARAMFILE
echo "autovacuum_naptime=5min" >> $PARAMFILE

one more parameter to specify is for loading the pg_stat_statements module from the contrib directory:

echo "###### pg_stat_statements" >> $PARAMFILE
echo "shared_preload_libraries='pg_stat_statements'" >> $PARAMFILE

keep in mind that this is only a set to start with, especially if you do not know how the application will behave. there are a bunch of more parameters which give you much more control over various aspects of the database. check the documentation for the complete reference.

once the database server is up and running, high availability is configured, maintenance scripts are running and backups are in place, normal operation starts. what’s still missing are the procedures and techniques to monitor the database server, so that you can take the necessary actions when you reach some limits or when users complain because performance drops.

oracle provides awr/ash reports based on various statistics to help the dba track down any issues and for proper capacity planning. what does postgres provide ? the third post already introduced the analyze command which is used for collecting various statistics about the tables in the database. so, basically, postgres collects statistics as oracle does. the question is how you can control this, and how you can use the various statistics to build reports that help you identify things of interest.

what, and how much, oracle is collecting is controlled by the parameter statistics_level. in postgres there are several parameters you can specify to control the gathering behaviour. the three most important are:

  • track_counts: enables collection of database wide statistics ( on by default )
  • track_functions: enables collection of usage of user defined functions ( off by default )
  • track_activities: enables collection of statistics of each command currently executed by any server process/session ( on by default )

the database uses a separate directory for storing the statistics in temporary files and to provide the statistics to other processes:

psql -U sysdba -d postgres -c "show stats_temp_directory;" stats_temp_directory 
----------------------
 pg_stat_tmp
(1 row)

of course you may and probably should change this to a fast filesystem ( the documentation recommends a RAM based filesystem ). once the server shuts down these statistics will get copied to the global directory to make them permanent. this procedure is slightly different from the oracle approach where the statistics are stored in the data dictionary ( that is the system/sysaux tablespace ).
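
if you want to follow that recommendation, a minimal sketch could look like this ( the mount point is an example and must be writable by the postgres os user; a reload is enough to activate the change ):

mkdir -p /run/pg_stats_tmp   # assumed to be backed by a ram based filesystem ( tmpfs )
echo "stats_temp_directory='/run/pg_stats_tmp'" >> /opt/postgres/mydb/postgresql.conf
pg_ctl reload -D /opt/postgres/mydb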

as oracle does, postgres provides a wide range of views to query the statistics of various areas of the database. the complete list can be found in the documentation.

what impresses me from an oracle point of view are the various functions which can be used to quickly check various statistics. for example, to query the number of transactions committed in the database you can simply do this:

SELECT pg_stat_get_db_xact_commit(d.oid)
  FROM pg_database d
 WHERE d.datname = 'postgres';

… which will report the committed transactions of the database postgres. in combination with psql’s ability to define aliases this can be a very powerful feature. to give you an example this will define a new alias in psql’s control file:

echo "\\set mypgstats 'SELECT pg_stat_get_db_xact_commit(d.oid) FROM pg_database d WHERE d.datname = \'postgres\';'" >> ~/.psqlrc

when you restart psql you can reference the alias:

sysdba@[local]:5432/postgres*# :mypgstats 
 pg_stat_get_db_xact_commit 
----------------------------
                       2965
(1 row)

… and what is even better ( at least on linux ): tab completion works, too. even for the aliases. a feature I’d definitely like to see in oracle.

the documentation lists a nice example to query the process ids with the commands they are executing:

SELECT pg_stat_get_backend_pid(s.backendid) AS procpid,
       pg_stat_get_backend_activity(s.backendid) AS current_query
    FROM (SELECT pg_stat_get_backend_idset() AS backendid) AS s;
 procpid |                           current_query                           
---------+-------------------------------------------------------------------
    4107 | 
    4581 | SELECT pg_stat_get_backend_pid(s.backendid) AS procpid,          +
         |        pg_stat_get_backend_activity(s.backendid) AS current_query+
         |     FROM (SELECT pg_stat_get_backend_idset() AS backendid) AS s;
(2 rows)

very easy…impressed.

as mentioned above, according to the documentation these statistics will be stored permanently once the database server stops. I wondered if the counters get reset or are ever-increasing:

[postgres@postgres ~]$ psql -U sysdba -d postgres -c "SELECT pg_stat_get_db_xact_commit(d.oid) FROM pg_database d WHERE d.datname = 'postgres';"
 pg_stat_get_db_xact_commit 
----------------------------
                       2992
(1 row)

[postgres@postgres ~]$ pgstop
waiting for server to shut down..... done
server stopped
[postgres@postgres ~]$ pgstart
server starting
[postgres@postgres ~]$ psql -U sysdba -d postgres -c "SELECT pg_stat_get_db_xact_commit(d.oid) FROM pg_database d WHERE d.datname = 'postgres';"
 pg_stat_get_db_xact_commit 
----------------------------
                       2994

the values are ever-increasing. so, for getting meaningful values you’ll need to store some of the values including the time and compare them to future values of the same statistics. to reset the counters pg_stat_reset is used:

SELECT pg_stat_reset();

I didn’t figure out if there is anything comparable to the oracle awr snapshots of which one may generate reports ( any hints on that are welcome ).
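
a simple workaround is to sample the counters yourself and diff consecutive samples later. a minimal sketch ( table and column names are made up ):

CREATE TABLE stat_history (
  taken_at    timestamptz DEFAULT now(),
  datname     name,
  xact_commit bigint
);
-- run this periodically, e.g. from cron, and compare consecutive samples
INSERT INTO stat_history ( datname, xact_commit )
SELECT d.datname, pg_stat_get_db_xact_commit(d.oid) FROM pg_database d;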

besides the views and functions which provide the statistics there is one more nice system view which reports all the locks in the database:

sysdba@[local]:5432/postgres# select * from pg_locks;
  locktype  | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid  |      mode       | granted 
 ------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+------+-----------------+---------
  relation   |    12780 |    11000 |      |       |            |               |         |       |          | 3/21               | 4639 | AccessShareLock | t
  virtualxid |          |          |      |       | 3/21       |               |         |       |          | 3/21               | 4639 | ExclusiveLock   | t
 (2 rows)

… a nice and simple overview which can help in identifying causes of issues.

in linux as well as most of the unixes you may use top to query the current processes running on the system and their resource consumption. there is a separate project on the postgres website called ptop which provides a similar functionality for the postgres database sessions. sample output:

last pid:  1245;  load avg:  1.60,  1.29,  0.52;       up 0+00:02:25                   08:18:52
2 processes: 1 running, 1 sleeping
CPU states:  1.0% user,  0.0% nice,  1.8% system, 93.3% idle,  4.0% iowait
Memory: 229M used, 768M free, 52M buffers, 79M cached
Swap: 2000M free
  PID USERNAME PRI NICE  SIZE   RES STATE   TIME   WCPU    CPU COMMAND
 1246 lcspostg  20    0  461M 5356K run     0:00  0.02%  0.20% postgres: sysdba dbs200 [local] 
 1245 lcspostg  20    0  461M 4104K sleep   0:00  0.02%  0.20% postgres: sysdba postgres [local

ptop might not be available in your postgres distribution but it can easily be compiled from the contrib section of the source distribution. some vendors provide packaged versions as well.

for daily operations there is a nice plugin for nagios which saves you a lot of work and automates many of the daily tasks.

another extension one might consider is pg_stat_statements. it is available in the contrib directory of the source code distribution, too. once compiled and installed it provides an overview of the statements executed in the system:

select * from pg_stat_statements;
 userid | dbid  |       query        | calls | total_time | rows | shared_blks_hit | shared_blks_read 
--------+-------+--------------------+-------+------------+------+-----------------+------------------
     10 | 12780 | SELECT a.attname, +|     1 |   0.029022 |   13 |              34 |                4
        |       | ...                |       |            |      |                 |                  
( output shortened; the view also contains the shared_blks_written, local_blks_*, temp_blks_read and temp_blks_written columns )

more to come about the parser, rewriter and planner

this is the summary of the postgresql introduction posts available so far:

1. installation and basics
2. database creation
3. maintenance tasks
4. backup and restore
5. high availability
6. monitoring and troubleshooting basics
7. a rather more productive configuration
8. privileges and connections

… more to come soon

you did your dataguard setup and everything seems to be configured correctly. the broker configuration is fine, you can cross connect to each instance as sysdba, but when performing a switchover through dgmgrl you receive:

ORA-01017 : invalid username/password; logon denied

… and you even provided the sys password while connecting with dgmgrl:

dgmgrl sys/password@somedatabase

what went wrong ? just had this case and it took me an hour to figure it out: the sys password contained special characters. what a …. maybe this is mentioned somewhere in the documentation.

a quick way to test it:

if:

sqlplus sys/"somepasswordwithspecialcharacters"@somedb as sysdba

works, but the same connect without the quotes does not:

sqlplus sys/somepasswordwithspecialcharacters@somedb as sysdba

… try to change the password to anything without special characters and you will succeed ( remember to change the password for the password file, too ) ….
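
if you need to recreate the password file, something like this should do ( file name and location depend on your environment ):

orapwd file=$ORACLE_HOME/dbs/orapw$ORACLE_SID password=NewPasswordWithoutSpecials force=y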

the last post introduced how postgres implements wal archiving and how the archived logs can be used to recover the database. pretty similar to what oracle does. what about replicating these logs to another server and implementing some sort of dataguard ? in postgres terms this is called: log shipping. let’s see how it works.

creating a hot standby server ( active dataguard in oracle words )

as oracle does, postgres provides two different approaches for standby servers:

  • transfer the whole wal files one by one to the standby server, which is asynchronous and may lead to data loss. this is called file based log shipping.
  • record based log shipping, which sends the wal records incrementally to the standby server ( streaming )

before deciding to implement one of the above solutions carefully think about the limitations:

  • all database servers involved in the configuration must be on the same architecture: log shipping from a 32-bit system to a 64-bit one ( and vice versa ) is not possible
  • all databases involved in the configuration must use the same mountpoints. there is no feature like the various *_name_convert parameters as in oracle
  • all databases involved in the configuration must use the same major version of postgres. minor version differences might work but are not officially supported
  • if you want to upgrade to a new minor release, be sure to upgrade the standby database first
  • postgres requires the destination of the archived logs to be available on the master and the standby server ( at least if you do not want downtime of the primary database )

to give you an idea on what’s happening, this is from the official documentation: “In standby mode, the server continuously applies WAL received from the master server. The standby server can read WAL from a WAL archive (see restore_command) or directly from the master over a TCP connection (streaming replication). The standby server will also attempt to restore any WAL found in the standby cluster’s pg_xlog directory. That typically happens after a server restart, when the standby replays again WAL that was streamed from the master before the restart, but you can also manually copy files to pg_xlog at any time to have them replayed.

At startup, the standby begins by restoring all WAL available in the archive location, calling restore_command. Once it reaches the end of WAL available there and restore_command fails, it tries to restore any WAL available in the pg_xlog directory. If that fails, and streaming replication has been configured, the standby tries to connect to the primary server and start streaming WAL from the last valid record found in archive or pg_xlog. If that fails or streaming replication is not configured, or if the connection is later disconnected, the standby goes back to step 1 and tries to restore the file from the archive again. This loop of retries from the archive, pg_xlog, and via streaming replication goes on until the server is stopped or failover is triggered by a trigger file.”

to avoid any connection issues between the master and the standby it may be a good idea to disable the firewall on both nodes:

service iptables stop
chkconfig iptables off

for getting the extra tools to manage standby databases I additionally installed the contrib package:

yum install postgresql91-contrib

for me, the fastest way to get a standby server was to clone my virtual box vm I used during the previous post. this gave me exactly the same software and os. all steps that follow assume that both nodes are up and running and can connect to each other.

my primary node:
nodename: postgres
ip: 192.168.56.105

my slave node:
nodename: postgres2
ip: 192.168.56.106

as my vm was cloned, I stopped and deleted the database which was running on the clone:

pg_ctl stop -D /opt/postgres/mydb -m fast
[postgres@postgres2 ~]$ rm -rf /opt/postgres/mydb/*
[postgres@postgres2 ~]$ rm -rf /opt/postgres/backups/*

additionally I set up password less ssh connectivity for the postgres user between the nodes:

node postgres:

[postgres@postgres ~]$ ssh-keygen -t rsa
[postgres@postgres ~]$ ssh-copy-id -i .ssh/id_rsa.pub postgres@postgres2

node postgres2:

[postgres@postgres2 ~]$ ssh-keygen -t rsa
[postgres@postgres2 ~]$ ssh-copy-id -i .ssh/id_rsa.pub postgres@postgres

for simulating a real life scenario I will assume that the primary database is already configured for archiving ( in the real world one would archive to a network location accessible to all nodes ). to prepare this on the master:

echo "port=5432
max_connections = 100
shared_buffers = 32MB
log_destination = 'stderr'
logging_collector = on
log_directory = '/opt/postgres/log/'
log_filename = 'postgresql-%a.log'
log_file_mode = 0600
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
temp_tablespaces='temp1,temp2'
wal_level=hot_standby
archive_mode=on
archive_command = 'rsync -a %p postgres@postgres2:/opt/postgres/mydb/arch/%f'
max_wal_senders=5
listen_addresses='*'" > /opt/postgres/mydb/postgresql.conf
pg_ctl stop -D /opt/postgres/mydb -m fast
pg_ctl start -D /opt/postgres/mydb -l /opt/postgres/log/mydb_pgstartup.log

the parameter in question is the archive_command. if you need more information on this parameter check the documentation or the previous post. the reason for posting the whole configuration is that the steps should be repeatable.

to allow the standby database to connect to the master, add the following line to your masters pg_hba.conf file:

echo "host    replication     all            192.168.56.106/32      md5" >> /opt/postgres/mydb/pg_hba.conf

for enabling this we need to restart the database ( there is no need to restart if archiving is already enabled and you only need to change the parameter; then a reload is sufficient ). if everything works as expected this will be the only restart. for real life scenarios this means: if the master database has wal archiving enabled you do not need to restart the database, thus no downtime will be necessary. if archiving is not enabled, which hopefully is not the case for any production database, you will need to request a short downtime for enabling it.
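
for reference, a reload ( instead of a restart ) can be done like this:

pg_ctl reload -D /opt/postgres/mydb
# or from within psql:
# select pg_reload_conf();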

ok, this is the starting point: primary database running in archive mode, nothing available on the standby server except the archived wal files from the master:

[postgres@postgres2 ~]$ ls -la /opt/postgres/mydb/arch/
total 32776
drwxrwxr-x. 2 postgres postgres     4096 Jun 14 14:25 .
drwxr-xr-x. 6 postgres postgres     4096 Jun 13 13:29 ..
-rw-------. 1 postgres postgres 16777216 Jun 14 14:14 000000030000000000000041
-rw-------. 1 postgres postgres 16777216 Jun 14 14:23 000000030000000000000042
[postgres@postgres2 ~]$ 

now we are ready to clone the master database:

# on the master start the backup and copy the files to the standby database
psql -U sysdba -d postgres -c "SELECT pg_start_backup('backup');"
scp -pr /opt/postgres/mydb/* postgres@postgres2:/opt/postgres/mydb/
# enable recovery and start the standby database
echo "restore_command = 'cp -f /opt/postgres/mydb/arch/%f %p </dev/null'" > /opt/postgres/mydb/recovery.conf
echo "standby_mode = 'on'" >> /opt/postgres/mydb/recovery.conf
echo "primary_conninfo = 'host=192.168.56.105 user=sysdba password=admin123'" >> /opt/postgres/mydb/recovery.conf
echo "hot_standby=on" >> /opt/postgres/mydb/postgresql.conf
rm -f /opt/postgres/mydb/postmaster.pid
# start the standby database
pg_ctl start -D /opt/postgres/mydb -l /opt/postgres/log/mydb_pgstartup.log
# end backup on the primary database
psql -U sysdba -d postgres -c "SELECT pg_stop_backup();"

if everything went fine you should see log messages in the standby’s logfile like this:

LOG:  streaming replication successfully connected to primary
LOG:  consistent recovery state reached at 0/55000000
LOG:  database system is ready to accept read only connections

you may double check if the standby is really in recovery mode by:

sysdba@[local]:5432/postgres# select * from pg_is_in_recovery();
 pg_is_in_recovery 
-------------------
 t
(1 row)
sysdba@[local]:5432/postgres*# 

of course you want to failover if something happens to your primary database. in postgres you do this by specifying the trigger_file parameter in the recovery.conf file:

echo "trigger_file='/opt/postgres/mydb/failover'" >> /opt/postgres/mydb/recovery.conf

if you need to failover just touch this file. the standby database will notice this and initiate the failover ( of course after applying all the necessary log files ):

touch /opt/postgres/mydb/failover

… which will result in:

LOG:  trigger file found: /opt/postgres/mydb/failover
FATAL:  terminating walreceiver process due to administrator command
...
LOG:  archive recovery complete
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

to further verify if the replication is really working create a table on the master, insert a row and check if this data is available on the standby afterwards:

# primary
psql -U sysdba -d db1
sysdba@[local]:5432/db1# create table t1 ( a int ) ;
sysdba@[local]:5432/db1*# insert into t1 (a) values (1);
INSERT 0 1
sysdba@[local]:5432/db1*# commit;
COMMIT
# check on the standby
psql -d db1 -U sysdba
\d+ t1
                  Table "public.t1"
 Column |  Type   | Modifiers | Storage | Description 
--------+---------+-----------+---------+-------------
 a      | integer |           | plain   | 
Has OIDs: no
sysdba@[local]:5432/db1# select * from t1;
 a 
---
 1
(1 row)

these are the basic steps to create a hot standby with postgres. although postgresql does not have a tool like rman, cloning a database is not a big deal, even if the source database is up and running.

of course you’ll need to carefully review all the parameters and test the setup before going to production.

as a dba one of the most important tasks is to restore the database in case of a crash, the total loss of a datacenter or even a user error. so every serious database must provide tools and techniques to backup and restore the system. postgresql provides different tools for different requirements:

cold backups

of course one can perform cold backups, which requires the database to be down. a simple tar of the data area and the backup is done. this can easily be scripted and implemented. it may be a solution for development or even test but not for production databases. a slightly different approach is to use rsync while the database is running, shut down the database once rsync has finished, do another rsync ( which will require less time ) and start up the database. this will reduce downtime, but still cannot be the solution for production databases.

snapshots

of course you can use snapshot technologies provided by the storage vendor or the filesystem ( if you have enabled wal archiving ).

pg_dump

one command to backup a postgres database is pg_dump. this command can be compared to the exp/expdp utilities of the oracle database. the result is a sql file which contains all the statements and data to rebuild the database up to the time the dump was taken.

dumping the database is straightforward:

pg_dump -U sysdba template1 > /opt/postgres/backups/dump1.dmp

as I did not create anything in the template1 database the dump is rather small:

cat /opt/postgres/backups/dump1.dmp 
--
-- PostgreSQL database dump
--
SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;
--
-- Name: template1; Type: COMMENT; Schema: -; Owner: sysdba
--
COMMENT ON DATABASE template1 IS 'default template for new databases';
--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner: 
--
CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;
--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner: 
--
COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';
--
-- Name: public; Type: ACL; Schema: -; Owner: sysdba
--
REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM sysdba;
GRANT ALL ON SCHEMA public TO sysdba;
GRANT ALL ON SCHEMA public TO PUBLIC;
--
-- PostgreSQL database dump complete
--

… but you can see that the dump file is plain text.

what really surprised me are the various options you can give to the pg_dump utility. compression, for example, is a feature you have to pay for if you want to use it with expdp. it’s free with postgres:

pg_dump -U sysdba -Z 9 template1 > /opt/postgres/backups/dump1.dmp

of course the dump file will be binary in this case. in general, all you’ll need is there:

Options controlling the output content:
  -a, --data-only             dump only the data, not the schema
  -b, --blobs                 include large objects in dump
  -c, --clean                 clean (drop) database objects before recreating
  -C, --create                include commands to create database in dump
  -E, --encoding=ENCODING     dump the data in encoding ENCODING
  -n, --schema=SCHEMA         dump the named schema(s) only
  -N, --exclude-schema=SCHEMA do NOT dump the named schema(s)
  -o, --oids                  include OIDs in dump
  -O, --no-owner              skip restoration of object ownership in
                              plain-text format
  -s, --schema-only           dump only the schema, no data
  -S, --superuser=NAME        superuser user name to use in plain-text format
  -t, --table=TABLE           dump the named table(s) only
  -T, --exclude-table=TABLE   do NOT dump the named table(s)
  -x, --no-privileges         do not dump privileges (grant/revoke)
  --binary-upgrade            for use by upgrade utilities only
  --column-inserts            dump data as INSERT commands with column names
  --disable-dollar-quoting    disable dollar quoting, use SQL standard quoting
  --disable-triggers          disable triggers during data-only restore
  --inserts                   dump data as INSERT commands, rather than COPY
  --no-security-labels        do not dump security label assignments
  --no-tablespaces            do not dump tablespace assignments
  --no-unlogged-table-data    do not dump unlogged table data
  --quote-all-identifiers     quote all identifiers, even if not key words
  --serializable-deferrable   wait until the dump can run without anomalies
  --use-set-session-authorization
                              use SET SESSION AUTHORIZATION commands instead of
                              ALTER OWNER commands to set ownership

what is missing, from an oracle perspective, is some sort of parallel option and encryption.

to restore the dump file, one uses:

psql -U sysdba db1 < /opt/postgres/backups/dump1.dmp

things to consider:

  • this will not create the database itself, so create the database from template0 before starting the import: createdb -T template0 db1
  • all users must exist before importing the dump

just to mention it: there are two other formats you can use with pg_dump:

  • the tar format: pg_dump -U sysdba -Ft db1 > filename.tar
  • the custom format: pg_dump -U sysdba -Fc db1 > filename.dmp

for both of these commands you have to use pg_restore instead of psql:

pg_restore -U sysdba -d dbname filename.dmp

choosing one of these formats enables you to create a table of contents of the dumpfile:

pg_restore -l filename.tar > filename.toc
cat filename.toc
;
; Archive created at Thu Jun 21 13:34:31 2012
;     dbname: db1
;     TOC Entries: 8
;     Compression: 0
;     Dump Version: 1.12-0
;     Format: TAR
;     Integer: 4 bytes
;     Offset: 8 bytes
;     Dumped from database version: 9.1.4
;     Dumped by pg_dump version: 9.1.4
;
;
; Selected TOC Entries:
;
2710; 1262 16384 DATABASE - db1 sysdba
5; 2615 2200 SCHEMA - public sysdba
2711; 0 0 COMMENT - SCHEMA public sysdba
2712; 0 0 ACL - public sysdba
161; 3079 12506 EXTENSION - plpgsql 
2713; 0 0 COMMENT - EXTENSION plpgsql 

if you’d like to do a partial restore you can now edit the file, comment out the objects you do not want to get created and pass the adjusted table of contents to pg_restore:

pg_restore -U sysdba -d db1 -L filename.toc filename.dmp

… and you’re done. keep in mind that you may destroy dependencies when commenting out objects.

pg_dumpall

another utility to perform dumps is pg_dumpall. in contrast to pg_dump, pg_dumpall exports roles and tablespaces, too. so, if you want to export all databases:

pg_dumpall -U sysdba > /opt/postgres/backups/dump1.dmp

… and to restore the dump:

psql -U sysdba -f /opt/postgres/backups/dump1.dmp db1

what do we have until now ? we can create and restore dumps of a single database ( and some subsets ), which provides some sort of backup. for a production system this will not be enough as data comes into the database and gets modified every moment. shutting down the database and doing a filesystem backup is possible, too, but will not solve the issue with backups of production databases. in oracle you can enable archiving to force the database to archive the redo logs. how does postgres handle this ?

wal archiving

postgres writes every change to the database to its logfiles ( as oracle does ) which are located in the pg_xlog directory in the data area. as with oracle, these logs are primarily used to replay the changes when the database comes up after a crash occurred.

the procedure to do online backups with postgres is to combine file system backups with wal ( write ahead log ) archiving.

to enable wal archiving you obviously need to set some parameters and do some preparation. the first parameter you need to set is wal_level:

echo "wal_level=archive" >> /opt/postgres/mydb/postgresql.conf

next one needs to enable archiving:

echo "archive_mode=on" >> /opt/postgres/mydb/postgresql.conf

nothing special so far if you know oracle dataguard. the interesting parameter is the third one: archive_command. this parameter specifies the command to use for archiving the logfiles ( or archivable wal segments, in postgres terms ). this means postgres gives you full flexibility on how to actually do the archiving. for example, you can specify:

archive_command = 'test ! -f /opt/postgres/arch/%f && cp %p /opt/postgres/arch/%f'

or if you want to call a custom script:

archive_command = 'my_allinone_archiving_script'

let’s see if it works. restart the database and check the startup file for any errors:

pgstop
pgstart
cat /opt/postgres/log/mydb_pgstartup.log

seems to work, at least no errors reported during the startup. but does the archiving work? :

pg
sysdba@[local]:5432/postgres# select * from pg_switch_xlog();
 pg_switch_xlog 
----------------
 0/1B7D5E8
(1 row)

if everything works we should now be able to locate the archived log:

ls -la /opt/postgres/arch/
total 49160
drwxrwxr-x. 2 postgres postgres     4096 Jun 13 14:18 .
drwxr-xr-x. 6 postgres postgres     4096 Jun 13 13:29 ..
-rw-------. 1 postgres postgres 16777216 Jun 13 14:17 000000010000000000000001
-rw-------. 1 postgres postgres 16777216 Jun 13 14:17 000000010000000000000002
-rw-------. 1 postgres postgres 16777216 Jun 13 14:18 000000010000000000000003

seems to work. what else needs to be done? we need a level 0 backup, right ?

connect to postgres and start the backup by calling the pg_start_backup function and provide a label:

pg
SELECT pg_start_backup('my_level_0_backup',true);

this will create a backup label file in the data area:

cat /opt/postgres/mydb/backup_label 
START WAL LOCATION: 0/5000020 (file 000000010000000000000005)
CHECKPOINT LOCATION: 0/5000020
BACKUP METHOD: pg_start_backup
START TIME: 2012-06-13 14:23:19 CEST
LABEL: my_level_0_backup

now you can backup all the datafiles with a method of your choice, for example:

tar -cvf my_level_0_backup.tar /opt/postgres/mydb/*

reconnect to postgres and stop the backup mode:

pg
SELECT pg_stop_backup();
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
 pg_stop_backup 
----------------
 0/60000A0
(1 row)

you’re done when the file reported by pg_stop_backup has arrived at the defined location. from now on it is essential that you copy the archived files to a safe location in case you need to apply them to the base backup during a recovery.
in oracle terms this procedure is very similar to “alter database begin backup”, “alter database end backup”…..

if you’d like to save some work there is another command called: pg_basebackup. this tool will do all the work for you: starting the backup, copying the files, ending the backup. check that you have an entry like the following in your pg_hba.conf-file:

host    replication     replication             all                     md5 

… and that the parameter max_wal_senders is high enough:

echo "max_wal_senders=5" >> /opt/postgres/mydb/postgresql.conf

if ok, you are ready:

pg_basebackup -U sysdba -D /opt/postgres/backups/ -Ft -x -z
ls -la /opt/postgres/backups/
total 6444
drwxrwxr-x. 2 postgres postgres    4096 Jun 13 14:45 .
drwxr-xr-x. 6 postgres postgres    4096 Jun 13 13:29 ..
-rw-rw-r--. 1 postgres postgres  733018 Jun 13 14:45 16384.tar.gz
-rw-rw-r--. 1 postgres postgres     121 Jun 13 14:45 16389.tar.gz
-rw-rw-r--. 1 postgres postgres     119 Jun 13 14:45 16391.tar.gz
-rw-rw-r--. 1 postgres postgres 5845363 Jun 13 14:45 base.tar.gz

this includes all the files in the data area directory, including configuration files, tablespaces and third party files. so this backup could be extracted and you could start the server directly.

a backup without a proof that the restore is actually possible is not worth anything. let’s try the restore:

restore using the archive backup(s)

the first thing you need to do, obviously, is to shut down the postgres server if it is running and to make a backup of all the files in the data area ( just in case you’ll need any of the configuration or data files later ).

pgstop
tar -cvf /opt/postgres/backups/current_state.tar /opt/postgres/mydb/*
# if you are short on space, compress the tar archive
gzip /opt/postgres/backups/current_state.tar

next, clean up the current data area by deleting all the files and directories ( remember to clean up any tablespace directories in case they are located elsewhere ):

rm -rf /opt/postgres/mydb/*

time to restore. locate your backup and extract the files to the data area:

cp /opt/postgres/backups/16* /opt/postgres/backups/base.tar.gz /opt/postgres/mydb/
cd /opt/postgres/mydb/
gunzip *
tar -xvf base.tar
tar -xvf 16391.tar
tar -xvf 16389.tar
tar -xvf 16384.tar
rm -f *.tar

delete the wal files currently present in the pg_xlog/ directory as they might not be current:

rm -f /opt/postgres/mydb/pg_xlog/*

as with the archive_command above you’ll need to define the restore_command parameter ( recovery settings are maintained in a separate file called recovery.conf ):

echo "restore_command='cp /opt/postgres/arch/%f \"%p\"'" >> /opt/postgres/mydb/recovery.conf

ok, let’s start the server. postgres will start in recovery mode. tail your server logfile to see what’s happening:

pgstart

the logfile reports:

LOG:  database system was interrupted; last known up at 2012-06-13 14:45:08 CEST
LOG:  creating missing WAL directory "pg_xlog/archive_status"
LOG:  starting archive recovery
LOG:  restored log file "00000001000000000000000A" from archive
LOG:  redo starts at 0/A000078
LOG:  consistent recovery state reached at 0/B000000
LOG:  restored log file "00000001000000000000000B" from archive
...
LOG:  archive recovery complete
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

seems the recovery worked. can we connect to the database again ?

pg
select * from pg_database;

alright. database restored, work done. if you want to do a point in time recovery you’ll need to add one of the recovery target parameters to your recovery.conf file.

one more thing to mention: every time you recover the database from the archived wals the database creates a new timeline. you can think of it as the oracle incarnations. if you check the location where the database places the archived files you will notice a small history file which describes the new timeline:

cat /opt/postgres/arch/00000002.history
1	00000001000000000000000B	no recovery target specified

additionally the wal files which get archived from now on will have an increased number to reflect the new timeline:

pg
select * from pg_switch_xlog();
 pg_switch_xlog 
----------------
 0/C000098
(1 row)

after the logswitch you can confirm the new timeline by checking the filename of the archived wal:

ls -ltra /opt/postgres/arch/
-rw-------. 1 postgres postgres 16777216 Jun 13 15:13 00000001000000000000000B
-rw-------. 1 postgres postgres       56 Jun 13 15:41 00000002.history
-rw-------. 1 postgres postgres 16777216 Jun 13 15:53 00000002000000000000000C

the timeline id increased from 1 to 2.

conclusion: postgres provides the tools which help you in establishing your backup and restore processes. once you are familiar with the tools, backing up and restoring the database is easy and robust. of course you’ll need to test the procedures before going to production and regular restore tests are a must.

do you think of dataguard ? when there is archiving there should be a possibility to apply these archives to another database …

as with oracle there are some maintenance tasks one should perform with the postgres database. while the first and second posts introduced the essentials for setting up the server and creating an initial database this post will look at the maintenance of the postgres database.

one of the most important maintenance tasks in postgres is: vacuuming. as described in the second post the vacuum launcher process is responsible for starting the vacuum worker processes for the databases. another important task is collecting statistics.

these two tasks are done by the vacuum and analyze commands, where:

  • analyze: is used to collect statistics about a database ( collects statistics of the tables in the database and stores them in the pg_statistic catalog ).
  • vacuum: is used to reclaim storage used by dead tuples ( for example produced by deletes )

let’s take a look at the two commands.

analyze

in oracle you would use the dbms_stats package to gather statistics. in postgres it is the analyze command. a description of all the options, as usual, can be found in the documentation.

without any arguments the analyze command will collect the statistics for every table in the database:

sysdba@[local]:5432/postgres*# analyze;
ANALYZE
sysdba@[local]:5432/postgres*#

a nice argument, especially if you are new to postgres, is the verbose argument. when executed with this argument analyze reports the tables which are analyzed. this gives you some hints on the internal tables ( catalog ):

sysdba@[local]:5432/postgres*# analyze verbose;
INFO: analyzing "pg_catalog.pg_type"
INFO: "pg_type": scanned 7 of 7 pages, containing 309 live rows and 2 dead rows; 309 rows in sample, 309 estimated total rows
INFO: analyzing "pg_catalog.pg_attribute"
INFO: "pg_attribute": scanned 40 of 40 pages, containing 2155 live rows and 30 dead rows; 2155 rows in sample, 2155 estimated total rows
INFO: analyzing "pg_catalog.pg_authid"
INFO: "pg_authid": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
INFO: analyzing "pg_catalog.pg_proc"

additionally analyze reports various statistics collected for each table. this is a feature I’d like to see in oracle, too. especially when called automatically this information might be useful if you quickly need to compare statistics. it is not a big deal to save this information to a logfile and to keep some of them for every run of analyze.
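
a minimal sketch of how this could be automated ( psql prints the INFO messages to stderr, the logfile location is made up ):

psql -U sysdba -d postgres -c "analyze verbose;" 2>> /opt/postgres/log/analyze_$(date +%Y%m%d).log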

there are two more parameters analyze accepts. of course you can specify a single table to collect statistics for:

sysdba@[local]:5432/postgres# analyze verbose pg_catalog.pg_am;
INFO: analyzing "pg_catalog.pg_am"
INFO: "pg_am": scanned 1 of 1 pages, containing 4 live rows and 0 dead rows; 4 rows in sample, 4 estimated total rows
ANALYZE

… and you can tell analyze to collect statistics for a single column:

analyze verbose pg_catalog.pg_am(amname);
INFO: analyzing "pg_catalog.pg_am"
INFO: "pg_am": scanned 1 of 1 pages, containing 4 live rows and 0 dead rows; 4 rows in sample, 4 estimated total rows
ANALYZE

as collecting statistics on large tables requires a lot of resources and some amount of time postgres uses samples of these tables as oracle does. the size of the samples is controlled by the default_statistics_target parameter which defaults to 100. in contrast to oracle this is not a percentage, but: “The target value sets the maximum number of entries in the most-common-value list and the maximum number of bins in the histogram”.
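
to check the system wide setting and override it for the current session ( 200 is just an example value; put it into postgresql.conf to make it permanent ):

show default_statistics_target;
set default_statistics_target = 200;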

as with oracle you can override the system wide setting of the default_statistics_target by setting statistics on the column level…

alter table pg_catalog.pg_am alter column amname set statistics 99;
ALTER TABLE
sysdba@[local]:5432/postgres*#

… and directly set statistical values of columns ( only n_distinct and n_distinct_inherited at the moment ):

alter table pg_catalog.pg_am alter column amname set ( n_distinct = 5 );
ERROR: permission denied: "pg_am" is a system catalog

seems postgres does not allow directly updating catalog tables ( at least in the default configuration ). might be a good idea :)

vacuum

as mentioned above vacuum reclaims wasted/unused space. in oracle one would use rebuilding or shrinking. without any parameter vacuum will process every table in the database the current user has access to; here it is restricted to a single table:

sysdba@[local]:5432/postgres# vacuum pg_catalog.pg_am;
VACUUM
sysdba@[local]:5432/postgres#

as with the analyze command there is the verbose option:

vacuum (verbose) pg_catalog.pg_am;
INFO: vacuuming "pg_catalog.pg_am"
INFO: index "pg_am_name_index" now contains 4 row versions in 2 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "pg_am_oid_index" now contains 4 row versions in 2 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_am": found 0 removable, 4 nonremovable row versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
sysdba@[local]:5432/postgres#

… which gives you some interesting information of what is really done.

another argument to vacuum is analyze, so you may combine vacuuming and analyzing ( here on a single table rather than the whole database )…

vacuum (verbose,analyze) pg_catalog.pg_am;
INFO: vacuuming "pg_catalog.pg_am"
INFO: index "pg_am_name_index" now contains 4 row versions in 2 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "pg_am_oid_index" now contains 4 row versions in 2 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_am": found 0 removable, 4 nonremovable row versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "pg_catalog.pg_am"
INFO: "pg_am": scanned 1 of 1 pages, containing 4 live rows and 0 dead rows; 4 rows in sample, 4 estimated total rows
VACUUM

… which is handy for automation.
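
for example, a nightly cron job could look like this ( schedule and logfile are made up ):

30 2 * * * psql -U sysdba -d db1 -c "vacuum (verbose,analyze);" >> /opt/postgres/log/vacuum_db1.log 2>&1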

without the “full” parameter vacuum does not lock the table and simply reclaims space for reuse. if you want to give back space to the operating system ( by actually using less space for the database files ) you will need to call vacuum with the full parameter:

vacuum (verbose,analyze,full);
INFO: vacuuming "pg_catalog.pg_statistic"
INFO: "pg_statistic": found 29 removable, 370 nonremovable row versions in 15 pages
DETAIL: 0 dead row versions cannot be removed yet.
...

… but be careful: this will put an exclusive lock on the table currently processed and temporarily will use more space as the table gets copied. you should not do this while the database is accessed in production.

the last parameter, called “freeze”, relates to a specific design decision in postgres: the internal transaction IDs are limited to 32 bits. because of this transaction id wraparounds occur which could lead to data loss ( transaction ids higher than the current one are considered future and should not be visible to users, so a wraparound could be disastrous ). in postgres there is a special transaction id called FrozenXID. this transaction id is always considered older than every other transaction id. the assignment of the FrozenXID is done by the vacuum command. check the documentation for a detailed description of this behaviour.

to control this behaviour there is a parameter called vacuum_freeze_min_age ( specified in number of transactions ), which controls how old a transaction (xid) may be until it is replaced with the FrozenXID. thus, a larger value will preserve transactional information longer, while a smaller value will increase the number of transactions that can elapse before a table must be vacuumed again. the other parameter involved is called vacuum_freeze_table_age.

calling the vacuum command with the freeze parameter is equivalent to setting the vacuum_freeze_min_age to zero.

as the vacuum command can be very I/O intensive there might be side effects to current active users of the database. to minimize this effect the administrator can specify the vacuum_cost_delay parameter which will put the vacuum process to sleep if it reaches a defined cost limit ( vacuum_cost_limit ).
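
a sketch of the corresponding entries in postgresql.conf ( the values are only examples, not recommendations ):

vacuum_cost_delay = 20ms
vacuum_cost_limit = 200
vacuum_freeze_min_age = 50000000
vacuum_freeze_table_age = 150000000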

because these tasks are so important the autovacuum feature should be enabled, which is controlled by the autovacuum and track_counts parameters.
as this is only an introduction I will not mention the other parameters which are used to control the vacuum behaviour.

so, as it is with oracle: it is very important to carefully define your statistics collection strategy for the best performance. when it comes to postgres you need to think about the vacuum process, too. as postgres keeps old versions of data in the tables’ data files ( no undo like in oracle ) and the transaction ID will wrap around it is essential to be familiar with the parameters which control this process.

the next post will give an introduction to backup and restore…