why not take a look at postgresql ? (7): are rather more productive configuration

July 25, 2012 — Leave a comment

as oracle does, postgresql is controlled by plenty of parameters. not all of them need to be adjusted but some of them are important to understand. so I will setup a new postgresql database with a more detailed view on the parameters one should consider when going live ( I will not go into detail on how to layout the filesystems, the focus is on the parameters ).

initially I will use the same setup as in the first post but adjust the most important parameters.

initial setup:

pg_ctl stop -D /opt/postgres/mydb -m fast
rm -rf /opt/postgres/mydb
initdb -D /opt/postgres/mydb -U sysdba -W
rm -f /opt/postgres/mydb/postgresql.conf

I deleted the sample configuration as I want to specify the most important parameters to fit my needs.

log messages are essential for the dba so one of the first things to do is to define where and how much the database server should log. there are several parameters which control this in postgresql:

parameter description
log_destionation tells the server where to write logs to, can be one of: stderr, syslog, eventlog, cvslog
logging_collector if on, the server will start its own logging process for catching logs from stderr and writing them the a log file
log_directory the directory where the log files should go to
log_filename the filename to use for the server log ( several place holders may be used to specify the format )
log_rotation_age specifies the amount of time before rotating the log file
log_rotation_size specifies the size the log file can reach before rotating the log file
log_truncate_on_rotation if on, rotated log files will be overwritten
client_min_messages controls how many and what messages are returned to the client (DEBUG5-DEBUG1,LOG;NOTICE,WARNING,ERROR,FATAL,PANIC)
log_min_messages controls how many and what messages are written to the log (DEBUG5-DEBUG1,LOG;NOTICE,WARNING,ERROR,FATAL,PANIC)
log_autovacuum_min_duration the time a vacuum opration may consume until it is reported in the logfile
log_error_verbosity control how detailed the output to the log file will be ( terse, default, verbose )
log_min_error_statement additionally reports the statement that produced an error (DEBUG5-DEBUG1,LOG;NOTICE,WARNING,ERROR,FATAL,PANIC)
log_min_durations_statement additionally reports statements which tool longer that specified
log_checkpoints if on, logs checkpoints the server’s log file
log_connections logs each new database connection to the log file
log_disconnections logs each disconnection to the log file
log_duration logs the duration of every sql statement
log_hostname converts ip addresses to hostnames in the log file
log_line_prefix specifies the prefix for each line reported to the log ( various place holders available )
log_lock_waits if on, every process waiting longer than deadlock_timeout for a lock will be reported
log_statement specifies if and which sql statements will be written to the log file ( none, ddl, mod, all )
log_temp_files specifies if log entry will be written each time a temporary file gets deleted
log_timezone specifies the timezone for the log entries

as you can see, the dba is given much more control about logging than in oracle. it clearly depends on the database and application what should be logged. to start, this set should be appropriate:

export PARAMFILE=/opt/postgres/mydb/postgresql.conf
echo "###### logging settings" >> $PARAMFILE
echo "logging_collector=on" >> $PARAMFILE
echo "log_truncate_on_rotation=on" >> $PARAMFILE
echo "log_filename='postgresql-%a.log'" >> $PARAMFILE
echo "log_rotation_age='8d'" >> $PARAMFILE
echo "log_line_prefix='%m - %l - %p - %u@%d '" >> $PARAMFILE
echo "log_directory='/var/log/'" >> $PARAMFILE
echo "log_min_messages='WARNING'" >> $PARAMFILE
echo "log_autovacuum_min_duration=360s" >> $PARAMFILE
echo "log_error_verbosity=default" >> $PARAMFILE
echo "log_min_error_statement=ERROR" >> $PARAMFILE
echo "log_duration_statement=5min" >> $PARAMFILE
echo "log_checkpoints=on" >> $PARAMFILE
echo "log_statement=ddl" >> $PARAMFILE
echo "client_min_messages='WARNING'" >> $PARAMFILE

once having specified the log settings it is time to think about the memory requirements. compared to the oracle settings there are not too much parameters to specify here:

parameter description
shared_buffers controls the amount of shared memory available to the whole database cluster. the initial size on my box is 32M which is rather small.
temp_buffers controls the amount of buffers used for temporary tables _per_ session.
work_mem the amount of memory used for sort and hash operations per operation
maintenance_work_mem the amount of memory used for maintenance operations such as ACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY

although these settings strongly depend on the database and application requirements and the serves hardware this could be a good start:

echo "###### memory settings" >> $PARAMFILE
echo "shared_buffers=256MB" >> $PARAMFILE
echo "temp_buffers=16MB" >> $PARAMFILE
echo "work_mem=4MB" >> $PARAMFILE
echo "maintenance_work_mem=16MB" >> $PARAMFILE

the next point to think about is the wal ( write ahead log ). as the wal files are essential for consistency and a production system never should go without archived logs these settings are critical. postgresql offers various parameters for controlling this ( only the most important here ):

parameter description
fsync should always be on ( default ) as this controls that comitted transactions are guaranteed to be written to disk
wal_buffers size of the wal buffers inside the databases’ shared memory ( comparable to the log_buffer in oracle )
synchronous_commit if off, asynchronous writes to the wal files are enabled ( loss of transactions may occur, but no data inconsistency )
wal_writer_delay the time frame the wal writer process writes blocks to the wal files ( 200ms by default )
checkpoint_segments the amount of checkpoint segments ( typically 16MB each ) available: comparable to oracle’s amount of redo logs
checkpoint_timeout controls the frequency of checkpoints ( 5 seconds by default )
checkpoint_warning controls how frequent checkpoints may occur until a warning to the log will be written
checkpoint_completion_target controls how fast checkpoints should complete ( 0.0 => fastest, 1.0 => slowest, which means the whole period between to checkpoints )
full_page_writes should be on to enable that the whole pages will be written to disk after the first change after a checkpoint.
wal_level controls how much information is written to the wal files: minimal ( crash recovery ), archive ( wal based recovery ), hot_standby ( read only standby )
archive_mode archiving of the wal files: on/off
archive_command any command used to archive the wal files
archive_timeout controls how often wal archived should be saved
hot_standby enables read only standby ( active dataguard in oracle terms )
max_wal_senders controls the amount of standby databases this master can serve
wal_sender_delay controls how often data gets replicated ( default is 200ms )

a reasonable configuration to start with ( standby databases are not in scope here ) could be:

echo "###### wal settings" >> $PARAMFILE
echo "fsync=on" >> $PARAMFILE
echo "wal_buffers=16MB" >> $PARAMFILE
echo "synchronous_commit=on" >> $PARAMFILE
echo "wal_writer_delay=200ms" >> $PARAMFILE
echo "checkpoint_segments=16" >> $PARAMFILE
echo "checkpoint_timeout=300s" >> $PARAMFILE
echo "checkpoint_warning=30s" >> $PARAMFILE
echo "checkpoint_completion_target=0.9" >> $PARAMFILE
echo "full_page_writes=on" >> $PARAMFILE
echo "wal_level=archive" >> $PARAMFILE
echo "archive_mode=on" >> $PARAMFILE
echo "archive_command='test ! -f /opt/postgres/arch/%f && cp %p /opt/postgres/arch/%f'" >> $PARAMFILE
echo "archive_timeout=10min" >> $PARAMFILE

as the vacuum and analyze processes are such important there are parameters to control this ( the most important here ):

parameter description
autovacuum enables the autovaccum process launcher
autovacuum_max_workers controls how many autovacuum processes will be started
autovacuum_naptime controls the minimum delay between vacuum processes ( defaults to 1 minute )

adding them to the server’s parameter file:

echo "###### autovaccum settings" >> $PARAMFILE
echo "autovacuum=on" >> $PARAMFILE
echo "autovacuum_max_workers=3" >> $PARAMFILE
echo "autovacuum_naptime=5min" >> $PARAMFILE

one more parameter to specify is for loading the pg_stat_statements module from the contrib directory:

echo "###### pg_stat_statements" >> $PARAMFILE
echo "shared_preload_libraries='pg_stat_statements'" >> $PARAMFILE

keep in mind that this is only a set to start with, especially if you do not know how the application will behave. there are a bunch of more parameters which give you much more control over various aspects of the database. check the documentation for the complete reference.

No Comments

Be the first to start the conversation!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.