linux and its processes

April 4, 2012

as mentioned in earlier posts, as a dba you need to know how the operating system works. this post is an introduction to processes on linux.

the definition of a process is: a process is an instance of a program in execution.
to manage a process the linux kernel must know a lot of things about it, e.g. which files the process is allowed to handle, whether it is running on a CPU or blocked, the address space of the process etc. all this information is present in the so-called process descriptor. you can think of the process descriptor as a structure containing all the information the kernel needs to know about the process ( internally the structure is called: task_struct ). one piece of information stored in the process descriptor is the process id, which is used to identify the process.

let’s take a look at the processes that make up the oracle database:

ps -ef | grep $ORACLE_SID | egrep -v "DESCRIPTION|grep|tnslsnr" 
oracle    2944     1  0 08:30 ?        00:00:03 ora_pmon_DB112
oracle    2946     1  0 08:30 ?        00:00:06 ora_psp0_DB112
oracle    2948     1  2 08:30 ?        00:01:11 ora_vktm_DB112
oracle    2952     1  0 08:30 ?        00:00:02 ora_gen0_DB112
oracle    2954     1  0 08:30 ?        00:00:03 ora_diag_DB112
oracle    2956     1  0 08:30 ?        00:00:02 ora_dbrm_DB112
oracle    2958     1  0 08:30 ?        00:00:06 ora_dia0_DB112
oracle    2961     1  0 08:30 ?        00:00:02 ora_mman_DB112
oracle    2963     1  0 08:30 ?        00:00:03 ora_dbw0_DB112
oracle    2965     1  0 08:30 ?        00:00:03 ora_lgwr_DB112
oracle    2967     1  0 08:30 ?        00:00:06 ora_ckpt_DB112
oracle    2969     1  0 08:30 ?        00:00:01 ora_smon_DB112
oracle    2971     1  0 08:30 ?        00:00:00 ora_reco_DB112
oracle    2973     1  0 08:30 ?        00:00:02 ora_rbal_DB112
oracle    2975     1  0 08:30 ?        00:00:01 ora_asmb_DB112
oracle    2977     1  0 08:30 ?        00:00:05 ora_mmon_DB112
oracle    2979     1  0 08:30 ?        00:00:09 ora_mmnl_DB112
oracle    2987     1  0 08:30 ?        00:00:04 ora_mark_DB112
oracle    3013     1  0 08:30 ?        00:00:00 ora_qmnc_DB112
oracle    3054     1  0 08:30 ?        00:00:01 ora_q000_DB112
oracle    3056     1  0 08:30 ?        00:00:00 ora_q001_DB112
oracle    3182     1  0 08:35 ?        00:00:02 ora_smco_DB112
oracle    3359     1  0 09:05 ?        00:00:00 ora_w000_DB112

notice that the grep command excluded the listener process and all the current connections to the database.
if you want to check the local connections to the database from the os, you can do something like this:

ps -ef | grep $ORACLE_SID | grep "LOCAL=YES"
oracle    2933     1  0 10:48 ?        00:00:01 oracleDB112 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    2969     1  0 10:48 ?        00:00:00 oracleDB112 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    2979     1  0 10:48 ?        00:00:00 oracleDB112 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    3088  3087  0 10:50 ?        00:00:00 oracleDB112 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    3096     1  0 10:50 ?        00:00:00 oracleDB112 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

checking the processes from inside the database would be as simple as this ( for the background processes ):

SQL> select pname from v$process where pname is not null;
PNAME
-----
PMON
PSP0
VKTM
GEN0
DIAG
DBRM
DIA0
MMAN
DBW0
LGWR
CKPT
SMON
RECO
RBAL
ASMB
MMON
MMNL
MARK
SMCO
W000
QMNC
Q000
Q001

with the above arguments ( -ef ) supplied to the ps command, the columns displayed are:

  • the os-user the process runs under
  • the process id
  • the parent process id
  • processor utilization
  • start time of the process
  • the terminal the process was started on ( if any )
  • the cumulative CPU time
  • the command
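
to make the column mapping above concrete, here is a small sketch that labels the fields of a single line of ps -ef output. the function name describe_ps_line is made up for this illustration, and the sketch assumes the first seven columns contain no spaces ( which holds for the output shown above ):

```shell
# label the eight columns of one line of `ps -ef` output.
# describe_ps_line is a hypothetical helper, not part of ps.
describe_ps_line() {
  printf '%s\n' "$1" | awk '{
    # everything from column 8 onwards belongs to the command
    cmd = $8
    for (i = 9; i <= NF; i++) cmd = cmd " " $i
    printf "user=%s pid=%s ppid=%s cpu=%s stime=%s tty=%s time=%s cmd=%s\n",
           $1, $2, $3, $4, $5, $6, $7, cmd
  }'
}

describe_ps_line "oracle    2969     1  0 08:30 ?        00:00:01 ora_smon_DB112"
```

you could feed it any single line, e.g. the smon line from the listing above.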

but where does the ps command get the information to display from ? in fact you can get all of the information displayed above without using the ps command. all you need to do is to check the pseudo filesystem /proc ( it is called a pseudo filesystem because it is a virtual filesystem that maps to the kernel structures ).

if you do a “ls” on the proc filesystem you’ll see a lot of directories and files. for this post we will concentrate on the numbered directories which map to process ids.
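
as a rough sketch of how the numbered directories map to process ids, the following function prints every purely numeric directory name below /proc. list_pids is a name chosen for this example, and the optional argument only exists so the function can be pointed at another directory:

```shell
# print the names of all purely numeric directories, one per line.
# list_pids is a hypothetical helper; the directory defaults to /proc.
list_pids() {
  procdir="${1:-/proc}"
  for d in "$procdir"/[0-9]*; do
    [ -d "$d" ] || continue            # skip plain files and unmatched globs
    name="${d##*/}"                    # strip the leading path
    case "$name" in
      *[!0-9]*) continue ;;            # skip names that are not all digits
    esac
    echo "$name"
  done | sort -n
}
```

calling list_pids without arguments should give roughly the same process ids as the second column of ps -ef.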

let’s take smon as an example, which is the oracle system monitor ( you will need to adjust the process-id for your environment ):

ls -la /proc/2969/
dr-xr-xr-x   6 oracle asmadmin 0 Apr  3 13:40 .
dr-xr-xr-x 154 root   root     0 Apr  3  2012 ..
dr-xr-xr-x   2 oracle asmadmin 0 Apr  3 14:07 attr
-r--------   1 root   root     0 Apr  3 14:07 auxv
-r--r--r--   1 root   root     0 Apr  3 13:45 cmdline
-rw-r--r--   1 root   root     0 Apr  3 14:07 coredump_filter
-r--r--r--   1 root   root     0 Apr  3 14:07 cpuset
lrwxrwxrwx   1 root   root     0 Apr  3 14:07 cwd -> /opt/oracle/product/base/11.2.0.3/dbs
-r--------   1 root   root     0 Apr  3 14:07 environ
lrwxrwxrwx   1 root   root     0 Apr  3 14:07 exe -> /opt/oracle/product/base/11.2.0.3/bin/oracle
dr-x------   2 root   root     0 Apr  3 13:40 fd
dr-x------   2 root   root     0 Apr  3 14:07 fdinfo
-r--------   1 root   root     0 Apr  3 14:07 io
-r--r--r--   1 root   root     0 Apr  3 14:07 limits
-rw-r--r--   1 root   root     0 Apr  3 14:07 loginuid
-r--r--r--   1 root   root     0 Apr  3 13:40 maps
-rw-------   1 root   root     0 Apr  3 14:07 mem
-r--r--r--   1 root   root     0 Apr  3 14:07 mounts
-r--------   1 root   root     0 Apr  3 14:07 mountstats
-r--r--r--   1 root   root     0 Apr  3 14:07 numa_maps
-rw-r--r--   1 root   root     0 Apr  3 14:07 oom_adj
-r--r--r--   1 root   root     0 Apr  3 14:07 oom_score
lrwxrwxrwx   1 root   root     0 Apr  3 14:07 root -> /
-r--r--r--   1 root   root     0 Apr  3 14:07 schedstat
-r--r--r--   1 root   root     0 Apr  3 14:07 smaps
-r--r--r--   1 root   root     0 Apr  3 13:40 stat
-r--r--r--   1 root   root     0 Apr  3 14:07 statm
-r--r--r--   1 root   root     0 Apr  3 13:45 status
dr-xr-xr-x   3 oracle asmadmin 0 Apr  3 14:07 task
-r--r--r--   1 root   root     0 Apr  3 14:07 wchan

what do we see here? lots and lots of information about the smon process. for a detailed description of what all the files and directories are about, you can go to the man-pages:

man proc

for example, if we take a look at the statm file of the process:

cat /proc/2969/statm
126385 16013 14717 45859 0 994 0

… and check the man pages for the meaning of the numbers, things are getting clearer:

/proc/[number]/statm
      Provides information about memory status in pages.  The columns are:
               size       total program size
               resident   resident set size
               share      shared pages
               text       text (code)
               lib        library
               data       data/stack
               dt         dirty pages (unused in Linux 2.6)
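
because statm reports pages, the numbers only become meaningful once they are multiplied by the page size. a minimal sketch, assuming the first three columns are size, resident and share as described in the man page excerpt above ( statm_to_kb is a made-up name, and the page size of 4096 bytes in the example call is an assumption, use getconf PAGESIZE to check yours ):

```shell
# convert the first three statm columns from pages to kB.
# $1: contents of a statm file
# $2: page size in bytes (optional, defaults to `getconf PAGESIZE`)
statm_to_kb() {
  pagesize="${2:-$(getconf PAGESIZE)}"
  set -- $1                            # split the statm line into $1..$7
  echo "size=$(( $1 * pagesize / 1024 ))kB resident=$(( $2 * pagesize / 1024 ))kB shared=$(( $3 * pagesize / 1024 ))kB"
}

# the smon values from above, assuming 4096 byte pages
statm_to_kb "126385 16013 14717 45859 0 994 0" 4096
# prints: size=505540kB resident=64052kB shared=58868kB
```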

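want to know the environment of the process? just take a look at the environ file ( the entries are separated by null bytes instead of newlines, which is why they run together in the output and the shell prompt ends up on the same line ):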

cat /proc/2969/environ 
__CLSAGFW_TYPE_NAME=ora.listener.typeORA_CRS_HOME=/opt/oracle/product/crs/11.2.0.3SELINUX_INIT=YESCONSOLE=/dev/consoleTERM=linuxSHELL=/bin/bash__CRSD_CONNECT_STR=(ADDRESS=(PROTOCOL=IPC)(KEY=OHASD_IPC_SOCKET_11))NLS_LANG=AMERICAN_AMERICA.AL32UTF8CRF_HOME=/opt/oracle/product/crs/11.2.0.3GIPCD_PASSTHROUGH=false__CRSD_AGENT_NAME=/opt/oracle/product/crs/11.2.0.3/bin/oraagent_grid__CRSD_MSG_FRAME_VERSION=2USER=gridINIT_VERSION=sysvinit-2.86__CLSAGENT_INCARNATION=2ORASYM=/opt/oracle/product/crs/11.2.0.3/bin/oraagent.binPATH=RUNLEVEL=3runlevel=3PWD=/ENV_FILE=/opt/oracle/product/crs/11.2.0.3/crs/install/s_crsconfig_oracleplayground_env.txtLANG=en_US.UTF-8TZ=Europe/Zurich__IS_HASD_AGENT=TRUEPREVLEVEL=Nprevious=N__CLSAGENT_LOG_NAME=ora.listener.type_gridHOME=/home/gridSHLVL=3__CLSAGENT_LOGDIR_NAME=ohasdLD_ASSUME_KERNEL=__CLSAGENT_USER_NAME=gridLOGNAME=gridORACLE_HOME=/opt/oracle/product/base/11.2.0.3ORACLE_SID=DB112ORA_NET2_DESC=34,37ORACLE_SPAWNED_PROCESS=1SKGP_SPAWN_DIAG_PRE_FORK_TS=1333453218SKGP_SPAWN_DIAG_POST_FORK_TS=1333453218SKGP_HIDDEN_ARGS=0SKGP_SPAWN_DIAG_PRE_EXEC_TS=1333453218[root@oracleplayground 2642]# 
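
the output above is hard to read because the entries in environ are separated by null bytes. a small sketch that prints one variable per line ( show_environ is a name invented for this example ):

```shell
# print an environ file with one variable per line.
# $1: path to the file, e.g. /proc/2969/environ
show_environ() {
  tr '\0' '\n' < "$1"                  # replace every null byte with a newline
}
```

for the smon process from this post that would be show_environ /proc/2969/environ, run as root or as the owner of the process since the file is only readable by them.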

… which files were opened by the process? take a look at the fd directory:

ls -la fd/
total 0
dr-x------ 2 root   root      0 Apr  3 13:40 .
dr-xr-xr-x 6 oracle asmadmin  0 Apr  3 13:40 ..
lr-x------ 1 root   root     64 Apr  3 16:23 0 -> /dev/null
l-wx------ 1 root   root     64 Apr  3 16:23 1 -> /dev/null
lr-x------ 1 root   root     64 Apr  3 16:23 10 -> /dev/null
lr-x------ 1 root   root     64 Apr  3 16:23 11 -> /dev/null
lr-x------ 1 root   root     64 Apr  3 16:23 12 -> /dev/null
lrwx------ 1 root   root     64 Apr  3 16:23 13 -> /opt/oracle/product/base/11.2.0.3/dbs/hc_DB112.dat
lr-x------ 1 root   root     64 Apr  3 16:23 14 -> /dev/null
lr-x------ 1 root   root     64 Apr  3 16:23 15 -> /dev/null
lr-x------ 1 root   root     64 Apr  3 16:23 16 -> /dev/zero
lr-x------ 1 root   root     64 Apr  3 16:23 17 -> /dev/zero
lrwx------ 1 root   root     64 Apr  3 16:23 18 -> /opt/oracle/product/base/11.2.0.3/dbs/hc_DB112.dat
lr-x------ 1 root   root     64 Apr  3 16:23 19 -> /opt/oracle/product/base/11.2.0.3/rdbms/mesg/oraus.msb
l-wx------ 1 root   root     64 Apr  3 16:23 2 -> /dev/null
lr-x------ 1 root   root     64 Apr  3 16:23 20 -> /proc/2875/fd
lr-x------ 1 root   root     64 Apr  3 16:23 21 -> /opt/oracle/product/crs/11.2.0.3/dbs/hc_+ASM.dat
lr-x------ 1 root   root     64 Apr  3 16:23 22 -> /dev/zero
lrwx------ 1 root   root     64 Apr  3 16:23 23 -> /opt/oracle/product/base/11.2.0.3/dbs/hc_DB112.dat
lrwx------ 1 root   root     64 Apr  3 16:23 24 -> /opt/oracle/product/base/11.2.0.3/dbs/lkDB112
lr-x------ 1 root   root     64 Apr  3 16:23 25 -> /opt/oracle/product/base/11.2.0.3/rdbms/mesg/oraus.msb
lrwx------ 1 root   root     64 Apr  3 16:23 256 -> /dev/sda1
lrwx------ 1 root   root     64 Apr  3 16:23 3 -> /opt/oracle/product/crs/11.2.0.3/log/oracleplayground/agent/ohasd/oraagent_grid/oraagent_gridOUT.log
l-wx------ 1 root   root     64 Apr  3 16:23 4 -> /opt/oracle/product/crs/11.2.0.3/log/oracleplayground/agent/ohasd/oraagent_grid/oraagent_grid.l01
lr-x------ 1 root   root     64 Apr  3 16:23 5 -> /dev/null
lrwx------ 1 root   root     64 Apr  3 16:23 6 -> socket:[7791]
lrwx------ 1 root   root     64 Apr  3 16:23 7 -> socket:[7792]
lrwx------ 1 root   root     64 Apr  3 16:23 8 -> socket:[7793]
lrwx------ 1 root   root     64 Apr  3 16:23 9 -> socket:[7794]
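
every entry in the fd directory is a symbolic link named after the file descriptor number and pointing to the opened file. if you only care about that mapping, a sketch like the following drops the permission and date columns ( list_fds is a hypothetical helper that takes the path of an fd directory ):

```shell
# print "descriptor -> target" for every symbolic link in an fd directory.
# $1: path to the directory, e.g. /proc/2969/fd
list_fds() {
  fddir="$1"
  for f in "$fddir"/*; do
    [ -L "$f" ] || continue            # only symbolic links are descriptors
    echo "${f##*/} -> $(readlink "$f")"
  done
}
```

as with ls, you need to be root or the owner of the process to read the directory.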

conclusion: it’s really worth reading the man pages and understanding the /proc/[PID] structures. this can be a very good starting point if you have trouble with one of the processes running on your system.

and last but not least: maybe you don’t believe that the ps command is reading the /proc/[PID] structures to display its information. you can always trace the command and check what’s happening behind the scenes:

strace -o strace.log ps -ef

this will write the strace output to a file named strace.log. grep for your smon process and check which files were read:

grep 2969 strace.log
stat("/proc/2969", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/2969/stat", O_RDONLY)       = 6
read(6, "2969 (oracle) S 1 2921 2921 0 -1"..., 1023) = 191
open("/proc/2969/status", O_RDONLY)     = 6
open("/proc/2969/cmdline", O_RDONLY)    = 6
write(1, "oracle    2969     1  0 10:48 ? "..., 63) = 63

here we go: a subset of the same files listed above:

/proc/2969/stat
/proc/2969/status
/proc/2969/cmdline
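
to illustrate that these files are enough to build a ps-like line yourself, here is a rough sketch. it is not how ps is implemented: it takes the parent process id and the user id from the PPid: and Uid: lines of the status file ( both are described in man proc ) and the command from cmdline, and it prints the numeric user id instead of the user name. proc_summary and its optional second argument are made up for this example:

```shell
# build a short ps-like summary for one process from its /proc files.
# $1: process id
# $2: proc directory (optional, defaults to /proc)
proc_summary() {
  pid="$1"
  procdir="${2:-/proc}"
  # second field of the PPid: and Uid: lines in the status file
  ppid=$(awk '/^PPid:/ { print $2 }' "$procdir/$pid/status")
  uid=$(awk '/^Uid:/ { print $2 }' "$procdir/$pid/status")
  # cmdline separates its arguments with null bytes
  cmd=$(tr '\0' ' ' < "$procdir/$pid/cmdline")
  echo "uid=$uid pid=$pid ppid=$ppid cmd=${cmd% }"
}
```

proc_summary 2969 would then print the user id, process id, parent process id and command of the smon process from this post.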

happy processing …
