Archives For quick&dirty

as json as a datatype for databases is becoming more and more popular, here's a quick example of how to convert a traditional table design to tables containing jsonb in postgres:

-- traditional design
drop schema if exists traditional cascade;
create schema traditional;
create table traditional.customers ( id serial
                                   , name varchar(20)
                                   , active boolean
                                   , country varchar(2)
                                   , phone varchar(20)
                                   , email varchar(50)
                                   );
alter table traditional.customers add constraint customers_pk primary key(id);

create table traditional.orders ( id serial
                                , customer_id int
                                , order_date timestamp with time zone
                                , delivery_date timestamp with time zone
                                );
alter table traditional.orders add constraint orders_pk 
    primary key(id);
alter table traditional.orders add constraint orders_ref_customers 
    foreign key (customer_id) 
    references traditional.customers(id);

DO
$$
BEGIN
   for i in 1..100 loop
     insert into traditional.customers ( name,active,country
                                       , phone,email )
            values ( 'name'||i, case when mod(i,5) = 0 then true else false end
                   , case when mod(i,3) = 0 then 'CH' else 'DE' end
                   , i, i||'@'||i||'.com' );
            for e in 1..10 loop
              insert into traditional.orders ( customer_id, order_date
                                             , delivery_date )
                     values (i, current_timestamp + interval '5 days'
                            , current_timestamp + interval '7 days' );
            end loop;
   end loop;
END
$$;

-- json design 
drop schema if exists jsonschema cascade;
create schema jsonschema;
create table jsonschema.customers ( id serial
                                  , customer_data jsonb
                                  );
alter table jsonschema.customers add constraint customers_pk 
      primary key(id);

create table jsonschema.orders ( id serial
                               , customer_id int
                               , order_data jsonb
                               );
alter table jsonschema.orders add constraint orders_pk primary key(id);
alter table jsonschema.orders add constraint orders_ref_customers 
     foreign key (customer_id) 
     references jsonschema.customers(id);

insert into jsonschema.customers ( customer_data )
        ( select row_to_json(cust)::jsonb 
            from ( select name, active, country, phone, email
                     from traditional.customers 
                    -- keep the id order so the generated serial ids
                    -- line up with the ids referenced by the orders
                    order by id
                 ) cust 
        );

with tt (id,order_date,delivery_date) as (
  select id,order_date, delivery_date
    from traditional.orders
  order by id )
,    dd (id,customer_id) as (
  select id, customer_id 
    from traditional.orders
  order by id )
insert into jsonschema.orders ( customer_id, order_data )
   select dd.customer_id 
        , row_to_json(tt)::jsonb 
     from dd, tt
    where dd.id = tt.id;

comparing the different approaches:

postgres=# select * from traditional.customers limit 2;
 id | name  | active | country | phone |  email  
----+-------+--------+---------+-------+---------
  1 | name1 | f      | DE      | 1     | 1@1.com
  2 | name2 | f      | DE      | 2     | 2@2.com
(2 rows)

postgres=# select * from jsonschema.customers limit 2;
 id |                                     customer_data                                     
----+---------------------------------------------------------------------------------------
  1 | {"name": "name1", "email": "1@1.com", "phone": "1", "active": false, "country": "DE"}
  2 | {"name": "name2", "email": "2@2.com", "phone": "2", "active": false, "country": "DE"}

when loading lots and lots of data into a database it is often better for performance to create the indexes and constraints after the load ( given the loaded data is consistent ). if there are some cpus available, doing this in parallel is a nice option, so I looked at what perl can do for me in this case. there is a little module available on cpan called Parallel::ForkManager. below is a simple perl script which executes four scripts in parallel using this module. this example uses postgresql, but you may replace the psql call with sqlplus and you're done if you want to use it with oracle.

#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;

# the set of sql-scripts to execute
my @sqlFiles = ( 'scripts/01.sql'
               , 'scripts/02.sql'
               , 'scripts/03.sql'
               , 'scripts/04.sql'
               );
# set the number of forks to the count of the sql script array
# do not set this greater than the number of CPUs available
my $pm = Parallel::ForkManager->new( scalar @sqlFiles );
foreach my $file ( @sqlFiles ) {
  $pm->start and next;     # do the fork
  my $result = `psql -f $file`;
  print "result: $result \n";
  $pm->finish;             # do the exit in the child process
}
# wait for all children to finish before the parent exits
$pm->wait_all_children;

the sql scripts in this case just create an index on each column of a test table:

$ cat scripts/01.sql 
select pg_backend_pid();
create index i1 on t1(a);
$ cat scripts/02.sql 
select pg_backend_pid();
create index i2 on t1(b);
$ cat scripts/03.sql 
select pg_backend_pid();
create index i3 on t1(c);
$ cat scripts/04.sql 
select pg_backend_pid();
create index i4 on t1(d);

of course the scripts can do anything else, not just create indexes.
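
if you want to know whether each script succeeded, Parallel::ForkManager can report the child's exit code back to the parent. a minimal sketch using the module's run_on_finish hook ( same scripts as above ):

#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;

my @sqlFiles = ( 'scripts/01.sql', 'scripts/02.sql'
               , 'scripts/03.sql', 'scripts/04.sql' );
my $pm = Parallel::ForkManager->new( scalar @sqlFiles );
# the callback runs in the parent whenever a child exits
$pm->run_on_finish( sub {
  my ( $pid, $exit_code, $ident ) = @_;
  print "$ident finished with exit code $exit_code\n";
} );
foreach my $file ( @sqlFiles ) {
  $pm->start($file) and next;   # pass the file name as identifier
  system( 'psql', '-f', $file );
  $pm->finish( $? >> 8 );       # hand psql's exit code to the parent
}
$pm->wait_all_children;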

jumping from one host to another because the destination host is not reachable directly can be annoying and time consuming. if you use ssh to connect to your hosts ( and you probably should ) there is an easy way to simplify this.

let’s say you want to jump to host1, from host1 to host2 and from host2 to host3.

here’s the trick ( thanks to a colleague of mine ):

ssh -t -l [USER_ID] -A [HOST1] ssh -t -l [USER_ID] -A [HOST2] ssh -t -l [USER_ID] -A [HOST3]

define an alias for this:

alias my_jump='ssh -t -l [USER_ID] -A [HOST1] ssh -t -l [USER_ID] -A [HOST2] ssh -t -l [USER_ID] -A [HOST3]'

… and it’s easy going:

my_jump
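
by the way: if your openssh client is recent enough ( 7.3 or later ) the -J / ProxyJump option does the chaining in one go. the same jump would look like this:

ssh -J [USER_ID]@[HOST1],[USER_ID]@[HOST2] -l [USER_ID] [HOST3]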

we are currently in the process of moving an oracle data warehouse to postgresql and had some fun with how oracle handles dates in contrast to postgresql. in oracle, if you subtract one date from another you'll get a number:

select to_date('05.01.2012','dd.mm.yyyy') - to_date ( '01.01.2012','dd.mm.yyyy') diff from dual;
      DIFF
----------
         4

if you do the same in postgresql you’ll get an interval:

select to_date('05.01.2012','dd.mm.yyyy') - to_date ( '01.01.2012','dd.mm.yyyy') diff;
  diff
--------
 4 days
(1 row)

the issue was that a view used this expression in a where clause and did some calculations based on its result. obviously this failed when we created the view in postgresql. the trick was to use the to_char function on the interval:

select to_char ( to_date('05.01.2012','dd.mm.yyyy') - to_date ( '01.01.2012','dd.mm.yyyy'), 'DD' ) diff;                                      
 diff 
------
 04
(1 row)

… we thought :) but the result still cannot be used for calculations:

select to_char ( to_date('05.01.2012','dd.mm.yyyy') - to_date ( '01.01.2012','dd.mm.yyyy'), 'DD' ) + 8;
ERROR:  operator does not exist: text + integer
LINE 1: ...m.yyyy') - to_date ( '01.01.2012','dd.mm.yyyy'), 'DD' ) + 8;
                                                                   ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

another conversion was necessary to make this work:

select cast ( to_char ( to_date('05.01.2012','dd.mm.yyyy') - to_date ( '01.01.2012','dd.mm.yyyy'), 'DD' ) as int ) + 8;
 ?column?
----------
       12
(1 row)

lessons learned ? oracle does implicit type conversions silently in the background. postgresql does not …

another example:

oracle:

SQL> create table t ( a number );
Table created.
SQL> insert into t values ( 20120101 );
1 row created.
SQL>  select to_date(a,'yyyymmdd') from t;
TO_DATE(A,'YYYYMMDD'
--------------------
01-JAN-2012 00:00:00

postgresql:

create table t ( a numeric );
CREATE TABLE
insert into t values ( 20120101 );
INSERT 0 1
select to_date(a,'yyyymmdd') from t;
ERROR:  function to_date(numeric, unknown) does not exist
LINE 1: select to_date(a,'yyyymmdd') from t;
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
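
following the hint and adding the explicit cast yourself makes it work:

select to_date(a::text,'yyyymmdd') from t;
  to_date   
------------
 2012-01-01
(1 row)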

did you know ( I didn’t ) that you can provide a star (*) when defining a number column ?

SQL> create table t1 ( a number(*), b number(*,10) );
Table created.
SQL> desc t1;
 Name        Null?    Type
 ----------- -------- ----------------------------
 A                    NUMBER
 B                    NUMBER(38,10)

ever wanted to pass a value of type interval to a procedure ? not a big deal, you might think:

create or replace procedure p1 ( a interval day to second ) is
begin
  null;
end;
/
exec p1 ( a => interval '10 00:00:00' day to second );

works as expected. but what if you want to pass an interval with a leading precision of more than two digits ?

exec p1 ( a => interval '101 00:00:00' day to second );
BEGIN p1 ( a => interval '101 00:00:00' day to second ); END;
*
ERROR at line 1:
ORA-01873: the leading precision of the interval is too small
ORA-06512: at line 1

should work, shouldn’t it ?
let’s try to specify the precision in the procedure’s declaration:

create or replace procedure p1 ( a interval day(3) to second ) is
begin
  null;
end;
/
Warning: Procedure created with compilation errors.
SQL> show errors;
Errors for PROCEDURE P1:
LINE/COL ERROR
-------- -----------------------------------------------------------------
1/30	 PLS-00103: Encountered the symbol "(" when expecting one of the
	 following:
	 to

ok, it is not possible to specify this in the declaration. can we pass the precision to the procedure along with the parameter ?

exec p1 ( a => interval '101 00:00:00' day(3) to second );
BEGIN p1 ( a => interval '101 00:00:00' day(3) to second ); END;
                                           *
ERROR at line 1:
ORA-06550: line 1, column 44:
PLS-00103: Encountered the symbol "(" when expecting one of the following:
) , * & = - +  at in is mod remainder not rem
  or != or ~= >= <=  and or like like2
like4 likec between to || multiset member submultiset

syntax error, again. what's wrong ? the documentation says that the default leading precision is 2, but it can be any value between 0 and 9. so this must be possible somehow.

the solution is either to move the procedure to a package and to declare a subtype:

create or replace package ptest
is
   subtype my_interval is interval day(3) to second;
   procedure p1 ( v_interval IN my_interval );
end;
/
create or replace package body ptest
is
   procedure p1 ( v_interval IN my_interval ) 
   is
   begin
     null;
   end p1;
end;
/
SQL> exec ptest.p1 ( interval '230 23:0:0.1' day to second );
PL/SQL procedure successfully completed.
SQL>  exec ptest.p1 ( interval '23 23:0:0.1' day to second );
PL/SQL procedure successfully completed.

… or to reference the column of a table:

SQL> create table t1 ( a interval day(3) to second );
Table created.
SQL> create or replace procedure p1 ( a t1.a%type ) is begin null; end;
  2  /
Procedure created.
SQL> exec p1 ( a => interval '101 00:00:00' day to second);
PL/SQL procedure successfully completed.
SQL> exec p1 ( a => interval '10 00:00:00' day to second);
PL/SQL procedure successfully completed.
SQL> 

you did your dataguard setup and everything seems to be configured correctly. the broker configuration is fine, you can cross connect to each instance as sysdba, but when performing a switchover through dgmgrl you receive:

ORA-01017: invalid username/password; logon denied

… and you even provided the sys password while connecting with dgmgrl:

dgmgrl sys/password@somedatabase

what went wrong ? I just had this case and it took me an hour to figure it out: the sys password contained special characters. what a …. maybe this is mentioned somewhere in the documentation.

a quick way to test it:

if:

sqlplus sys/"somepasswordwithspecialcharacters"@somedb as sysdba

works, but the same connect without the quotes does not:

sqlplus sys/somepasswordwithspecialcharacters@somedb as sysdba

… try to change the password to something without special characters and you will succeed ( remember to change the password in the password file, too ) ….
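
a sketch of one way out ( DB112 and Manager2012 are just placeholders for your SID and for a password without special characters; remember to copy the new password file to the standby site(s), too ):

SQL> alter user sys identified by Manager2012;
$ orapwd file=$ORACLE_HOME/dbs/orapwDB112 password=Manager2012 force=y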

… yes, you can. if you use named licensing or want to enforce that no more than n users are created in the database, you can set the license_max_users parameter.

alter system set license_max_users=10 scope=both;

but be careful, this includes the oracle internal users:

SELECT count(username)
  FROM dba_users;
COUNT(USERNAME)
---------------
              9

if you want to create two new users now, you will hit the limit:

SYS@DB112> CREATE USER A IDENTIFIED BY A;
User created.
SYS@DB112> CREATE USER B IDENTIFIED BY B;
CREATE USER B IDENTIFIED BY B
                            *
ERROR at line 1:
ORA-01985: cannot create user as LICENSE_MAX_USERS parameter exceeded
SYS@DB112> 

of course this is only valid if you do not share usernames …
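
by the way, the v$license view shows the limit currently in effect:

SQL> select users_max from v$license;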

just noticed that there is a little utility one can use for listing the operating system resources ( semaphores, shared memory ) an oracle instance uses on a system. the nice thing about this utility is that you can use it even when the instance is down or crashed. this may be useful if some of the resources were not cleaned up and you need to identify which ones you may/must remove.

the documentation says that this utility is available from 8i onwards.

$ORACLE_HOME/bin/sysresv --help
/opt/oracle/product/base/11.2.0.3/bin/sysresv: invalid option -- -
usage	: sysresv [-if] [-d ] [-l sid1  ...]
	  -i : Prompt before removing ipc resources for each sid
	  -f : Remove ipc resources silently, oevrrides -i option
	  -d : List ipc resources for each sid if on
	  -l sid1  .. : apply sysresv to each sid
Default	: sysresv -d on -l $ORACLE_SID
Note	: ipc resources will be attempted to be deleted for a
	  sid only if there is no currently running instance
	  with that sid.

$ORACLE_HOME/bin/sysresv
IPC Resources for ORACLE_SID "DB112" :
Shared Memory:
ID         KEY
2621449    0x00000000
2654218    0x00000000
2686987    0x3393b3a4
Semaphores:
ID         KEY
262146     0x710dfe10
Oracle Instance alive for sid "DB112"
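
if the instance is really gone and resources were left behind, the ids reported above can be passed to the standard ipcrm utility ( only do this if you are sure no running instance uses them ):

$ ipcrm -m 2621449   # remove a shared memory segment by id
$ ipcrm -s 262146    # remove a semaphore set by id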

as I frequently use the oracle documentation I looked for a quick way to search it without clicking around too much. as I am on linux this was quite easy:

echo "tahiti() {
  firefox http://www.oracle.com/pls/db112/search?remark=quick_search\&word=$1 &
}
alias th='tahiti'" >> ~/.bashrc

that’s it ( remember to source the file or to logout/login before this works ). now I am able to search from the command line:

th dbms_stats

of course you can do the same if you use cygwin on windows. happy searching…

ps: thanks to chinmaykamat’s post, which gave me the hint …