on the importance of choosing the right data type

January 7, 2013 — Leave a comment

when working with oracle there is exactly one numeric data type one may use for storing numbers ( and that’s number ). in postgresql there are ten. what happened when we migrated an oracle database to postgresql was, that a source table in oracle took much less space than the same table in postgresql. thanks to some people on the pgsql-performance mailing list one of the reasons for this was, that the wrong numeric data type was chosen for storing the integers in postgresql.

a simple test-case:

drop table datatype1;
drop table datatype2;
create table datatype1
( a numeric ( 8,0 )
, b numeric ( 8,0 )
);
insert into datatype1 ( a, b )
       values ( generate_series ( 1, 10000000 )
              , generate_series ( 1, 10000000 )
              );
create index idatatype1_1 on datatype1 ( a );
create index idatatype1_2 on datatype1 ( b );
create table datatype2
( a int
, b int
);
insert into datatype2 ( a, b )
       values ( generate_series ( 1, 10000000 )
              , generate_series ( 1, 10000000 )
             );
create index idatatype2_1 on datatype2 ( a );
create index idatatype2_2 on datatype2 ( b );
analyze verbose datatype1;
analyze verbose datatype2;
select pg_size_pretty ( pg_relation_size ( 'datatype1' ) );
 pg_size_pretty 
----------------
 422 MB
(1 row)
select pg_size_pretty ( pg_relation_size ( 'datatype2' ) );
 pg_size_pretty 
----------------
 346 MB
(1 row)

for these two little tables the difference is about 76mb. depending on the statements this is 76mb more that needs to be scanned and this can have great impacts on performance. surprisingly, at least for me, this is not true for the indexes:

select pg_size_pretty ( pg_relation_size ( 'idatatype1_1' ) );
 pg_size_pretty 
----------------
 214 MB
(1 row)
select pg_size_pretty ( pg_relation_size ( 'idatatype2_1' ) );
 pg_size_pretty 
----------------
 214 MB
(1 row)
select pg_size_pretty ( pg_relation_size ( 'idatatype2_1' ) );
 pg_size_pretty 
----------------
 214 MB
(1 row)
select pg_size_pretty ( pg_relation_size ( 'idatatype2_2' ) );
 pg_size_pretty 
----------------
 214 MB
(1 row)

so, it’s worth to keep an eye on which data types to use for numbers …

No Comments

Be the first to start the conversation!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.