when loading lots and lots of data into the database it is often better for performance to create the indexes and constraints after the load (given the loaded data is consistent). if there are some cpus available, doing this in parallel would be a nice option, so I started to look at what perl can do for me here. there is a little module on cpan called Parallel::ForkManager. below is a simple perl script which executes four sql scripts in parallel using this module. the example uses postgresql, but if you want to use it with oracle you just replace the psql call with sqlplus and you're done (a sketch of that follows right after the script).
#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;

my @sqlFiles;    # the set of sql scripts to execute
push ( @sqlFiles, 'scripts/01.sql' );
push ( @sqlFiles, 'scripts/02.sql' );
push ( @sqlFiles, 'scripts/03.sql' );
push ( @sqlFiles, 'scripts/04.sql' );

# set the amount of forks to the count of the sql script array
# do not set this greater than the number of cpus available
my $arraysize = @sqlFiles;
my $pm = Parallel::ForkManager->new($arraysize);

foreach my $file ( @sqlFiles ) {
    $pm->start and next;              # do the fork
    my $result = `psql -f $file`;
    print "result: $result\n";
    $pm->finish;                      # do the exit in the child process
}
# wait until all children have finished before the parent exits
$pm->wait_all_children;
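for the oracle case mentioned at the beginning only the backtick call changes. a minimal sketch, assuming the connect string lives in an environment variable called ORACLE_CONNECT (that variable name is just an example):

# hypothetical oracle variant of the execute line; ORACLE_CONNECT is an
# assumed environment variable holding something like user/password@db.
# the @ in front of the file name has to be escaped so perl does not
# try to interpolate it as an array
my $result = `sqlplus -s $ENV{ORACLE_CONNECT} \@$file`;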
the sql scripts in this case just create an index on each column of a test table:
$ cat scripts/01.sql select pg_backend_pid(); create index i1 on t1(a); $ cat scripts/02.sql select pg_backend_pid(); create index i2 on t1(b); $ cat scripts/03.sql select pg_backend_pid(); create index i3 on t1(c); $ cat scripts/04.sql select pg_backend_pid(); create index i4 on t1(d);
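if you want to try this out, here is a minimal sketch of the setup the scripts expect: a table t1 with the four columns a to d. it uses DBI with the DBD::Pg driver; the database name and the one million rows are just assumptions to give the index builds something to do:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# connect to an assumed database called test; adjust dsn and credentials
my $dbh = DBI->connect( 'dbi:Pg:dbname=test', '', '', { RaiseError => 1 } );
$dbh->do( 'create table t1 ( a int, b int, c int, d int )' );
# fill the table with some generated rows so the index builds take a while
$dbh->do( 'insert into t1
             select i, i, i, i
               from generate_series(1,1000000) as s(i)' );
$dbh->disconnect;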
of course the scripts can do anything else, not just create indexes.
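one thing the simple script above does not do is check whether the children succeeded. Parallel::ForkManager can report that back to the parent through its run_on_finish callback; a sketch of how the loop could be extended (the exit code handed to finish is the one psql returned):

# register the callback in the parent before the loop starts
$pm->run_on_finish( sub {
    my ( $pid, $exit_code ) = @_;
    print "child $pid finished with exit code $exit_code\n";
} );

foreach my $file ( @sqlFiles ) {
    $pm->start and next;              # do the fork
    my $result = `psql -f $file`;
    # $? holds the status of the last backtick call; shift right by
    # eight bits to get the real exit code and hand it to the parent
    $pm->finish( $? >> 8 );
}
$pm->wait_all_children;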