sgi 2000 and sgi 3000

The IPRC has two sgi 2000 servers and one sgi 3000 server, each with 32 cpus. One of the sgi 2000s has 250 MHz cpus with a peak speed of 500 Mflops per cpu; the other has 300 MHz cpus with a peak speed of 600 Mflops per cpu. The sgi 3000 has 400 MHz cpus with a peak speed of 800 Mflops per cpu.
The operating system is IRIX, sgi's version of Unix.

Autoparallelization

Parallelizing a code may speed up its execution significantly. It can be done explicitly, for instance with OpenMP directives in the code or with calls to the MPI library. The easiest way is to use the autoparallelizer provided by sgi. Compile your Fortran code using either of

f90 -O3 -apo mycode.f

f90 -O3 -pfa mycode.f

where the highest optimization level (-O3) is shown in the examples (check your results when using this option!). The -apo option replaces the older -pfa option, but some codes work better with the older parallelizing software.
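
One way to check your results, sketched here for sh or ksh users, is to build a reference version at a lower optimization level and compare its output with that of the autoparallelized build. The executable names, the output file names, the use of -O2 for the reference build, and the helper functions same_output and report are all assumptions for illustration; the f90 lines are left as comments so you can adapt them to your own code.

```shell
#!/bin/sh
# Sketch: compare the output of two builds of the same code.
# Hypothetical build and run steps (adapt names and flags to your code):
#   f90 -O2 -o serial.exe mycode.f          # reference build (assumed flags)
#   f90 -O3 -apo -o parallel.exe mycode.f   # autoparallelized build
#   ./serial.exe   > serial.out
#   ./parallel.exe > parallel.out

# same_output FILE1 FILE2: succeeds if the two files are byte-identical.
same_output() {
    cmp -s "$1" "$2"
}

# report FILE1 FILE2: print whether the two output files agree.
report() {
    if same_output "$1" "$2"; then
        echo "outputs agree"
    else
        echo "outputs differ: inspect with diff $1 $2"
    fi
}
```

After running both executables, call report serial.out parallel.out. A byte-for-byte comparison is strict: differences in the last digits can come from harmless reordering of floating-point operations, so inspect any reported difference before concluding that the optimized build is wrong.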

For further guidance on running parallel code see our brief parallelization hints.

To run a parallel job using OpenMP on the IRIX-based sgi servers, please use dynamic allocation of processors. For instance, if you can use 12 cpus efficiently, use

setenv OMP_DYNAMIC TRUE
setenv OMP_NUM_THREADS 12

and you will get up to 12 cpus. If the system is busy, you will get fewer. The first setting (OMP_DYNAMIC) lets the sgi distribute the cpus efficiently, so that it avoids swapping processes in and out of memory. For MPI jobs this setting is ignored.
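
The setenv lines above are csh syntax. If your shell is sh or ksh instead, an equivalent can be sketched as follows; the value 12 is simply the example count used in the text.

```shell
# sh/ksh equivalent of the csh setenv lines (assumes a Bourne-type shell).
OMP_DYNAMIC=TRUE;   export OMP_DYNAMIC       # let the system adjust the number of threads
OMP_NUM_THREADS=12; export OMP_NUM_THREADS   # upper limit on the number of threads
echo "OMP_DYNAMIC=$OMP_DYNAMIC OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

The assign-then-export form is used because it works in older Bourne shells as well as in ksh.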

How to stop a job automatically if it performs an illegal operation

The sgi servers follow the IEEE floating-point standard, which by default allows division by zero, overflow (numbers too large to represent, which become Infinity), and invalid operations (which produce Not a Number = NaN) without stopping the program. For most models and analysis tools this is unwanted. Here is how to make your Fortran program dump core when it creates numbers with too large absolute values, divides by zero, or performs an invalid operation:

In .cshrc (in your home directory on the sgi), add the line:

setenv TRAP_FPE "OVERFL=ABORT;DIVZERO=ABORT;INVALID=ABORT"

Then link with the floating-point exception library by adding -lfpe when you compile:

f90 code.f -lfpe

If compiling and linking separately:

f90 -c code.f
f90 code.o -lfpe

Add -lfpe only in the link step. With the TRAP_FPE setting above, underflow (values too small to represent, which are rounded to 0) and inexact operations are still allowed.
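
For sh or ksh users, the TRAP_FPE setting can be sketched as follows, assuming your login shell reads .profile in your home directory rather than .cshrc.

```shell
# sh/ksh equivalent of the csh setenv line (assumed to go in ~/.profile).
# Keep the quotes: unquoted semicolons would end the assignment early.
TRAP_FPE="OVERFL=ABORT;DIVZERO=ABORT;INVALID=ABORT"
export TRAP_FPE
```

The variable must be set in the environment of the shell that starts the program, so log in again or source the file before running.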

Batch queues

Batch jobs on oyakata and debu can be submitted to queues using miser. A job in a miser queue gets the number of cpus it requests, which makes miser well suited to timing your code on different numbers of cpus.

Tommy G. Jensen      
tjensen@hawaii.edu

Last update: December 16, 2003