IPRC SV1ex


CRAY SV1ex

The IPRC SV1ex is a vector machine with 32 cpus (500 MHz) and 32 GB of memory. Each cpu is capable of 2 Gflops. Four Single Stream Processor (SSP) cpus can be combined into a single Multiple Stream Processor (MSP) capable of 8 Gflops. The IPRC Cray was upgraded in January 2005 from an SV1 to an SV1ex, which doubled the memory speed and increased the clock frequency from 300 MHz to 500 MHz.
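The peak figures quoted above follow from simple arithmetic. The sketch below reproduces them in Python. It assumes 4 floating-point results per clock cycle per cpu, a value inferred here from "2 Gflops at 500 MHz" rather than taken from Cray documentation, and all variable names are illustrative only.

```python
# Peak-performance arithmetic for the SV1ex figures quoted in the text.
# ASSUMPTION: 4 flops per clock per SSP cpu, inferred from 2 Gflops at 500 MHz.
CLOCK_MHZ = 500          # from the text
FLOPS_PER_CLOCK = 4      # assumed, see note above
SSP_PER_MSP = 4          # from the text
NUM_CPUS = 32            # from the text

ssp_gflops = CLOCK_MHZ * FLOPS_PER_CLOCK / 1000   # one SSP cpu
msp_gflops = ssp_gflops * SSP_PER_MSP             # one MSP (four SSPs)
machine_gflops = ssp_gflops * NUM_CPUS            # all 32 cpus together
clock_ratio = 500 / 300                           # SV1 -> SV1ex clock increase

print(f"SSP peak:     {ssp_gflops:.0f} Gflops")
print(f"MSP peak:     {msp_gflops:.0f} Gflops")
print(f"Machine peak: {machine_gflops:.0f} Gflops")
print(f"Clock ratio:  {clock_ratio:.2f}x")
```

These are theoretical peaks; the rate a real code sustains is what hpm reports.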
The operating system is Unicos, Cray's version of Unix.

Compiling Fortran code

The Cray compiler is one of the best compilers available. It is great for debugging code (try the compiler option -R ab, which turns on run-time checking of arguments and array bounds). The highest level of optimization is

f90 -O3 -agress -inline4 -scalar3 -vector3 [-stream3] -task3

but should only be used after extensive testing. Stream optimization is only for running on MSPs (see below).

Performance and resource requirements

The Hardware Performance Monitor (hpm) can be used to check code performance, e.g. Mflops, vectorization level and memory usage. Another useful utility is Job Accounting (ja), which gives more detailed information on resource usage. The atexpert utility can estimate the expected parallel performance of an autotasked code without running in dedicated mode; to use it, add the -eX option to the f90 command in addition to -O3. See the man pages for hpm, ja and atexpert for details. Other performance tools are flowtrace and perftrace.

Running a job

The CRAY is primarily a batch job machine. Below is a sample script that shows how to run a job in the queue called big_q.

The CRAY binary file format is not IEEE standard. If you want to use binary (unformatted) input and output files from your workstation on the CRAY, you can use assign in the script that runs your code. One advantage is that you can move the files between the CRAY and your workstation and use them without conversion. Also, if you use single precision (IEEE-32), the files take only half the disk space of CRAY binary files. The sample script shows how to include IEEE format files.

In the script I select a queue, combine the error and output files into a single file (-eo), and set a time limit of 33 h 20 min (-lT), a memory limit of 64 Mwords = 512 MB (-lM), and a nice value of 20 (-ln). The file with the script is called cray_run:

#!/bin/csh
#QSUB -q big_q
#QSUB -eo
#QSUB -lT 33:20:00.0
#QSUB -lM 64MW
#QSUB -ln 20
assign -F f77 -N ieee_32 p:%.INI
assign -F f77 -N ieee_64 f:WIND
hpm ./a.out
exit

To run the script, type qsub cray_run. Make sure the script is executable; the command chmod +x cray_run will ensure that it is.
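The limits in the sample script can be checked with a few lines of Python. This is only a sketch: the helper names are hypothetical, and it assumes that one Mword is 2**20 words of 8 bytes (64 bits), which is consistent with the 64 Mwords = 512 MB figure given for the script.

```python
# Sanity-check the resource limits used in the sample script.
# Helper names are illustrative; they are not part of any Cray tool.

def time_limit_seconds(limit: str) -> float:
    """Convert an hh:mm:ss[.fraction] limit such as '33:20:00.0' to seconds."""
    hours, minutes, seconds = limit.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

def megawords_to_bytes(mwords: int, word_bytes: int = 8) -> int:
    """Convert Mwords to bytes. ASSUMPTION: 1 Mword = 2**20 words of 8 bytes."""
    return mwords * 2**20 * word_bytes

cpu_seconds = time_limit_seconds("33:20:00.0")   # the -lT value
memory_bytes = megawords_to_bytes(64)            # the -lM value

print(f"Time limit:   {cpu_seconds:.0f} s")
print(f"Memory limit: {memory_bytes // 2**20} MB")
```

Working the numbers out before submitting helps when picking a queue, since a job that exceeds its time or memory limit is stopped.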

The first assign command informs the system that all files with names ending in .INI are in IEEE 32 bit format. The second assign command states that the file named WIND is in IEEE 64 bit format (double precision on most workstations). Even if the input and output data are only 32 bits, the CRAY performs all calculations in 64 bits, the same as double precision (real*8) on other computers. Only in very rare cases does a code need to run in Cray double precision (128 bits), so if variables are declared real*8 or double precision, change them to plain real and get a significant speed-up.
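To illustrate the disk-space point, the sketch below packs the same 1000 values as one unformatted record in IEEE 32 bit and in IEEE 64 bit form. It assumes the common workstation f77 record layout (a 4-byte length marker before and after the data) and big-endian byte order; both are assumptions about your workstation compiler, not something stated on this page, and the function name is hypothetical.

```python
import struct

def f77_record(values, real_fmt="f", byte_order=">"):
    """Pack values as one f77-style unformatted record.

    ASSUMPTIONS: 4-byte record-length markers before and after the data,
    and big-endian byte order. Check what your workstation compiler writes.
    real_fmt is 'f' for IEEE 32 bit or 'd' for IEEE 64 bit.
    """
    payload = struct.pack(f"{byte_order}{len(values)}{real_fmt}", *values)
    marker = struct.pack(f"{byte_order}i", len(payload))
    return marker + payload + marker

values = [float(i) for i in range(1000)]
rec32 = f77_record(values, "f")   # what ieee_32 data looks like on disk
rec64 = f77_record(values, "d")   # what ieee_64 data looks like on disk

print(f"IEEE-32 record: {len(rec32)} bytes")
print(f"IEEE-64 record: {len(rec64)} bytes")
```

Apart from the two record markers, the 32 bit record is half the size of the 64 bit one, which is the saving mentioned for IEEE-32 files.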

The utility hpm runs the executable (a.out) and gathers performance statistics. See the man page for qsub for the format for submitting jobs and the man page for assign for further details.

Using Multiple-Stream Processors

Instead of combining single vector cpus (SSPs) in parallel, a faster MSP cpu can be selected. Using an MSP cpu may or may not be faster than autotasking on four SSPs; it depends on your code. The command

cpu -a 2 myprog

will run your executable on 2 MSP cpus, created from 8 SSP cpus. The code must be compiled with the -stream and -task options (levels 1-3) before running. See the man pages for cpu and f90.

Reserving disk space

If you are not sure that there is enough space to run a job, use

assign -n

(see the man page) to reserve space for your files at the beginning of your job. That way it won't fail due to lack of file space. Example: assign -n 500 outputfile will reserve 500 blocks (4096 bytes each) for outputfile when you open it in your program. If the file exists, 500 blocks are added to its size. You can also use setf to create the file prior to running. However, if the open statement in your Fortran program declares the file "new", setf will cause your program to fail.
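To decide how many blocks to reserve, the output size can be estimated in advance. The sketch below uses the 4096-byte block size quoted above; the function name and the 360 x 180 x 20 grid of 64 bit values are purely hypothetical examples.

```python
# Estimate how many 4096-byte blocks to pass to assign -n.
BLOCK_BYTES = 4096   # block size quoted in the text

def blocks_needed(nbytes: int) -> int:
    """Number of whole blocks needed to hold nbytes (rounded up)."""
    return -(-nbytes // BLOCK_BYTES)

# The example from the text: 500 blocks expressed in bytes.
reserved_bytes = 500 * BLOCK_BYTES

# HYPOTHETICAL output field: 360 x 180 x 20 values of 8 bytes each.
field_bytes = 360 * 180 * 20 * 8
field_blocks = blocks_needed(field_bytes)

print(f"500 blocks reserve {reserved_bytes} bytes")
print(f"One field of {field_bytes} bytes needs {field_blocks} blocks")
```

Multiply the per-field figure by the number of fields and output times the job will write to get the total to reserve.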

In all cases, make sure to write out restart files so less cpu time is wasted in case of program or hardware failure.

And remember to delete old files, of course.
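The restart-file advice above can be sketched as follows. This is a generic illustration of the pattern in Python, not code from any IPRC model: the file name, the JSON format and the stand-in "time step" are all hypothetical, and a Fortran model would write its own state arrays instead.

```python
import json
import os
import tempfile

def save_restart(path, state):
    """Write to a temporary file, then rename it, so that a crash
    never leaves a half-written restart file behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)

def load_restart(path):
    """Return the saved state, or a fresh initial state if none exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "value": 0.0}

def run(path, total_steps, every=100):
    """Advance to total_steps, resuming from the restart file if present."""
    state = load_restart(path)
    for step in range(state["step"], total_steps):
        state["value"] += 1.0          # stand-in for one model time step
        state["step"] = step + 1
        if state["step"] % every == 0:
            save_restart(path, state)  # periodic restart file
    save_restart(path, state)          # final state
    return state

if __name__ == "__main__":
    restart = os.path.join(tempfile.mkdtemp(), "restart.json")
    print(run(restart, 250))   # first job
    print(run(restart, 400))   # second job resumes at step 250
```

After a failure, at most the steps since the last restart file are lost instead of the whole run. The write interval is a trade-off between lost cpu time and the extra output time and disk space.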


For help using the Cray:

Cray Online Software Publications (at Texas Advanced Computing Center, University of Texas at Austin)


For more about the SV1 and supercomputers: top500.org about SV1



Tommy G. Jensen      
tjensen@hawaii.edu

Last update: January 15, 2005