IRIX certification, optimization

Compiling and building our code

The results shown here were obtained before June 3rd 1997, with code compiled against the development libraries.

From now on code will be built against the release libraries.

Very soon the results on this page will be moved to a different page.

For now, only a simple comparison has been made between the two libraries.
A release executable and a development executable have been run on the same data file. As could be expected (hoped?), there is NO DIFFERENCE (in terms of a Unix diff) between the oddpack outputs.
There is a slight increase in execution time (15 ms per event), but this needs more investigation.
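
The diff-style check described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: the helper name outputs_identical and the file names are hypothetical, and the demo compares two throwaway files rather than real oddpack outputs.

```python
import filecmp
import os
import tempfile


def outputs_identical(path_a, path_b):
    """Return True when the two files have byte-for-byte identical contents."""
    # shallow=False forces a real content comparison instead of
    # trusting file size and modification time alone.
    return filecmp.cmp(path_a, path_b, shallow=False)


if __name__ == "__main__":
    # Hypothetical stand-ins for the release and development outputs.
    with tempfile.TemporaryDirectory() as tmp:
        release_out = os.path.join(tmp, "release.out")
        devel_out = os.path.join(tmp, "devel.out")
        for path in (release_out, devel_out):
            with open(path, "w") as handle:
                handle.write("event 1: ok\nevent 2: ok\n")
        print(outputs_identical(release_out, devel_out))
```

A byte-for-byte match is the same criterion as an empty Unix diff, so a True result here corresponds to the "no difference" statement above.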


Where and how the code is run

For now, the code is run in batch on any IRIX node of the batch system.
Those nodes are single-processor 150 MHz SGI R4400Cr's, running IRIX 5.3.
Now, the entire file 10106.02 is analyzed (same as for the AIX certification), for a total of 38212 events.
The same namelist as for the AIX certification is used, i.e.:

Furthermore, the code is compiled using the -O option.


Code stability

For now, I'm still playing with the profiles...

Optimizations

***					       ***
***	TOTAL EVENTS  =     38212	       ***
***	  TOTAL TIME  =   24343.5     secs     ***
***   TIME PER EVENT  =   637.064     msecs    ***
***					       ***

The same executable has been run on a different data file (9954.02) and the time per event is comparable (about 580 ms against 637 ms, roughly 9% lower):


***					       ***
***	TOTAL EVENTS  =     38095	       ***
***	  TOTAL TIME  =   22111.9     secs     ***
***   TIME PER EVENT  =   580.441     msecs    ***
***					       ***
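
As a quick cross-check of the two timing summaries above, the time per event is simply the total time divided by the number of events. The short Python sketch below redoes that arithmetic with the quoted totals; the function and variable names are mine, not part of the analysis code.

```python
def ms_per_event(total_secs, n_events):
    """Average processing time per event, in milliseconds."""
    return 1000.0 * total_secs / n_events


# Totals copied from the two summaries (files 10106.02 and 9954.02).
run_10106 = ms_per_event(24343.5, 38212)
run_9954 = ms_per_event(22111.9, 38095)

# Relative difference of the second run with respect to the first.
rel_diff = (run_10106 - run_9954) / run_10106

print(f"10106.02: {run_10106:.3f} ms/event")
print(f"9954.02:  {run_9954:.3f} ms/event")
print(f"relative difference: {100.0 * rel_diff:.1f}%")
```

The recomputed values agree with the printed TIME PER EVENT figures, and the two runs differ by a bit under 10% per event.
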
When comparing the profiler output, although the order is different, one finds that approximately the same 10 routines are using most of the CPU:

	2.1e+03s  (  9.6)    2.1e+03s	     lsqall_
	2.1e+03s  (  9.3)    4.2e+03s	    mat2prj_
	1.7e+03s  (  7.6)    5.8e+03s	     __sqrtf
	1.5e+03s  (  6.9)    7.4e+03s	     spacer_
	1.4e+03s  (  6.2)    8.7e+03s	       lsq1_
	1.2e+03s  (  5.4)    9.9e+03s	     duesta_
	1.1e+03s  (  4.9)    1.1e+04s	      nuova_
	9.8e+02s  (  4.4)    1.2e+04s	      fitsp_
	9.7e+02s  (  4.4)    1.3e+04s	    mat3shr_
	5.1e+02s  (  2.3)    1.3e+04s	     ycorr2_
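
To put a number on "most of the CPU", the listing above can be parsed and the percentages added up. The Python sketch below does this under one assumption that is mine, not stated on this page: the columns are read as time in the routine, percentage of total time in parentheses, cumulative time, and routine name (the third column does behave like a running sum of the first).

```python
import re

# The ten profiler lines quoted above, whitespace normalized.
PROFILE = """\
2.1e+03s  (  9.6)    2.1e+03s  lsqall_
2.1e+03s  (  9.3)    4.2e+03s  mat2prj_
1.7e+03s  (  7.6)    5.8e+03s  __sqrtf
1.5e+03s  (  6.9)    7.4e+03s  spacer_
1.4e+03s  (  6.2)    8.7e+03s  lsq1_
1.2e+03s  (  5.4)    9.9e+03s  duesta_
1.1e+03s  (  4.9)    1.1e+04s  nuova_
9.8e+02s  (  4.4)    1.2e+04s  fitsp_
9.7e+02s  (  4.4)    1.3e+04s  mat3shr_
5.1e+02s  (  2.3)    1.3e+04s  ycorr2_
"""

# Assumed layout: <secs>s ( <percent>) <cumulative secs>s <routine>
LINE = re.compile(r"^\s*(\S+)s\s+\(\s*([\d.]+)\)\s+(\S+)s\s+(\S+)\s*$")


def parse_profile(text):
    """Return a list of (routine, seconds, percent) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            secs, pct, _cumulative, name = match.groups()
            rows.append((name, float(secs), float(pct)))
    return rows


rows = parse_profile(PROFILE)
top_share = sum(pct for _name, _secs, pct in rows)
print(f"{len(rows)} routines account for {top_share:.1f}% of the CPU time")
```

With the figures quoted here, the ten routines add up to about 61% of the total CPU time.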

Just for fun you can compare the profile from a different namelist

Here you can find the talk I gave about MINI-PASS1

O2 versus O1 studies for MICRORICO

Send comments to: Matteo Boschini.
Last update: 04-June-1997