IRIX certification, optimization
From now on, code will be built against the release libraries.
Very soon the results on this page will be moved to a different page.
For now, only a simple comparison has been made between the two libraries.
A release executable has been run and compared with a development
executable run on the same data file. As could be expected (hoped?) there is
NO DIFFERENCE (in terms of a Unix diff) between the
oddpack outputs.
There is a slight increase in the execution time (15 ms per event), but this
needs more investigation.
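The output check described above (no difference in terms of a Unix diff) can be sketched as a byte-for-byte file comparison. This is only an illustration: the function name and the file paths are hypothetical, and the original comparison was presumably done with the `diff` command itself.

```python
import filecmp


def outputs_identical(release_path, development_path):
    """Return True when the two output files are byte-for-byte identical.

    Both arguments are hypothetical paths to the outputs written by the
    release and the development executables.
    """
    # shallow=False forces a comparison of the file contents instead of
    # trusting the os.stat() signature (size, modification time) alone.
    return filecmp.cmp(release_path, development_path, shallow=False)
```

A result of `True` corresponds to `diff` printing nothing for the two files.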
rawall = -1 recall = -2
Furthermore, the code is compiled using the -O option.
time (%) cum time procedure (file)
2.4e+03s ( 9.6) 2.4e+03s spacer_
2.1e+03s ( 8.6) 4.5e+03s lsqall_
2.1e+03s ( 8.4) 6.6e+03s mat2prj_
1.9e+03s ( 7.8) 8.6e+03s __sqrtf
1.4e+03s ( 5.7) 1e+04s lsq1_
1.3e+03s ( 5.2) 1.1e+04s fitsp_
1.2e+03s ( 4.9) 1.2e+04s duesta_
9.3e+02s ( 3.7) 1.3e+04s mat3shr_
7.2e+02s ( 2.9) 1.4e+04s nuova_
5.8e+02s ( 2.3) 1.5e+04s ycorr2_
5.2e+02s ( 2.1) 1.5e+04s tsvmat_
5.2e+02s ( 2.1) 1.6e+04s p0p1_
5.1e+02s ( 2.0) 1.6e+04s ykick_
3.3e+02s ( 1.3) 1.7e+04s eion_
2.8e+02s ( 1.1) 1.7e+04s saga_
2.8e+02s ( 1.1) 1.7e+04s vzero_
2.7e+02s ( 1.1) 1.7e+04s __pow
2.3e+02s ( 0.9) 1.8e+04s ftrace_
2.3e+02s ( 0.9) 1.8e+04s fitvw_
2.2e+02s ( 0.9) 1.8e+04s zeroe_
2e+02s ( 0.8) 1.8e+04s xcorr2_
1.8e+02s ( 0.7) 1.8e+04s __sqrt
1.6e+02s ( 0.7) 1.9e+04s ie_pwc_
1.6e+02s ( 0.6) 1.9e+04s ycorr_
1.4e+02s ( 0.6) 1.9e+04s upkpwc_c_
1.4e+02s ( 0.6) 1.9e+04s newt_
1.3e+02s ( 0.5) 1.9e+04s omu_intersect_
1.2e+02s ( 0.5) 1.9e+04s hist_
1.2e+02s ( 0.5) 1.9e+04s put2prj_
1.2e+02s ( 0.5) 2e+04s pkick_
1.1e+02s ( 0.4) 2e+04s matinv_
1.1e+02s ( 0.4) 2e+04s linchk_
1.1e+02s ( 0.4) 2e+04s xyprim_
1e+02s ( 0.4) 2e+04s dvdcov_
1e+02s ( 0.4) 2e+04s mat3prj_
1e+02s ( 0.4) 2e+04s emunpk_
97s ( 0.4) 2e+04s __log
92s ( 0.4) 2e+04s __logf
91s ( 0.4) 2e+04s lsqmag_
87s ( 0.3) 2.1e+04s gexy_
83s ( 0.3) 2.1e+04s scruta_
81s ( 0.3) 2.1e+04s plot_
78s ( 0.3) 2.1e+04s xcorr_
72s ( 0.3) 2.1e+04s flvp_
71s ( 0.3) 2.1e+04s xkick_
71s ( 0.3) 2.1e+04s cospdb_
70s ( 0.3) 2.1e+04s fit3s_
69s ( 0.3) 2.1e+04s fit5_
68s ( 0.3) 2.1e+04s cluste_
67s ( 0.3) 2.1e+04s pbgcls_
*** *** *** TOTAL EVENTS = 38212 *** *** TOTAL TIME = 24343.5 secs *** *** TIME PER EVENT = 637.064 msecs *** *** ***
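The time per event in the summary line is simply the total time divided by the number of events. A minimal sketch of that arithmetic, using the figures printed above (the function name is an assumption made for illustration):

```python
def time_per_event_ms(total_seconds, total_events):
    """Convert a total run time in seconds to milliseconds per event."""
    return 1000.0 * total_seconds / total_events


# Figures taken from the summary line of the run above.
per_event = time_per_event_ms(24343.5, 38212)
print(f"{per_event:.3f} msecs per event")
```

The value agrees with the 637.064 msecs reported by the program.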
The same executable has been run on a different data file (9954.02) and execution time is approximately the same:
*** *** *** TOTAL EVENTS = 38095 *** *** TOTAL TIME = 22111.9 secs *** *** TIME PER EVENT = 580.441 msecs *** *** ***
When comparing the profiler output, although the order is different, one finds that the same 10 routines are using most of the CPU:
time (%) cum time procedure (file)
2.1e+03s ( 9.6) 2.1e+03s lsqall_
2.1e+03s ( 9.3) 4.2e+03s mat2prj_
1.7e+03s ( 7.6) 5.8e+03s __sqrtf
1.5e+03s ( 6.9) 7.4e+03s spacer_
1.4e+03s ( 6.2) 8.7e+03s lsq1_
1.2e+03s ( 5.4) 9.9e+03s duesta_
1.1e+03s ( 4.9) 1.1e+04s nuova_
9.8e+02s ( 4.4) 1.2e+04s fitsp_
9.7e+02s ( 4.4) 1.3e+04s mat3shr_
5.1e+02s ( 2.3) 1.3e+04s ycorr2_
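The comparison of the two profiles can also be done mechanically. The sketch below parses rows in the format shown on this page and compares the sets of leading routines; the regular expression and the helper name are assumptions made for illustration, not part of the profiler.

```python
import re

# One profiler row: "<time>s ( <percent>) <cumulative time>s <procedure>"
ROW = re.compile(r"^\s*(\S+)s\s+\(\s*([\d.]+)\)\s+(\S+)s\s+(\S+)\s*$")


def top_procedures(profile_text, count=10):
    """Return the first `count` procedure names, in listing order."""
    names = []
    for line in profile_text.splitlines():
        match = ROW.match(line)
        if match:  # header lines and blank lines do not match
            names.append(match.group(4))
    return names[:count]


# The ten leading rows of each run, copied from the listings on this page.
first_run = """\
2.4e+03s ( 9.6) 2.4e+03s spacer_
2.1e+03s ( 8.6) 4.5e+03s lsqall_
2.1e+03s ( 8.4) 6.6e+03s mat2prj_
1.9e+03s ( 7.8) 8.6e+03s __sqrtf
1.4e+03s ( 5.7) 1e+04s lsq1_
1.3e+03s ( 5.2) 1.1e+04s fitsp_
1.2e+03s ( 4.9) 1.2e+04s duesta_
9.3e+02s ( 3.7) 1.3e+04s mat3shr_
7.2e+02s ( 2.9) 1.4e+04s nuova_
5.8e+02s ( 2.3) 1.5e+04s ycorr2_
"""

second_run = """\
2.1e+03s ( 9.6) 2.1e+03s lsqall_
2.1e+03s ( 9.3) 4.2e+03s mat2prj_
1.7e+03s ( 7.6) 5.8e+03s __sqrtf
1.5e+03s ( 6.9) 7.4e+03s spacer_
1.4e+03s ( 6.2) 8.7e+03s lsq1_
1.2e+03s ( 5.4) 9.9e+03s duesta_
1.1e+03s ( 4.9) 1.1e+04s nuova_
9.8e+02s ( 4.4) 1.2e+04s fitsp_
9.7e+02s ( 4.4) 1.3e+04s mat3shr_
5.1e+02s ( 2.3) 1.3e+04s ycorr2_
"""

same_routines = set(top_procedures(first_run)) == set(top_procedures(second_run))
print("same top-10 routines:", same_routines)
```

Comparing sets rather than lists ignores the ordering, which is what differs between the two runs.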
Just for fun you can compare the profile from a different namelist.
Here you can find the talk I gave about MINI-PASS1.
O2 versus O1 studies for MICRORICO
Send comments to: Matteo Boschini.
Last update: 04-June-1997