A Somewhat Painless Introduction to IPROBE
John L. Henning
CSD Performance Group
19 Jun 1997
IPROBE
+ Access low-level chip counters
+ Works with ev4, ev45, ev5, ev56, pca56
+ Works with NT, Unix, VMS
- Hard to get started
- Incomplete documentation
- Lots of pieces to put together
- A few missing floorboards (but work is in progress to fix them - if
you find any, send email to goddard@zko.dec.com)
Purpose of this talk
What you are about to see is true…
A live terminal session
with only minor edits and explanations
Problem: swim (SPEC benchmark 102.swim) runs slower when KAP tunes with
-tune=ev5
instead of:
-tune=ev4
Parts 1 and 2 are shown on the pages that follow.
Compile it both ways and verify the difference
Script started on Tue Jun 17 06:06:42 1997
% unlimit
% setenv PARALLEL 4
% pwd
/users01/john/swim
% ls *
typescript
e4:
swim.f swim.in swim2.in
e5:
swim.f swim.in swim2.in
% cd e5
% kf77 -v -V -machine_code -fkapargs='-mc=10000' swim.f
/usr/bin/kapf -cmp=./swim.cmp.f swim.f -mc=10000 -conc -tune=EV5 -natural
KAP/Digital_UA_F 3.1a k280615 970519 17-Jun-1997 06:07:39
0 errors in file swim.f
/usr/bin/f77 -fast -automatic -v -V -machine_code ./swim.cmp.f -tune host
-call_shared -lkmp_osfp10 -pthread
. . .
% time ./a.out <swim.in >swim.out
228.55u 0.26s 0:57 399% 0+153k 0+8io 0pf+0w
% cd ../e4
% kf77 -v -V -machine_code -fkapargs='-mc=10000 -tune=ev4' swim.f
/usr/bin/kapf -cmp=./swim.cmp.f swim.f -mc=10000 -tune=ev4 -conc -natural
KAP/Digital_UA_F 3.1a k280615 970519 17-Jun-1997 06:09:43
0 errors in file swim.f
/usr/bin/f77 -fast -automatic -v -V -machine_code ./swim.cmp.f -tune host
-call_shared -lkmp_osfp10 -pthread
. . .
% time ./a.out < swim.in > swim.out
196.58u 0.21s 0:49 399% 0+153k 0+2io 0pf+0w
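The two csh time lines above can be compared mechanically. The following is a minimal sketch, not part of the original session: parse_csh_time is a hypothetical helper, and it assumes the elapsed field is in m:ss form as it is in both runs shown here.

```python
import re

def parse_csh_time(line):
    """Parse a csh `time` line such as '228.55u 0.26s 0:57 399% ...'.

    Assumption: elapsed time is given as minutes:seconds (no hours field).
    """
    m = re.match(r"([\d.]+)u ([\d.]+)s (\d+):(\d+) (\d+)%", line)
    if m is None:
        raise ValueError(f"unrecognized time line: {line!r}")
    return {
        "user": float(m.group(1)),       # user CPU seconds, summed over all CPUs
        "system": float(m.group(2)),     # system CPU seconds
        "elapsed": int(m.group(3)) * 60 + int(m.group(4)),  # wall-clock seconds
        "cpu_pct": int(m.group(5)),      # CPU utilization (about 400% on 4 CPUs)
    }

# The two lines are copied from the session above.
ev5 = parse_csh_time("228.55u 0.26s 0:57 399% 0+153k 0+8io 0pf+0w")
ev4 = parse_csh_time("196.58u 0.21s 0:49 399% 0+153k 0+2io 0pf+0w")

slowdown = ev5["user"] / ev4["user"] - 1
print(f"ev5 user time is {slowdown:.1%} higher than ev4")
print(f"elapsed: ev5 {ev5['elapsed']}s, ev4 {ev4['elapsed']}s")
```

Both runs show about 399% CPU, so all four processors are busy in either case; the difference is in how much work each build does.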
% su
Password:
# mkdir iprobe
# cd iprobe
# ftp
ftp> op 16.31.144.83
Connected to 16.31.144.83.
220 perf.zko.dec.com FTP server (Digital UNIX Version 5.60) ready.
Name (16.31.144.83:john): anonymous
331 Guest login ok, send ident as password.
Password:
230 Guest login ok, access restrictions apply.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd pub
250 CWD command successful.
ftp> cd IprobeKits
250 CWD command successful.
ftp> pwd
257 "/pub/IprobeKits" is current directory.
ftp> ls
200 PORT command successful.
150 Opening ASCII mode data connection for file list (16.31.32.191,1039).
Iprobe.ps
WhatsHere.txt
Iprobe020.a
IprobeVms021.a
IprobeVms022.a
Nt35IprobeT21.zip
Nt35IprobeT21Update.zip
Nt40IprobeT22Ev4.zip
Nt40IprobeT23Ev5.zip
UnzipAxp.exe
IprNew.mod
Iprobe020ProgrammingKit.bck
Api.doc
Iprobe021Osf.tar.Z
Iprobe0221Unix40.tar.Z
TurboLaserBusMonitorUnix.tar.Z
226 Transfer complete.
ftp> bin
200 Type set to I.
ftp> get Iprobe0221Unix40.tar.Z
200 PORT command successful.
150 Opening BINARY mode data connection for Iprobe0221Unix40.tar.Z (16.31.32.191,1040) (692920 bytes).
226 Transfer complete.
692920 bytes received in 4.2 seconds (1.6e+02 Kbytes/s)
ftp> bye
221 Goodbye.
Install IPROBE
# ls
Iprobe0221Unix40.tar.Z
# zcat * | tar xvf -
blocksize = 16
x ./IPRTEST
x ./IPRTEST/instctrl
x ./IPRTEST/instctrl/IPRBASE221.inv, 3284 bytes, 7 tape blocks
x ./IPRTEST/instctrl/IPRBASE221.ctrl, 168 bytes, 1 tape blocks
x ./IPRTEST/instctrl/IPRBASE221.scp, 4187 bytes, 9 tape blocks
x ./IPRTEST/instctrl/IPRBASE221.image, 24 bytes, 1 tape blocks
x ./IPRTEST/IPRBASE221, 1505280 bytes, 2940 tape blocks
x ./IPRTEST/INSTCTRL, 20480 bytes, 40 tape blocks
x ./IPRTEST/.image, 24 bytes, 1 tape blocks
x ./IPRTEST/IPRBASE221.image, 24 bytes, 1 tape blocks
# cd IPRTEST
# ls
.image IPRBASE221 instctrl
INSTCTRL IPRBASE221.image
# setld -l .
The subsets listed below are optional:
There may be more optional subsets than can be presented on a single
screen. If this is the case, you can choose subsets screen by screen
or all at once on the last screen. All of the choices you make will
be collected for your confirmation before any subsets are installed.
1) IPROBE kit for Digital Unix 4.0
Or you may choose one of the following options:
2) ALL of the above
3) CANCEL selections and redisplay menus
4) EXIT without installing any subsets
Enter your choices or press RETURN to redisplay menus.
Choices (for example, 1 2 4-6): 1
You are installing the following optional subsets:
IPROBE kit for Digital Unix 4.0
Is this correct? (y/n): y
Checking file system space required to install selected subsets:
File system space checked OK.
1 subset(s) will be installed.
Loading 1 of 1 subset(s)....
Checking for previous IPROBE kits...
No previous IPROBE installation found
IPROBE kit for Digital Unix 4.0
Copying from . (disk)
Verifying
1 of 1 subset(s) installed successfully.
Configuring "IPROBE kit for Digital Unix 4.0" (IPRBASE221)
Copying Digital Unix V4.0 specific files to target directories...
Done
Configuring and loading the IPROBE driver...
IPROBE for Digital Unix 4.0 succesfully installed
# exit
The actual installation of IPROBE is the only part that requires privileges. At this point we return to non-privileged mode.
Note that a reboot of the system is NOT required - IPROBE has a loadable driver.
Test that IPROBE was installed correctly
% rehash
% iprobe
Node name : gemosf.zko.dec.com
OS : OSF1 T4.0-738.5
CPU count : 4
Model : Unknown
Memory size : 381 MB
Counter count : 3
cycles : Low frequency
Current time : Tue Jun 17 06:15:44 1997
Start time: : immediate
Duration : 0 (until user interrupts)
Interval : 1
Method : count
Measured Modes : all modes
Measured Data : pid ctr ps pc
Buffer_count : 3
Buffer_size : 8192
time cpu freq event # events evts/sec
06:15:44 0 2^16 cycles 399310848 399310848
06:15:44 1 2^16 cycles 399310848 399310848
06:15:44 2 2^16 cycles 399310848 399310848
06:15:44 3 2^16 cycles 399310848 399310848
06:15:45 0 2^16 cycles 399572992 399572992
06:15:45 1 2^16 cycles 399572992 399572992
06:15:45 2 2^16 cycles 399572992 399572992
06:15:45 3 2^16 cycles 399572992 399572992
06:15:46 0 2^16 cycles 399638528 399638528
06:15:46 1 2^16 cycles 399638528 399638528
06:15:46 2 2^16 cycles 399638528 399638528
06:15:46 3 2^16 cycles 399572992 399572992
06:15:47 0 2^16 cycles 79495168 79495168
06:15:47 1 2^16 cycles 79429632 79429632
06:15:47 2 2^16 cycles 79429632 79429632
06:15:47 3 2^16 cycles 79495168 79495168
Total event count:
06:15:47 0 2^16 cycles 1278017536
06:15:47 1 2^16 cycles 1277952000
06:15:47 2 2^16 cycles 1277952000
06:15:47 3 2^16 cycles 1277952000
(^C to terminate)
The display shows us that IPROBE sees all 4 CPUs, running at 400 MHz.
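The 400 MHz figure can be read straight off the evts/sec column. A small sketch follows, assuming the freq column (2^16) means the counter is reported once per 2^16 events; that reading is an interpretation of the display, not something stated in the session. The describe function is made up for illustration.

```python
SAMPLE_PERIOD = 2 ** 16   # assumed meaning of the "freq" column: one report per 2^16 cycles

def describe(events_per_sec):
    """Split a per-second cycle count into whole sample periods, and convert it to MHz."""
    periods, remainder = divmod(events_per_sec, SAMPLE_PERIOD)
    mhz = events_per_sec / 1e6
    return periods, remainder, mhz

# Per-second counts copied from the display above.
for count in (399310848, 399572992, 399638528):
    periods, remainder, mhz = describe(count)
    print(f"{count}: {periods} periods of 2^16, remainder {remainder}, about {mhz:.1f} MHz")
```

Every full-second count in the display is an exact multiple of 2^16, which is consistent with that reading. The last interval (about 79 million) is smaller only because the run was interrupted partway through the second.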
Create a script to invoke IPROBE and the benchmark
% pwd
/users01/john/swim/e4
% cd ..
% cat >run_cycl_and_bmiss.csh
set verbose
unlimit
set notify
setenv PARALLEL 4
iprobe -quiet -method sample cycles bcache_miss &
time ./a.out <swim.in >swim.out
kill %iprobe
unset verbose
IPROBE is run in the background and killed by the script.
Why csh? Handy builtin commands such as time and kill.
Usage:
csh
% source run_cycl_and_bmiss.csh
because otherwise the kill would fail:
kill %iprobe
%iprobe: No such process
-quiet: don't litter my screen
-method sample: tuck events away for later analysis
cycles bcache_miss: the events we want
-> Pick your two favorite events and always start there
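The same pattern (start the sampler in the background, run the benchmark, then stop the sampler) can be sketched outside csh. This is an illustration only, not a replacement for the script above: run_with_sampler is a hypothetical helper, and only the iprobe arguments already shown in the script are used.

```python
import subprocess
from contextlib import ExitStack

def run_with_sampler(sampler_cmd, bench_cmd, stdin_path=None, stdout_path=None):
    """Run bench_cmd to completion while sampler_cmd runs in the background.

    The sampler is sent SIGTERM afterwards, which is what `kill %iprobe`
    does in the csh script. Returns the benchmark's exit status.
    """
    sampler = subprocess.Popen(sampler_cmd)
    try:
        with ExitStack() as stack:
            stdin = stack.enter_context(open(stdin_path, "rb")) if stdin_path else None
            stdout = stack.enter_context(open(stdout_path, "wb")) if stdout_path else None
            return subprocess.run(bench_cmd, stdin=stdin, stdout=stdout).returncode
    finally:
        sampler.terminate()   # stop the sampler even if the benchmark fails
        sampler.wait()

# Arguments taken from the csh script above; nothing is executed on import.
IPROBE_CMD = ["iprobe", "-quiet", "-method", "sample", "cycles", "bcache_miss"]
BENCH_CMD = ["./a.out"]
# Example call (requires iprobe and the benchmark to be present):
# run_with_sampler(IPROBE_CMD, BENCH_CMD, "swim.in", "swim.out")
```

Holding the sampler's process handle directly sidesteps the job-name lookup that makes kill %iprobe fail when the csh script is not sourced.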
Collect the events
% cd e4
% source ../run_cycl_and_bmiss.csh
unlimit
set notify
setenv PARALLEL 4
iprobe -quiet -method sample cycles bcache_miss &
[1] 3529
time ./a.out < swim.in > swim.out
Start of sampling
196.35u 0.53s 0:49 396% 0+153k 0+5io 0pf+0w
kill %iprobe
unset verbose
%
[1] Terminated iprobe -quiet -method sample cycles bcache_miss
% cd ../e5
% !sour
source ../run_cycl_and_bmiss.csh
unlimit
set notify
setenv PARALLEL 4
iprobe -quiet -method sample cycles bcache_miss &
[1] 3535
time ./a.out < swim.in > swim.out
Start of sampling
238.44u 0.59s 1:00 397% 0+153k 0+5io 0pf+0w
kill %iprobe
unset verbose
%
[1] Terminated iprobe -quiet -method sample cycles bcache_miss
% cd ..
% ls *
run_cycl_and_bmiss.csh typescript
e4:
a.out swim.cmp.f swim.f swim.out
pcsample.dat swim.cmp.l swim.in swim2.in
e5:
a.out swim.cmp.f swim.f swim.out
pcsample.dat swim.cmp.l swim.in swim2.in
Grab John’s data reduction harness
% ftp
ftp> op 16.31.144.83
Connected to 16.31.144.83.
220 perf.zko.dec.com FTP server (Digital UNIX Version 5.60) ready.
Name (16.31.144.83:john): anonymous
331 Guest login ok, send ident as password.
Password:
ftp> cd pub/henning
ftp> get harness.pl
200 PORT command successful.
150 Opening BINARY mode data connection for harness.pl (16.31.32.191,1045) (22835 bytes).
226 Transfer complete.
22835 bytes received in 0.032 seconds (7e+02 Kbytes/s)
ftp> bye
221 Goodbye.
% chmod +x harness.pl
% head harness.pl
#!/usr/local/bin/perl
# "harness.pl" - this edition noon 24 Mar 97
#
# Assist with data reduction for IPROBE pc samples.
#
# One of the common complaints about IPROBE is that the supporting tools
# for data reduction are hard to use. This script tries to make things
# easier in two ways:
#
# 1) By providing a harness for data reduction which may meet users'
%
% ls /usr/local/bin/perl
/usr/local/bin/perl
If you don’t already have perl (gasp!) see
http://www.perl.com/perl/info/software.html
If you do have it, be sure to change the first line of the script (the #! line) if your copy is not in /usr/local/bin
Report bugs in harness.pl to henning@zko.dec.com or PERFOM::INTERNAL_PERF_TOOLS note 89
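Whether the #! line needs editing can be checked before the first run. A hedged sketch follows; both function names are hypothetical, and the check only looks at the interpreter path named on the script's first line.

```python
import os

def interpreter_from_shebang(first_line):
    """Return the interpreter path named on a '#!' line, or None if there is none."""
    first_line = first_line.strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    return parts[0] if parts else None

def shebang_is_usable(script_path):
    """True if the script's first line names an interpreter that exists and is executable."""
    with open(script_path, "r", errors="replace") as f:
        interp = interpreter_from_shebang(f.readline())
    return interp is not None and os.access(interp, os.X_OK)

# The first line of harness.pl, as shown by `head` above.
print(interpreter_from_shebang("#!/usr/local/bin/perl"))
```

If shebang_is_usable("harness.pl") returns False, edit the first line to point at the local perl.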
Attempt to invoke the harness, discover the non_shared floorboard
% cd e4
% ls
a.out swim.cmp.f swim.f swim.out
pcsample.dat swim.cmp.l swim.in swim2.in
% ../harness.pl -x a.out -d pcsample.dat -e cycles
Os=unix
Running rep to create addresses.resolved
rep a.out
You did remember to include a ./, didn't you?
Your output file will be addresses.resolved
SPEC benchmark 102.swim
Time passes… press Control-C
Generating top-level report for cycles
ipreduce -input_file pcsample.dat -output_file cycles.rpt -event cycles -pthresh 1
forrtl: info: Fortran error message number is 69.
forrtl: warning: Could not open message catalog: for_msg.cat.
forrtl: info: Check environment variable NLSPATH and protection of /usr/lib/nls/msg/en_US.ISO8859-1/for_msg.cat.
forrtl: error (69): Message not found
can't open cycles.rpt at ../harness.pl line 245.
% pwd
/users01/john/swim/e4
% ./a.out <swim.in >swim.out &
[1] 3552
% rep -pid 3552
Your output file will be addresses.resolved
Resolving addresses
Writing output file addresses.resolved
% kill %1
% cd ../e5
% ./a.out < swim.in >swim.out &
[1] 3561
% rep -pid 3561
Your output file will be addresses.resolved
Resolving addresses
Writing output file addresses.resolved
% kill %1
Try the harness again…
% cd e4
% ../harness.pl -x a.out -d pcsample.dat -e cycles
Os=unix
Generating top-level report for cycles
ipreduce -input_file pcsample.dat -output_file cycles.rpt -event cycles
-pthresh 1
ipreduce -o cycles_pkcalc3__.rpd -d pc -event cycles -input_file pcsample.dat
-pc 12000D430:12000D8AF
dis -h -p pkcalc3__ a.out > pkcalc3__.dis_tmp
Annotating pkcalc3__
ipreduce -o cycles_pkcalc2__.rpd -d pc -event cycles -input_file pcsample.dat
-pc 12000C350:12000C8AF
dis -h -p pkcalc2__ a.out > pkcalc2__.dis_tmp
Annotating pkcalc2__
ipreduce -o cycles_pkcalc1__.rpd -d pc -event cycles -input_file pcsample.dat
-pc 12000B690:12000BAFF
dis -h -p pkcalc1__ a.out > pkcalc1__.dis_tmp
Annotating pkcalc1__
ipreduce -o cycles_spin_wait_join_barrier.rpd -d pc -event cycles -input_file
pcsample.dat -pc 1200B6088:1200B61CF
dis -h -p spin_wait_join_barrier a.out > spin_wait_join_barrier.dis_tmp
Annotating spin_wait_join_barrier
% ../harness.pl -x a.out -d pcsample.dat -e bcache_miss
Os=unix
Generating top-level report for bcache_miss
ipreduce -input_file pcsample.dat -output_file bcache_miss.rpt -event
bcache_miss -pthresh 1
ipreduce -o bcache_miss_pkcalc2__.rpd -d pc -event bcache_miss -input_file
pcsample.dat -pc 12000C350:12000C8AF
Annotating pkcalc2__
ipreduce -o bcache_miss_pkcalc3__.rpd -d pc -event bcache_miss -input_file
pcsample.dat -pc 12000D430:12000D8AF
Annotating pkcalc3__
ipreduce -o bcache_miss_pkcalc1__.rpd -d pc -event bcache_miss -input_file
pcsample.dat -pc 12000B690:12000BAFF
Annotating pkcalc1__
ipreduce -o bcache_miss_calc3_.rpd -d pc -event bcache_miss -input_file
pcsample.dat -pc 12000CDD0:12000D42F
dis -h -p calc3_ a.out > calc3_.dis_tmp
Annotating calc3_
% cd ../e5
% ../harness.pl -x a.out -d pcsample.dat -e cycles
Os=unix
Generating top-level report for cycles
ipreduce -input_file pcsample.dat -output_file cycles.rpt -event cycles
-pthresh 1
ipreduce -o cycles_pkcalc3__.rpd -d pc -event cycles -input_file pcsample.dat
-pc 12000C510:12000C93F
dis -h -p pkcalc3__ a.out > pkcalc3__.dis_tmp
Annotating pkcalc3__
ipreduce -o cycles_pkcalc2__.rpd -d pc -event cycles -input_file pcsample.dat
-pc 12000B790:12000B9DF
dis -h -p pkcalc2__ a.out > pkcalc2__.dis_tmp
Annotating pkcalc2__
ipreduce -o cycles_pkcalc1__.rpd -d pc -event cycles -input_file pcsample.dat
-pc 12000B040:12000B22F
dis -h -p pkcalc1__ a.out > pkcalc1__.dis_tmp
Annotating pkcalc1__
ipreduce -o cycles_spin_wait_join_barrier.rpd -d pc -event cycles -input_file
pcsample.dat -pc 1200B5118:1200B525F
dis -h -p spin_wait_join_barrier a.out > spin_wait_join_barrier.dis_tmp
Annotating spin_wait_join_barrier
% ../harness.pl -x a.out -d pcsample.dat -e bcache_miss
Os=unix
Generating top-level report for bcache_miss
ipreduce -input_file pcsample.dat -output_file bcache_miss.rpt -event
bcache_miss -pthresh 1
ipreduce -o bcache_miss_pkcalc2__.rpd -d pc -event bcache_miss -input_file
pcsample.dat -pc 12000B790:12000B9DF
Annotating pkcalc2__
ipreduce -o bcache_miss_pkcalc3__.rpd -d pc -event bcache_miss -input_file
pcsample.dat -pc 12000C510:12000C93F
Annotating pkcalc3__
ipreduce -o bcache_miss_pkcalc1__.rpd -d pc -event bcache_miss -input_file
pcsample.dat -pc 12000B040:12000B22F
Annotating pkcalc1__
ipreduce -o bcache_miss_calc3_.rpd -d pc -event bcache_miss -input_file
pcsample.dat -pc 12000BF00:12000C50F
dis -h -p calc3_ a.out > calc3_.dis_tmp
Annotating calc3_
Results of harness.pl
% ls e4 e5
e4:
a.out                         pkcalc1__.dis
addresses.resolved            pkcalc1__.source_bcache_miss
bcache_miss.hot_routines      pkcalc1__.source_cycles
bcache_miss.rpt               pkcalc2__.dib
bcache_miss_calc3_.rpd        pkcalc2__.dis
bcache_miss_pkcalc1__.rpd     pkcalc2__.source_bcache_miss
bcache_miss_pkcalc2__.rpd     pkcalc2__.source_cycles
bcache_miss_pkcalc3__.rpd     pkcalc3__.dib
calc3_.dib                    pkcalc3__.dis
calc3_.dis                    pkcalc3__.source_bcache_miss
calc3_.source_bcache_miss     pkcalc3__.source_cycles
cycles.hot_routines           swim.cmp.f
cycles.rpt                    swim.cmp.l
cycles_pkcalc1__.rpd          swim.f
cycles_pkcalc2__.rpd          swim.in
cycles_pkcalc3__.rpd          swim.out
pcsample.dat                  swim2.in
pkcalc1__.dib

e5:
a.out                         pkcalc1__.dis
addresses.resolved            pkcalc1__.source_bcache_miss
bcache_miss.hot_routines      pkcalc1__.source_cycles
bcache_miss.rpt               pkcalc2__.dib
bcache_miss_calc3_.rpd        pkcalc2__.dis
bcache_miss_pkcalc1__.rpd     pkcalc2__.source_bcache_miss
bcache_miss_pkcalc2__.rpd     pkcalc2__.source_cycles
bcache_miss_pkcalc3__.rpd     pkcalc3__.dib
calc3_.dib                    pkcalc3__.dis
calc3_.dis                    pkcalc3__.source_bcache_miss
calc3_.source_bcache_miss     pkcalc3__.source_cycles
cycles.hot_routines           swim.cmp.f
cycles.rpt                    swim.cmp.l
cycles_pkcalc1__.rpd          swim.f
cycles_pkcalc2__.rpd          swim.in
cycles_pkcalc3__.rpd          swim.out
pcsample.dat                  swim2.in
pkcalc1__.dib
Hot Routines Report, first clues from disassembly
% cat e4/cycles.hot_routines
Hot Routines for cycles -pthresh 1
Events % Routine Image Addr
434186 36 pkcalc3__ a.out 12000D430:12000D8AF
352761 29 pkcalc2__ a.out 12000C350:12000C8AF
267796 22 pkcalc1__ a.out 12000B690:12000BAFF
101029 8 spin_wait_join_barrier a.out 1200B6088:1200B61CF
% cat e5/cycles.hot_routines
Hot Routines for cycles -pthresh 1
Events % Routine Image Addr
504803 34 pkcalc3__ a.out 12000C510:12000C93F
422918 29 pkcalc2__ a.out 12000B790:12000B9DF
278563 19 pkcalc1__ a.out 12000B040:12000B22F
199509 14 spin_wait_join_barrier a.out 1200B5118:1200B525F
Notice that pkcalc2__ has 20% more cycles in the ev5 build (422918 samples vs. 352761).
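The per-routine comparison can be made for every row of the two reports at once. A minimal sketch: the report text is copied from the two hot_routines listings above, and the helper names are made up for illustration.

```python
EV4_REPORT = """\
434186 36 pkcalc3__ a.out 12000D430:12000D8AF
352761 29 pkcalc2__ a.out 12000C350:12000C8AF
267796 22 pkcalc1__ a.out 12000B690:12000BAFF
101029 8 spin_wait_join_barrier a.out 1200B6088:1200B61CF
"""

EV5_REPORT = """\
504803 34 pkcalc3__ a.out 12000C510:12000C93F
422918 29 pkcalc2__ a.out 12000B790:12000B9DF
278563 19 pkcalc1__ a.out 12000B040:12000B22F
199509 14 spin_wait_join_barrier a.out 1200B5118:1200B525F
"""

def parse_hot_routines(text):
    """Map routine name -> event count for rows of the form 'Events % Routine Image Addr'."""
    counts = {}
    for row in text.splitlines():
        fields = row.split()
        if len(fields) == 5 and fields[0].isdigit():
            counts[fields[2]] = int(fields[0])
    return counts

def percent_increase(base, other):
    """Percent change from base to other for every routine present in both."""
    return {name: 100.0 * (other[name] - base[name]) / base[name]
            for name in base if name in other}

increase = percent_increase(parse_hot_routines(EV4_REPORT), parse_hot_routines(EV5_REPORT))
for name, pct in sorted(increase.items(), key=lambda item: -item[1]):
    print(f"{name:24s} {pct:+6.1f}%")
```

The % column in each report is relative to that run's own total, so the raw event counts are what should be compared across builds.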
% cd e4
% wc -l pkcalc2__.dis
348 pkcalc2__.dis
% cd ../e5
% wc -l pkcalc2__.dis
152 pkcalc2__.dis
Note that there are many fewer instructions in the ev5 version. This is usually a bad sign, often indicating that a loop was not unrolled. (Rule of thumb: For integer programs, you care about the istream and want it to be small. For floating point programs, you care about the dstream and will spend lots of instructions to get better dstream flow.)
Source cycles report EV5
% cat pkcalc2__.source_cycles
cycles for pkcalc2__ by source line
printing lines with at least 4229.18 events
swim 983 18564
swim 984 102647
swim 985 93550
swim 986 25799
swim 987 61730
swim 988 52982
swim 989 21188
swim 990 38300
swim 991 7957
% head -993 swim.cmp.f | tail -12
DO J1=II1,II2
DO I1=1,M1
UNEW1(I1+1,J1) = UOLD1(I1+1,J1) + TDTS81 * (Z1(I1+1,J1+1) + Z1(
X I1+1,J1)) * (CV1(I1+1,J1+1) + CV1(I1,J1+1) + CV1(I1,J1) + CV1
X (I1+1,J1)) - TDTSDX1 * (H1(I1+1,J1) - H1(I1,J1))
VNEW1(I1,J1+1) = VOLD1(I1,J1+1) - TDTS81 * (Z1(I1+1,J1+1) + Z1(
X I1,J1+1)) * (CU1(I1+1,J1+1) + CU1(I1,J1+1) + CU1(I1,J1) + CU1
X (I1+1,J1)) - TDTSDY1 * (H1(I1,J1+1) - H1(I1,J1))
PNEW1(I1,J1) = POLD1(I1,J1) - TDTSDX1 * (CU1(I1+1,J1) - CU1(I1,
X J1)) - TDTSDY1 * (CV1(I1,J1+1) - CV1(I1,J1))
END DO
END DO
The action is all in one place - and KAP has not unrolled the loop
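"All in one place" can be quantified from the report above. The sketch below assumes the printing threshold is 1% of the routine's total (the -pthresh 1 value), which matches the 4229.18 figure in the report header; the variable names are illustrative only.

```python
ROUTINE_TOTAL = 422918   # cycles samples for pkcalc2__, from e5/cycles.hot_routines
PTHRESH = 1              # percent, as passed to ipreduce with -pthresh 1

# Source line -> samples, copied from pkcalc2__.source_cycles (EV5).
by_line = {
    983: 18564, 984: 102647, 985: 93550, 986: 25799, 987: 61730,
    988: 52982, 989: 21188, 990: 38300, 991: 7957,
}

threshold = ROUTINE_TOTAL * PTHRESH / 100
covered = sum(by_line.values())

print(f"printing threshold: {threshold:.2f} events")
print(f"lines 983-991 hold {covered} of {ROUTINE_TOTAL} samples "
      f"({covered / ROUTINE_TOTAL:.2%})")
```

Nine adjacent source lines, all inside the one loop nest shown above, account for essentially every sample in the routine.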
Source Cycles EV4
% cd ../e4
% !cat
cat pkcalc2__.source_cycles
cycles for pkcalc2__ by source line
printing lines with at least 3527.61 events
swim 1559 33234
swim 1560 10864
swim 1561 8004
swim 1565 13459
swim 1567 4733
swim 1568 5978
swim 1569 10359
swim 1570 4670
swim 1573 10287
swim 1577 8946
swim 1578 8405
swim 1579 13754
swim 1582 6124
swim 1589 4057
swim 1595 19931
swim 1596 25718
swim 1597 6779
swim 1598 9580
swim 1600 3598
swim 1603 3984
swim 1607 7356
swim 1608 9548
swim 1609 7021
swim 1610 11351
swim 1611 6496
swim 1612 8425
swim 1617 4728
swim 1625 3693
swim 1626 20320
swim 1627 11306
But with -tune=ev4, the action seems more spread out…
EV4 KAP Source Code
% head -1630 swim.cmp.f | tail -70
RR51 = Z1(I1+II21,J1+1) + Z1(I1+II21,J1)
RR131 = TDTS81 * RR111
RR141 = TDTS81 * RR121
RR41 = TDTS81 * RR51
RR151 = CV1(I1+1,J1+1) + CV1(I1,J1+1)
RR161 = CV1(I1+II11,J1+1) + CV1(I1+II31,J1+1)
RR81 = CV1(I1+II21,J1+1) + CV1(I1+II11,J1+1)
RR171 = RR151 + CV1(I1,J1)
RR181 = RR161 + CV1(I1+II31,J1)
RR71 = RR81 + CV1(I1+II11,J1)
RR191 = RR171 + CV1(I1+1,J1)
RR201 = RR181 + CV1(I1+II11,J1)
RR61 = RR71 + CV1(I1+II21,J1)
RR211 = RR131 * RR191
RR221 = RR141 * RR201
RR31 = RR41 * RR61
RR231 = UOLD1(I1+1,J1) + RR211
RR241 = UOLD1(I1+II11,J1) + RR221
RR29 = UOLD1(I1+II21,J1) + RR31
RR251 = H1(I1+1,J1) - H1(I1,J1)
RR261 = H1(I1+II11,J1) - H1(I1+II31,J1)
RR101 = H1(I1+II21,J1) - H1(I1+II11,J1)
RR271 = TDTSDX1 * RR251
RR281 = TDTSDX1 * RR261
RR91 = TDTSDX1 * RR101
UNEW1(I1+1,J1) = RR231 - RR271
UNEW1(I1+II11,J1) = RR241 - RR281
UNEW1(I1+II21,J1) = RR29 - RR91
RR111 = Z1(I1+1,J1+1) + Z1(I1,J1+1)
RR121 = Z1(I1+II11,J1+1) + Z1(I1+II31,J1+1)
RR51 = Z1(I1+II21,J1+1) + Z1(I1+II11,J1+1)
RR131 = TDTS81 * RR111
RR141 = TDTS81 * RR121
RR41 = TDTS81 * RR51
RR151 = CU1(I1+1,J1+1) + CU1(I1,J1+1)
RR161 = CU1(I1+II11,J1+1) + CU1(I1+II31,J1+1)
RR81 = CU1(I1+II21,J1+1) + CU1(I1+II11,J1+1)
RR171 = RR151 + CU1(I1,J1)
RR181 = RR161 + CU1(I1+II31,J1)
RR71 = RR81 + CU1(I1+II11,J1)
RR191 = RR171 + CU1(I1+1,J1)
RR201 = RR181 + CU1(I1+II11,J1)
RR61 = RR71 + CU1(I1+II21,J1)
RR211 = RR131 * RR191
RR221 = RR141 * RR201
RR31 = RR41 * RR61
RR231 = VOLD1(I1,J1+1) - RR211
RR241 = VOLD1(I1+II31,J1+1) - RR221
RR29 = VOLD1(I1+II11,J1+1) - RR31
RR251 = H1(I1,J1+1) - H1(I1,J1)
RR261 = H1(I1+II31,J1+1) - H1(I1+II31,J1)
RR101 = H1(I1+II11,J1+1) - H1(I1+II11,J1)
RR271 = TDTSDY1 * RR251
RR281 = TDTSDY1 * RR261
RR91 = TDTSDY1 * RR101
VNEW1(I1,J1+1) = RR231 - RR271
VNEW1(I1+II31,J1+1) = RR241 - RR281
VNEW1(I1+II11,J1+1) = RR29 - RR91
RR131 = CU1(I1+1,J1) - CU1(I1,J1)
RR141 = CU1(I1+II11,J1) - CU1(I1+II31,J1)
RR41 = CU1(I1+II21,J1) - CU1(I1+II11,J1)
RR211 = TDTSDX1 * RR131
RR221 = TDTSDX1 * RR141
RR31 = TDTSDX1 * RR41
RR231 = POLD1(I1,J1) - RR211
RR241 = POLD1(I1+II31,J1) - RR221
RR29 = POLD1(I1+II11,J1) - RR31
RR191 = CV1(I1,J1+1) - CV1(I1,J1)
RR201 = CV1(I1+II31,J1+1) - CV1(I1+II31,J1)
RR61 = CV1(I1+II11,J1+1) - CV1(I1+II11,J1)
KAP has unrolled by 3x - note the 3 stores:
UNEW1(I1+1,J1) = RR231 - RR271
UNEW1(I1+II11,J1) = RR241 - RR281
UNEW1(I1+II21,J1) = RR29 - RR91
The constants are defined (in this routine) as II11=2 and II21=3:
% grep -n II11 swim.cmp.f | grep PARA
1064: PARAMETER (II11 = 2)
1459: PARAMETER (II11 = 2)
2038: PARAMETER (II11 = 3)
% grep -n II21 swim.cmp.f | grep PARA
1055: PARAMETER (II21 = 3)
1448: PARAMETER (II21 = 3)
2150: PARAMETER (II21 = 4)
% grep -n II31 swim.cmp.f | grep PARA
1072: PARAMETER (II31 = 1)
1467: PARAMETER (II31 = 1)
2054: PARAMETER (II31 = 1)
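Substituting the PARAMETER values into the subscripts makes the unrolling easier to see. In this sketch the values are the ones grep reports for the PARAMETER statements nearest above the unrolled loop (lines 1448 to 1467), and resolve is a hypothetical helper.

```python
import re

# PARAMETER values that apply to the loop around lines 1558-1636 of swim.cmp.f.
CONSTANTS = {"II11": 2, "II21": 3, "II31": 1}

def resolve(expr):
    """Replace each IIxx name with its numeric value, e.g. I1+II11 -> I1+2."""
    return re.sub(r"II\d+", lambda m: str(CONSTANTS[m.group(0)]), expr)

for target in ("UNEW1(I1+1,J1)", "UNEW1(I1+II11,J1)", "UNEW1(I1+II21,J1)",
               "VNEW1(I1,J1+1)", "VNEW1(I1+II31,J1+1)", "VNEW1(I1+II11,J1+1)"):
    print(f"{target:24s} -> {resolve(target)}")
```

Each array is stored at three consecutive values of the first subscript (I1+1, I1+2, I1+3 for UNEW1), that is, three iterations of the original inner loop per pass.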
EV5 disassembly
% cd ../e5
% cat pkcalc2__.dis
Cycle=cycles
BMis=bcache_miss
pkcalc2__:
file line addr Instr Cycle BMis
...
swim 985 12000b8d8 lds $f13,2052(r18) 7251 4
swim 985 12000b8dc lds $f12,2056(r18) 66 1
swim 988 12000b8e0 lds $f15,2052(r6) 3914
swim 988 12000b8e4 lds $f14,2056(r6) 60
swim 984 12000b8e8 lds $f11,0(r19) 3687
swim 983 12000b8ec addl r16,1,r16
swim 984 12000b8f0 lds $f18,-2052(r19) 3590
swim 983 12000b8f4 cmple r16,r1,r24
swim 983 12000b8f8 lda r23,4(r23) 3477
swim 983 12000b8fc lda r26,4(r26)
swim 983 12000b900 lda r19,4(r19) 3765
swim 983 12000b904 lda r22,4(r22)
swim 983 12000b908 lda r8,4(r8) 3634
swim 983 12000b90c lda r6,4(r6)
swim 983 12000b910 lda r18,4(r18) 3965
swim 983 12000b914 lda r17,4(r17)
swim 987 12000b918 lds $f19,-8(r19) 7330 32
swim 987 12000b91c lda r21,4(r21)
swim 985 12000b920 adds $f12,$f13,$f12 68442 1205
swim 983 12000b924 lda r27,4(r27)
swim 988 12000b928 lds $f16,-4(r6) 209
swim 988 12000b92c adds $f14,$f15,$f14 25107 277
swim 988 12000b930 lds $f17,0(r6) 3983
swim 984 12000b934 adds $f11,$f18,$f18 29849 725
swim 985 12000b938 lds $f22,0(r18)
swim 986 12000b93c lds $f24,-4(r22) 3566 2
swim 987 12000b940 adds $f11,$f19,$f11 13322 22
swim 984 12000b944 muls $f0,$f18,$f18 6427
swim 986 12000b948 lds $f25,0(r22)
swim 988 12000b94c adds $f14,$f16,$f14 12049 88
swim 985 12000b950 lds $f20,-4(r18) 3455
swim 990 12000b954 subs $f17,$f16,$f21 5842 38
swim 987 12000b958 muls $f0,$f11,$f11 542
swim 989 12000b95c lds $f26,2048(r22) 133 2
swim 988 12000b960 adds $f14,$f17,$f14 7660
swim 990 12000b964 lds $f23,-4(r8) 112
swim 990 12000b968 muls $f1,$f21,$f21 6373 1
swim 984 12000b96c lds $f27,-4(r23) 4842 72
swim 986 12000b970 subs $f25,$f24,$f25 21816 171
swim 987 12000b974 muls $f11,$f14,$f11 1165
swim 987 12000b978 lds $f28,-4(r26) 209
swim 985 12000b97c adds $f12,$f20,$f12 7693 19
swim 991 12000b980 subs $f13,$f20,$f13 3743
swim 989 12000b984 subs $f26,$f24,$f24 17159 244
swim 986 12000b988 muls $f1,$f25,$f25 411
swim 985 12000b98c adds $f12,$f22,$f12 6643 2
swim 991 12000b990 muls $f10,$f13,$f13 4196
swim 990 12000b994 subs $f23,$f21,$f21 18804 510
swim 989 12000b998 muls $f10,$f24,$f24 3869
swim 984 12000b99c muls $f18,$f12,$f12 6513 1
swim 987 12000b9a0 subs $f28,$f11,$f11 30419 743
swim 990 12000b9a4 subs $f21,$f13,$f13 3740
swim 984 12000b9a8 adds $f27,$f12,$f12 29793 621
swim 987 12000b9ac subs $f11,$f24,$f11 4043
swim 990 12000b9b0 sts $f13,-4(r17) 3368 2
swim 984 12000b9b4 subs $f12,$f25,$f12 6793
swim 987 12000b9b8 sts $f11,-4(r27) 4462 1
swim 984 12000b9bc sts $f12,-4(r21) 11045 3
swim 983 12000b9c0 bne r24,12000b8d8 3686
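The per-instruction counts in this listing add up to the per-source-line counts reported earlier. A sketch using the seven instructions attributed to line 985, copied from the listing above; it assumes that when a row carries numbers after the operands, the first one is the Cycle column. The function name is made up for illustration.

```python
from collections import defaultdict

# Rows for source line 985, copied from the EV5 pkcalc2__.dis listing.
DIS_LINES = """\
swim 985 12000b8d8 lds $f13,2052(r18) 7251 4
swim 985 12000b8dc lds $f12,2056(r18) 66 1
swim 985 12000b920 adds $f12,$f13,$f12 68442 1205
swim 985 12000b938 lds $f22,0(r18)
swim 985 12000b950 lds $f20,-4(r18) 3455
swim 985 12000b97c adds $f12,$f20,$f12 7693 19
swim 985 12000b98c adds $f12,$f22,$f12 6643 2
"""

def cycles_by_source_line(text):
    """Sum the Cycle column per source line number."""
    totals = defaultdict(int)
    for row in text.splitlines():
        fields = row.split()          # file, line, addr, opcode, operands, [cycle], [bmis]
        if len(fields) < 5:
            continue
        # Assumption: the first number after the operands is the cycles count.
        cycles = int(fields[5]) if len(fields) > 5 else 0
        totals[int(fields[1])] += cycles
    return dict(totals)

print(cycles_by_source_line(DIS_LINES))
```

The total for line 985 agrees with the 93550 shown for that line in pkcalc2__.source_cycles, and most of it sits on a single adds instruction that consumes the results of the two loads at the top of the loop.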
EV4 Disassembly
% cd ../e4
% cat pkcalc2__.dis
Cycle=cycles
BMi=bcache_miss
pkcalc2__:
file line addr Instr Cycle BMi
...
swim 1565 12000c4b4 lds $f12,2052(r24) 3682 2
swim 1566 12000c4b8 lds $f22,2060(r24) 69
swim 1565 12000c4bc lds $f11,2056(r24) 1199
swim 1567 12000c4c0 lds $f24,2064(r24) 1250
swim 1559 12000c4c4 lds $f14,0(r26) 106 1
swim 1561 12000c4c8 lds $f20,8(r26) 1319 3
swim 1559 12000c4cc lds $f15,-2052(r26) 1135
swim 1568 12000c4d0 lds $f16,0(r24) 4086 22
swim 1558 12000c4d4 addl r17,3,r17
swim 1570 12000c4d8 lds $f25,8(r24) 3359 29
swim 1558 12000c4dc cmple r17,r16,r19
swim 1569 12000c4e0 lds $f19,4(r24) 9068 133
swim 1558 12000c4e4 lda r21,12(r21)
swim 1558 12000c4e8 lda r22,12(r22) 1287 1
swim 1558 12000c4ec lda r26,12(r26)
swim 1573 12000c4f0 lds $f30,12(r24) 9076 161
swim 1558 12000c4f4 lda r27,12(r27)
swim 1595 12000c4f8 lds $f27,2052(r6) 3332 6
swim 1558 12000c4fc lda r8,12(r8)
swim 1596 12000c500 lds $f3,2060(r6) 22240 396
swim 1565 12000c504 adds $f11,$f12,$f13 8578 121
swim 1558 12000c508 lda r23,12(r23)
swim 1595 12000c50c lds $f26,2056(r6) 9675 185
swim 1597 12000c510 lds $f4,2064(r6) 3208 14
swim 1559 12000c514 adds $f14,$f15,$f15 31975 694
swim 1558 12000c518 lda r6,12(r6)
swim 1560 12000c51c lds $f17,-8(r26) 1823 3
swim 1560 12000c520 lds $f18,-2060(r26) 1909
swim 1568 12000c524 adds $f13,$f16,$f13 1892 31
swim 1558 12000c528 lda r0,12(r0)
swim 1567 12000c52c adds $f24,$f22,$f24 3483 76
swim 1561 12000c530 lds $f21,-2056(r26) 1420
swim 1562 12000c534 muls $f0,$f15,$f15
swim 1566 12000c538 adds $f22,$f11,$f23
swim 1558 12000c53c lda r24,12(r24)
swim 1628 12000c540 subs $f12,$f16,$f12 1185
swim 1558 12000c544 lda r18,12(r18)
swim 1589 12000c548 lds $f28,-16(r26) 2198 1
swim 1571 12000c54c adds $f13,$f19,$f13 2176 2
swim 1570 12000c550 adds $f24,$f25,$f24 1311 1
swim 1598 12000c554 lds $f29,-12(r6) 2437 49
swim 1595 12000c558 adds $f26,$f27,$f27 6924 42
swim 1569 12000c55c adds $f23,$f19,$f23 1291 1
swim 1600 12000c560 lds $f5,-4(r6) 2259 11
swim 1574 12000c564 muls $f15,$f13,$f13
swim 1596 12000c568 adds $f3,$f26,$f26 3478 93
swim 1599 12000c56c lds $f15,-8(r6) 911
swim 1579 12000c570 lds $f6,-4(r21) 3204 24
swim 1560 12000c574 adds $f17,$f18,$f18 7132 55
swim 1631 12000c578 muls $f10,$f12,$f12
swim 1561 12000c57c adds $f20,$f21,$f21 5265 79
swim 1573 12000c580 adds $f24,$f30,$f24 1211
swim 1577 12000c584 lds $f30,-12(r21)
swim 1589 12000c588 adds $f14,$f28,$f28 1859 1
swim 1563 12000c58c muls $f0,$f18,$f18 508
swim 1598 12000c590 adds $f27,$f29,$f27 7143 89
swim 1564 12000c594 muls $f0,$f21,$f21 154
swim 1572 12000c598 adds $f23,$f25,$f23 1213
swim 1592 12000c59c muls $f0,$f28,$f28 1176
swim 1590 12000c5a0 adds $f17,$f14,$f14 1206
swim 1601 12000c5a4 adds $f27,$f15,$f27 3280 1
swim 1576 12000c5a8 muls $f21,$f24,$f21
swim 1580 12000c5ac lds $f24,-8(r8) 21
swim 1591 12000c5b0 adds $f20,$f17,$f17 1200 1
swim 1603 12000c5b4 lds $f20,0(r6)
swim 1575 12000c5b8 muls $f18,$f23,$f18
swim 1578 12000c5bc lds $f23,-8(r21)
swim 1597 12000c5c0 adds $f4,$f3,$f3 3571 76
swim 1582 12000c5c4 lds $f4,0(r8)
swim 1577 12000c5c8 adds $f30,$f13,$f13 8931 167
swim 1581 12000c5cc lds $f30,-4(r8)
swim 1604 12000c5d0 muls $f28,$f27,$f27 1296
swim 1580 12000c5d4 lds $f28,-12(r8)
swim 1599 12000c5d8 adds $f26,$f15,$f26
swim 1593 12000c5dc muls $f0,$f14,$f14 1210
swim 1600 12000c5e0 adds $f3,$f5,$f3 1339
swim 1594 12000c5e4 muls $f0,$f17,$f17
swim 1579 12000c5e8 adds $f6,$f21,$f6 10550 262
swim 1607 12000c5ec lds $f21,-12(r27) 64 1
swim 1602 12000c5f0 adds $f26,$f5,$f26 1297
swim 1619 12000c5f4 subs $f15,$f29,$f29 1178
swim 1603 12000c5f8 adds $f3,$f20,$f3 3984 39
swim 1578 12000c5fc adds $f23,$f18,$f18 8405 28
swim 1582 12000c600 subs $f4,$f30,$f4 6124 21
swim 1605 12000c604 muls $f14,$f26,$f14
swim 1610 12000c608 lds $f26,2040(r8) 38
swim 1580 12000c60c subs $f24,$f28,$f23 1928 8
swim 1606 12000c610 muls $f17,$f3,$f3 1251
swim 1608 12000c614 lds $f17,-8(r27) 26
swim 1581 12000c618 subs $f30,$f24,$f2
swim 1620 12000c61c subs $f5,$f15,$f15 1221
swim 1585 12000c620 muls $f1,$f4,$f4 1275
swim 1607 12000c624 subs $f21,$f27,$f21 7191 156
swim 1626 12000c628 lds $f27,-8(r0) 109 2
swim 1583 12000c62c muls $f1,$f23,$f23 1189
swim 1584 12000c630 muls $f1,$f2,$f2 1247
swim 1621 12000c634 subs $f20,$f5,$f5
swim 1622 12000c638 muls $f1,$f29,$f29 1147
swim 1629 12000c63c subs $f11,$f19,$f11
swim 1588 12000c640 subs $f6,$f4,$f4 1236
swim 1612 12000c644 lds $f6,2048(r8) 101 5
swim 1623 12000c648 muls $f1,$f15,$f15
swim 1586 12000c64c subs $f13,$f23,$f13 1154
swim 1611 12000c650 lds $f23,2044(r8) 1202
swim 1587 12000c654 subs $f18,$f2,$f2
swim 1609 12000c658 lds $f18,-4(r27) 38
swim 1624 12000c65c muls $f1,$f5,$f5
swim 1610 12000c660 subs $f26,$f28,$f26 11299 267
swim 1625 12000c664 lds $f28,-12(r0)
swim 1632 12000c668 muls $f10,$f11,$f11
swim 1608 12000c66c subs $f17,$f14,$f14 9522 165
swim 1627 12000c670 lds $f17,-4(r0) 1212
swim 1630 12000c674 subs $f22,$f25,$f22
swim 1626 12000c678 subs $f27,$f15,$f15 20211 424
swim 1586 12000c67c sts $f13,-12(r22) 101 2
swim 1613 12000c680 muls $f10,$f26,$f26 1169
swim 1587 12000c684 sts $f2,-8(r22) 53 3
swim 1588 12000c688 sts $f4,-4(r22) 1316 2
swim 1612 12000c68c subs $f6,$f30,$f6 8324 185
swim 1633 12000c690 muls $f10,$f22,$f22 1147
swim 1635 12000c694 subs $f15,$f11,$f11 1089
swim 1611 12000c698 subs $f23,$f24,$f23 5294
swim 1609 12000c69c subs $f18,$f3,$f3 6983 139
swim 1615 12000c6a0 muls $f10,$f6,$f6 1253
swim 1625 12000c6a4 subs $f28,$f29,$f28 3665
swim 1627 12000c6a8 subs $f17,$f5,$f5 10094 173
swim 1635 12000c6ac sts $f11,-8(r18) 88
swim 1614 12000c6b0 muls $f10,$f23,$f23 1352
swim 1616 12000c6b4 subs $f21,$f26,$f21
swim 1618 12000c6b8 subs $f3,$f6,$f3 1542
swim 1634 12000c6bc subs $f28,$f12,$f12 1149
swim 1617 12000c6c0 subs $f14,$f23,$f14 1996 1
swim 1616 12000c6c4 sts $f21,-12(r23) 124 4
swim 1636 12000c6c8 subs $f5,$f22,$f5 1188
swim 1618 12000c6cc sts $f3,-4(r23) 524 1
swim 1634 12000c6d0 sts $f12,-12(r18) 1248
swim 1617 12000c6d4 sts $f14,-8(r23) 2732 19
swim 1636 12000c6d8 sts $f5,-4(r18) 1810 14
swim 1558 12000c6dc bne r19,12000c4b4
Count the stores
% cd e4
% grep sts pkcalc2__.dis | wc -l
12
% cd ../e5
% grep sts pkcalc2__.dis | wc -l
3
The EV4 version is unrolled by 3x (plus a cleanup loop)
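The store counts follow from the shape of the code: three arrays are stored per iteration (UNEW1, VNEW1, PNEW1), so a 3x unrolled loop plus a cleanup loop has 3*3 + 3 = 12 store instructions, while the rolled loop has 3. The sketch below shows the same transformation on a much simpler one-array loop; it is an analogy, not KAP's output, and both function names are hypothetical.

```python
def axpy_unrolled_by_3(a, x, y):
    """Compute out[i] = a*x[i] + y[i] with the loop body replicated three times.

    Returns the result and the number of stores actually executed.
    """
    n = len(x)
    out = [0.0] * n
    stores = 0
    main_end = n - n % 3
    for i in range(0, main_end, 3):          # unrolled main loop: 3 stores per pass
        out[i] = a * x[i] + y[i]
        out[i + 1] = a * x[i + 1] + y[i + 1]
        out[i + 2] = a * x[i + 2] + y[i + 2]
        stores += 3
    for i in range(main_end, n):             # cleanup loop for the 0-2 leftover iterations
        out[i] = a * x[i] + y[i]
        stores += 1
    return out, stores

def static_store_count(arrays_stored, unroll, cleanup=True):
    """Store instructions in the code: one per array per body copy, plus the cleanup loop."""
    return arrays_stored * unroll + (arrays_stored if cleanup else 0)

print(static_store_count(3, 3))                  # EV4 build: unrolled by 3 plus cleanup
print(static_store_count(3, 1, cleanup=False))   # EV5 build: rolled loop
```

The unrolled version executes the same number of stores at run time; what changes is the number of independent operations the scheduler can interleave within each pass of the loop.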
How about other events?
% cd ..
% cat e4/bcache_miss.hot_routines
Hot Routines for bcache_miss -pthresh 1
Events % Routine Image Addr
4536 39 pkcalc2__ a.out 12000C350:12000C8AF
3513 30 pkcalc3__ a.out 12000D430:12000D8AF
2883 25 pkcalc1__ a.out 12000B690:12000BAFF
155 1 calc3_ a.out 12000CDD0:12000D42F
% cat e5/bcache_miss.hot_routines
Hot Routines for bcache_miss -pthresh 1
Events % Routine Image Addr
4787 37 pkcalc2__ a.out 12000B790:12000B9DF
4640 35 pkcalc3__ a.out 12000C510:12000C93F
2778 21 pkcalc1__ a.out 12000B040:12000B22F
215 2 calc3_ a.out 12000BF00:12000C50F
% cat >run_sca.csh
set verbose
unlimit
set notify
setenv PARALLEL 4
iprobe -quiet -method sample scache_miss &
time ./a.out <swim.in >swim.out
kill %iprobe
unset verbose
% csh
% cd e4
% source ../run_sca.csh
unlimit
set notify
setenv PARALLEL 4
iprobe -quiet -method sample scache_miss &
[1] 3701
time ./a.out < swim.in > swim.out
Start of sampling
192.78u 0.23s 0:48 398% 0+153k 0+6io 0pf+0w
kill %iprobe
unset verbose
[1] Terminated iprobe -quiet -method sample scache_miss
% cd ../e5
% source ../run_sca.csh
unlimit
set notify
setenv PARALLEL 4
iprobe -quiet -method sample scache_miss &
[1] 3702
time ./a.out < swim.in > swim.out
Start of sampling
227.14u 0.18s 0:56 399% 0+153k 0+6io 0pf+0w
kill %iprobe
unset verbose
[1] Terminated iprobe -quiet -method sample scache_miss
The Dreaded Signal 1 Floorboard
% ../harness.pl -d pcsample.dat -e scache_miss
User defined signal 1
% iprobe
Node name : gemosf.zko.dec.com
OS : OSF1 T4.0-738.5
CPU count : 4
Model : Unknown
Memory size : 381 MB
Counter count : 3
cycles : Low frequency
Current time : Tue Jun 17 07:17:04 1997
Start time: : immediate
Duration : 0 (until user interrupts)
Interval : 1
Method : count
Measured Modes : all modes
Measured Data : pid ctr ps pc
Buffer_count : 3
Buffer_size : 8192
time cpu freq event # events evts/sec
07:17:04 0 2^16 cycles 399966208 399966208
07:17:04 1 2^16 cycles 399966208 399966208
07:17:04 2 2^16 cycles 399966208 399966208
07:17:04 3 2^16 cycles 399769600 399769600
07:17:05 0 2^16 cycles 399572992 399572992
07:17:05 1 2^16 cycles 399507456 399507456
07:17:05 2 2^16 cycles 399507456 399507456
07:17:05 3 2^16 cycles 399572992 399572992
07:17:06 0 2^16 cycles 2818048 2818048
07:17:06 1 2^16 cycles 2883584 2883584
07:17:06 2 2^16 cycles 2883584 2883584
07:17:06 3 2^16 cycles 2818048 2818048
Total event count:
07:17:06 0 2^16 cycles 802357248
07:17:06 1 2^16 cycles 802357248
07:17:06 2 2^16 cycles 802357248
07:17:06 3 2^16 cycles 802160640
Reduce the scache miss events
% ../harness.pl -d pcsample.dat -e scache_miss
Os=unix
Generating top-level report for scache_miss
ipreduce -input_file pcsample.dat -output_file scache_miss.rpt -event
scache_miss -pthresh 1
ipreduce -o scache_miss_pkcalc3__.rpd -d pc -event scache_miss -input_file
pcsample.dat -pc 12000C510:12000C93F
Annotating pkcalc3__
ipreduce -o scache_miss_pkcalc2__.rpd -d pc -event scache_miss -input_file
pcsample.dat -pc 12000B790:12000B9DF
Annotating pkcalc2__
ipreduce -o scache_miss_pkcalc1__.rpd -d pc -event scache_miss -input_file
pcsample.dat -pc 12000B040:12000B22F
Annotating pkcalc1__
ipreduce -o scache_miss_hardclock.rpd -d pc -event scache_miss -input_file
pcsample.dat -pc FFFFFC0000253490:FFFFFC00002540EF
dis -h -p hardclock /vmunix > hardclock.dis_tmp
Annotating hardclock
ipreduce -o scache_miss_spin_wait_join_barrier.rpd -d pc -event scache_miss
-input_file pcsample.dat -pc 1200B5118:1200B525F
dis -h -p spin_wait_join_barrier a.out > spin_wait_join_barrier.dis_tmp
Annotating spin_wait_join_barrier
% cd ../e4
% ../harness.pl -d pcsample.dat -e scache_miss
Os=unix
Generating top-level report for scache_miss
ipreduce -input_file pcsample.dat -output_file scache_miss.rpt -event
scache_miss -pthresh 1
ipreduce -o scache_miss_pkcalc3__.rpd -d pc -event scache_miss -input_file
pcsample.dat -pc 12000D430:12000D8AF
Annotating pkcalc3__
ipreduce -o scache_miss_pkcalc2__.rpd -d pc -event scache_miss -input_file
pcsample.dat -pc 12000C350:12000C8AF
Annotating pkcalc2__
ipreduce -o scache_miss_pkcalc1__.rpd -d pc -event scache_miss -input_file
pcsample.dat -pc 12000B690:12000BAFF
Annotating pkcalc1__
ipreduce -o scache_miss_hardclock.rpd -d pc -event scache_miss -input_file
pcsample.dat -pc FFFFFC0000253490:FFFFFC00002540EF
dis -h -p hardclock /vmunix > hardclock.dis_tmp
Annotating hardclock
ipreduce -o scache_miss_spin_wait_join_barrier.rpd -d pc -event scache_miss
-input_file pcsample.dat -pc 1200B6088:1200B61CF
dis -h -p spin_wait_join_barrier a.out > spin_wait_join_barrier.dis_tmp
Annotating spin_wait_join_barrier
% cd ..
% cat e4/scache_miss.hot_routines
Hot Routines for scache_miss -pthresh 1
Events   %  Routine                 Image    Addr
 17489  36  pkcalc3__               a.out    12000D430:12000D8AF
 16835  35  pkcalc2__               a.out    12000C350:12000C8AF
 11260  23  pkcalc1__               a.out    12000B690:12000BAFF
   668   1  hardclock               /vmunix  FFFFFC0000253490:FFFFFC00002540EF
   546   1  spin_wait_join_barrier  a.out    1200B6088:1200B61CF
% !!:s/e4/e5/
cat e5/scache_miss.hot_routines
Hot Routines for scache_miss -pthresh 1
Events   %  Routine                 Image    Addr
 19570  37  pkcalc3__               a.out    12000C510:12000C93F
 18734  35  pkcalc2__               a.out    12000B790:12000B9DF
 11570  22  pkcalc1__               a.out    12000B040:12000B22F
   697   1  hardclock               /vmunix  FFFFFC0000253490:FFFFFC00002540EF
   549   1  spin_wait_join_barrier  a.out    1200B5118:1200B525F
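
The two hot-routine reports are easier to compare side by side. The sketch below is not part of the harness: it assumes each data row has five whitespace-separated fields (events, percent, routine, image, address range), and `parse_hot_routines` is a hypothetical helper written for this illustration.

```python
# Data rows copied from the two reports above.
E4_REPORT = """\
17489 36 pkcalc3__ a.out 12000D430:12000D8AF
16835 35 pkcalc2__ a.out 12000C350:12000C8AF
11260 23 pkcalc1__ a.out 12000B690:12000BAFF
668 1 hardclock /vmunix FFFFFC0000253490:FFFFFC00002540EF
546 1 spin_wait_join_barrier a.out 1200B6088:1200B61CF
"""

E5_REPORT = """\
19570 37 pkcalc3__ a.out 12000C510:12000C93F
18734 35 pkcalc2__ a.out 12000B790:12000B9DF
11570 22 pkcalc1__ a.out 12000B040:12000B22F
697 1 hardclock /vmunix FFFFFC0000253490:FFFFFC00002540EF
549 1 spin_wait_join_barrier a.out 1200B5118:1200B525F
"""

def parse_hot_routines(text):
    """Return {routine: sampled event count}, skipping header lines."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5 or not fields[0].isdigit():
            continue  # title, column header, or blank line
        events, _pct, routine, _image, _addr = fields
        rows[routine] = int(events)
    return rows

e4 = parse_hot_routines(E4_REPORT)
e5 = parse_hot_routines(E5_REPORT)

# Percent change in sampled scache_miss events, e5 build relative to e4 build.
change = {r: 100.0 * (e5[r] - e4[r]) / e4[r] for r in e4 if r in e5}
for routine, pct in sorted(change.items(), key=lambda kv: -kv[1]):
    print(f"{routine:24s} {e4[routine]:6d} -> {e5[routine]:6d}  {pct:+.1f}%")
```

Because split() ignores the amount of whitespace, the same parser works whether or not the report columns are aligned. On these samples, pkcalc3__ and pkcalc2__ show the largest relative increase in scache_miss events under -tune=ev5.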
% cd e5
% cat pkcalc2__.dis
Cycle=cycles
BMis=bcache_miss
SMis=scache_miss
pkcalc2__:
file line addr Instr Cycle BMis SMis
...
swim 985 12000b8d8 lds $f13,2052(r18) 7251 4 32
swim 985 12000b8dc lds $f12,2056(r18) 66 1 8
swim 988 12000b8e0 lds $f15,2052(r6) 3914 17
swim 988 12000b8e4 lds $f14,2056(r6) 60 2
swim 984 12000b8e8 lds $f11,0(r19) 3687 215
swim 983 12000b8ec addl r16,1,r16
swim 984 12000b8f0 lds $f18,-2052(r19) 3590 14
swim 983 12000b8f4 cmple r16,r1,r24
swim 983 12000b8f8 lda r23,4(r23) 3477 12
swim 983 12000b8fc lda r26,4(r26)
swim 983 12000b900 lda r19,4(r19) 3765 951
swim 983 12000b904 lda r22,4(r22)
swim 983 12000b908 lda r8,4(r8) 3634 120
swim 983 12000b90c lda r6,4(r6)
swim 983 12000b910 lda r18,4(r18) 3965 163
swim 983 12000b914 lda r17,4(r17)
swim 987 12000b918 lds $f19,-8(r19) 7330 32 926
swim 987 12000b91c lda r21,4(r21)
swim 985 12000b920 adds $f12,$f13,$f12 68442 1205 2109
swim 983 12000b924 lda r27,4(r27)
script done on Tue Jun 17 07:19:18 1997
Summary
Comments on IPROBE may be sent to goddard@zko.dec.com
Comments on the data reduction harness may be sent to henning@zko.dec.com
Digital-internal users can find IPROBE via the CSD/PG web site, http://sdtad.zko.dec.com/pub/csdpg, and the data reduction harness via http://tlg-www.zko.dec.com/~henning
External users should contact Greg Tarsa, tarsa@zko.dec.com, to inquire about licensing.