This chapter describes SpeedShop commands for exploring memory usage and paging, and for printing data files generated by SpeedShop tools. It contains the following sections:
The thrash command lets you explore system paging behavior by allocating a region of virtual memory and then accessing that memory either randomly or sequentially.
    thrash [args]
args    One or more of the following flags:
Once the memory is allocated, thrash prints a message on stdout saying how much memory it is using and then proceeds to thrash over it. Here's an example:
    fraser 82% thrash -m 4
    thrashing randomly: 4.00 MB (= 0x00400000 = 4194304 bytes = 1024 pages)
            10000 iterations
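The figures in that message are one quantity expressed in different units. A minimal Python sketch of the conversion, assuming 1 MB means 2^20 bytes and using the page size implied by the example (4194304 bytes / 1024 pages = 4096 bytes). The function name is illustrative and is not part of SpeedShop:

```python
def describe_region(megabytes, page_size=4096):
    """Return (bytes, hex string, pages) for a region of the given size.

    page_size=4096 is inferred from the example output above; other
    systems may use a different page size.
    """
    size_bytes = int(megabytes * 1024 * 1024)
    pages = size_bytes // page_size
    return size_bytes, f"0x{size_bytes:08x}", pages


size_bytes, size_hex, pages = describe_region(4)
print(f"{size_bytes} bytes = {size_hex} = {pages} pages")
```

On a system with a different page size, pass that size as page_size to get the matching page count.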
You can use thrash in conjunction with ssusage and squeeze to determine the approximate available working memory on a system, as described in the section "Calculating the Working Set of a Program".
The squeeze command allows you to specify an amount of virtual memory to lock down into real memory, thus making it unavailable to other processes. This command can be used only by the superuser.
    squeeze [flag] amount
flag      One of the following flags. If no flag is specified, the default is megabytes.
amount    The amount of memory to be locked.
squeeze performs the following operations:
Locks down the amount of virtual memory you supply as an argument to the command.
Prints a message to stdout that provides information on how much memory has been locked, and how much working memory is available.
Sleeps indefinitely, or until interrupted by SIGINT or SIGTERM. At that time, it frees up the memory and exits with an exit message.
Wait until after the exit message is printed before doing any experiments.
Here's an example:
    fraser 1# squeeze 4
    squeeze: leaving 60.00 MB ( = 0x03c01000 = 62918656 ) available memory;
            pinned 4.00 MB ( = 0x00400000 = 4194304 ) at address 0x1000e000;
            from 64.00 MB ( = 0x04001000 = 67112960 ) installed memory.
Use Ctrl-C to exit squeeze. The following message is printed:
    squeeze exiting
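The three sizes in the squeeze message are related by simple subtraction: available memory is installed memory minus the pinned amount. A small Python check using the values from the example above (the variable names are illustrative only):

```python
installed = 0x04001000   # 67112960 bytes, reported as 64.00 MB
pinned = 0x00400000      # 4194304 bytes, reported as 4.00 MB

available = installed - pinned
print(f"available = 0x{available:08x} = {available} bytes "
      f"= {available / 2**20:.2f} MB")
```

Note that the megabyte figures are rounded to two decimal places; the installed size in this example is 4096 bytes more than an exact 64 MB.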
You can use the thrash, squeeze, and ssusage commands together to determine the approximate working set of a program as follows. For all practical purposes, the working set of your program is the size of memory allocated.
The process involves three steps. First you determine the working set of the kernel and other applications:
1. Choose a machine that has a large amount of physical memory (enough to allow your target application to run without any paging other than at start-up).
2. Make sure that the machine is running a minimal number of applications that will remain fairly consistent for the duration of these steps.
3. Run thrash with ssusage to determine the working set of the kernel and any other applications you have running.
   In this example, the thrash command uses 4 MB of memory:
       ssusage thrash -m 4
   When the thrash command completes, ssusage prints the resource usage of thrash; the value labelled majf gives the number of major page faults (that is, the number of faults that required a physical read). When you run on a machine with a large amount of physical memory, this value is the number of faults needed to start the program, which is the minimum number for any run. For more information on ssusage, see Chapter 5, "Collecting Data on Machine Resource Usage."
4. As superuser in a separate window, run the squeeze command to lock down an amount of memory.
5. Rerun thrash under ssusage:
       ssusage thrash -m 4
6. Repeat steps 4 and 5, increasing the amount of memory for squeeze, until the majf number begins to rise.
   The amount of working memory available reported by squeeze at the point at which page faults begin to rise for thrash tells you the combined working set of thrash (approximately 4 MB), the kernel, and any other applications you have running.
7. Deduct the 4 MB that thrash uses from the amount of working memory reported by squeeze at the point the page faults began to rise.
   This computation gives you the approximate working set of the kernel and any other applications that are running on the machine. You'll need this number when you reach the next steps.
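The search described above amounts to finding the first run in which majf exceeds its baseline, then subtracting the memory that thrash itself uses. A minimal Python sketch, assuming you have noted by hand the available memory reported by squeeze and the majf value reported by ssusage for each run. The measurements below are made-up placeholders, not real results, and the function names are hypothetical:

```python
def first_rise(runs, baseline_majf):
    """Return the available memory (MB) of the first run whose majf
    exceeds the baseline, or None if majf never rises.

    runs: list of (available_mb, majf), ordered from most to least
    available memory (that is, increasing squeeze amounts).
    """
    for available_mb, majf in runs:
        if majf > baseline_majf:
            return available_mb
    return None


def background_working_set(runs, baseline_majf, thrash_mb=4):
    """Estimate the working set of the kernel plus other applications."""
    knee = first_rise(runs, baseline_majf)
    if knee is None:
        raise ValueError("majf never rose; squeeze more memory and retry")
    return knee - thrash_mb


# Placeholder measurements: (available MB reported by squeeze, majf).
runs = [(60, 7), (40, 7), (30, 7), (24, 9), (20, 35)]
print(background_working_set(runs, baseline_majf=7))
```

The strict comparison against the baseline is a simplification. If majf varies slightly from run to run, a small tolerance above the baseline may give a steadier estimate.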
8. Determine the working set of the program you're interested in. Make sure the applications that the machine is running remain consistent with the setup from step 2.
9. Run ssusage with your program to ensure that the machine has the amount of memory your program needs.
       ssusage prog_name
   When your program exits, ssusage prints the application's resource usage; the majf field gives the number of major page faults. When run on a machine with a large amount of physical memory, this value is the number of faults needed to start the program, which is the minimum number for any run.
10. Switch to superuser.
11. Run squeeze to lock down an amount of memory. The following example locks down 15 megabytes of memory:
       squeeze 15
12. Rerun your program under ssusage, as in step 9.
13. Repeat steps 11 and 12, increasing the amount of memory for squeeze, until the majf number begins to rise.
14. Deduct the amount squeezed at the point at which the application begins to page fault from the total amount of physical memory in the system.
    This computation determines the combined working set of your program, the kernel, and any other applications you have running.
15. Deduct the amount of working memory calculated in step 7 from the combined working set calculated in step 14.
    This computation determines the approximate working set of your program.
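The final deductions reduce to one chain of subtractions: the program's working set is the total physical memory, minus the amount squeezed when faults begin to rise, minus the figure for the kernel and other applications found earlier with thrash. A Python sketch with placeholder numbers (illustrative values, not measurements; the function name is hypothetical):

```python
def program_working_set(total_mb, squeezed_mb, background_mb):
    """Approximate working set of the program, in MB.

    total_mb:      physical memory installed in the machine
    squeezed_mb:   amount locked by squeeze when majf began to rise
    background_mb: working set of the kernel and other applications
    """
    combined = total_mb - squeezed_mb   # program + kernel + other apps
    return combined - background_mb


# Placeholder values for illustration only.
print(program_working_set(total_mb=64, squeezed_mb=32, background_mb=20))
```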
All the performance data for a single process is in one file. The file begins with a prologue and continues with a mixture of performance data, sample records, and control records.
The ssdump command can be used for printing performance data files. It provides a formatted ASCII dump of one or more performance experiment data files. This command is most likely to be useful in verifying performance data that does not seem accurate when reported through prof.
    ssdump [options] {datafile1 ... datafileN} ...
options    Zero or more of the following print options:
The file is written as a string of "beads," each of which is a record with:
a 32-bit type
a 32-bit byte count
a body whose length is given by the byte count, rounded up to a double-word boundary
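A minimal Python sketch of walking such a record stream. It assumes big-endian fields and an 8-byte double word; neither detail is stated here, so treat both as assumptions. The type codes in the demo data are made up:

```python
import struct

# Assumed layout: big-endian 32-bit type followed by 32-bit byte count.
HEADER = struct.Struct(">II")


def round_up(n, boundary=8):
    """Round n up to the next multiple of boundary (assumed 8-byte double word)."""
    return (n + boundary - 1) // boundary * boundary


def walk_beads(data):
    """Yield (offset, bead_type, body) for each bead in a bytes object."""
    offset = 0
    while offset + HEADER.size <= len(data):
        bead_type, count = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        if start + count > len(data):
            raise ValueError(f"truncated bead at offset {offset}")
        yield offset, bead_type, data[start:start + count]
        # The next bead starts after the body, padded to the boundary.
        offset = start + round_up(count)


# Synthetic two-bead stream; the type codes 1 and 2 are made up.
demo = (HEADER.pack(1, 6) + b"pcsamp" + b"\0\0"
        + HEADER.pack(2, 8) + b"generic\0")
for offset, bead_type, body in walk_beads(demo):
    print(offset, bead_type, body)
```

The walker only splits the stream into records. Interpreting a body depends on the bead type, which this sketch does not attempt.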
The file prologue consists of these beads:
file-identifier bead, which acts as a magic number, indicating that the file is a SpeedShop data file
machine and executable name
hardware inventory describing the machine
machine page size
O/S revision, date, and checksum information about the executable
target name (the target is the executable after instrumentation)
arguments with which the target was invoked
instrumentation performed
types of performance data that are to be recorded in the remainder of the file
The following example calls ssdump on performance data for a pcsamp experiment:
    ssdump generic.pcsamp.m847
Below is some partial output from ssdump. The format has been adjusted slightly to meet presentation needs.
    Printing experiment record file "generic.pcsamp.m847" (2688 bytes),
            last written on Tue 15 Apr 1997 15:27:02
    SpeedShop File Preface 1, offset 0 = 0x00000000 (size 32)
        file type 1 (SSRUN); version 4
        process control flags: 0xd
            _SPEEDSHOP_TRACE_FORK=True
            _SPEEDSHOP_TRACE_FORK_TO_EXEC=False
            _SPEEDSHOP_TRACE_SPROC=True
            _SPEEDSHOP_TRACE_EXEC=True
            _SPEEDSHOP_TRACE_SYSTEM=False
        ancestor exp file name:
        created: Tue 15 Apr 1997 15:26:10.719
    Hardware Inventory 2, offset 40 = 0x00000028 (size 280)
        hardware inventory: 17 items
            class 1, type 1, contrlr 100, unit 255, state 12
            class 1, type 3, contrlr 0, unit 0, state 8192
            class 1, type 2, contrlr 0, unit 0, state 8208
            class 4, type 8, contrlr 0, unit 0, state 2
            class 5, type 5, contrlr 0, unit 0, state 1
            class 3, type 3, contrlr 0, unit 0, state 16384
            class 3, type 4, contrlr 0, unit 0, state 16384
            class 3, type 9, contrlr 0, unit 0, state 64
            class 3, type 1, contrlr 0, unit 0, state 67108864
            class 12, type 3, contrlr 0, unit 0, state 16
            class 8, type 7, contrlr 17, unit 0, state 16777472
            class 10, type 3, contrlr 0, unit 0, state 16400
            class 8, type 0, contrlr 0, unit 0, state 1
            class 2, type 1, contrlr 0, unit 13, state 2
            class 2, type 2, contrlr 0, unit 2, state 0
            class 2, type 2, contrlr 0, unit 1, state 0
            class 7, type 14, contrlr 0, unit 0, state 0
    Experiment name 3, offset 328 = 0x00000148 (size 8)
        pcsamp
    Experiment marching orders 4, offset 344 = 0x00000158 (size 16)
        pc,2,10000,0:cu
    Capture module symbol 5, offset 368 = 0x00000170 (size 16)
        pc,2,10000,0
    Capture module symbol 6, offset 392 = 0x00000188 (size 8)
        cu
    Executable file 7, offset 408 = 0x00000198 (size 8)
        generic
    Target file 8, offset 424 = 0x000001a8 (size 8)
        generic
    Target arguments 9, offset 440 = 0x000001b8 (size 32)
        Time: Tue 15 Apr 1997 15:26:10.719, process pid = 847
        arguments: ""
    Target begin 10, offset 480 = 0x000001e0 (size 40)
        process # -1, pid = 847, event # 0
        event type = 0,0 at time = Tue 15 Apr 1997 15:26:10.719
    Program Object List 11, offset 528 = 0x00000210 (size 312)
        process # -1, pid = 847, event # 0, -- 5 DSOs
        Program Object 0, Named `generic'
            Link Time Address: 0x0000000010000000
            Run Time Address: 0x0000000010000000
            Size: 0x0000000000007000 (28672)
            Base Pointer: 0x0000000000000000
        Program Object 1, Named `/usr/lib32/libss.so'
            Link Time Address: 0x0000000009e50000
            Run Time Address: 0x0000000009e50000
            Size: 0x0000000000002000 (8192)
            Base Pointer: 0x0000000000000000
        Program Object 2, Named `/usr/lib32/libssrt.so'
            Link Time Address: 0x0000000009da0000
            Run Time Address: 0x0000000009da0000
            Size: 0x000000000008b000 (569344)
            Base Pointer: 0x0000000000000000
        Program Object 3, Named `/usr/lib32/libm.so'
            Link Time Address: 0x000000000f840000
            Run Time Address: 0x000000000f840000
            Size: 0x0000000000028000 (163840)
            Base Pointer: 0x0000000000000000
        Program Object 4, Named `/usr/lib32/libc.so.1'
            Link Time Address: 0x000000000fa00000
            Run Time Address: 0x000000000fa00000
            Size: 0x0000000000108000 (1081344)
            Base Pointer: 0x0000000000000000
    Target DSO open 12, offset 848 = 0x00000350 (size 56)
        process # -1, pid = 847, event # 0
        at time = Tue 15 Apr 1997 15:27:00.716
        fname = ./dlslave.so
    Program Object List 13, offset 912 = 0x00000390 (size 360)
        process # -1, pid = 847, event # 0, -- 6 DSOs
        Program Object 0, Named `generic'
            Link Time Address: 0x0000000010000000
            Run Time Address: 0x0000000010000000
            Size: 0x0000000000007000 (28672)
            Base Pointer: 0x0000000000000000
        Program Object 1, Named `/usr/lib32/libss.so'
            Link Time Address: 0x0000000009e50000
            Run Time Address: 0x0000000009e50000
            Size: 0x0000000000002000 (8192)
            Base Pointer: 0x0000000000000000
        Program Object 2, Named `/usr/lib32/libssrt.so'
            Link Time Address: 0x0000000009da0000
            Run Time Address: 0x0000000009da0000
            Size: 0x000000000008b000 (569344)
            Base Pointer: 0x0000000000000000
        Program Object 3, Named `/usr/lib32/libm.so'
            Link Time Address: 0x000000000f840000
            Run Time Address: 0x000000000f840000
            Size: 0x0000000000028000 (163840)
            Base Pointer: 0x0000000000000000
        Program Object 4, Named `/usr/lib32/libc.so.1'
            Link Time Address: 0x000000000fa00000
            Run Time Address: 0x000000000fa00000
            Size: 0x0000000000108000 (1081344)
            Base Pointer: 0x0000000000000000
        Program Object 5, Named `./dlslave.so'
            Link Time Address: 0x000000005ffe0000
            Run Time Address: 0x000000005ffe0000
            Size: 0x0000000000001000 (4096)
            Base Pointer: 0x0000000000000000
    Sample event trigger 14, offset 1280 = 0x00000500 (size 40)
        process # -1, trap index # -1
        at time = Tue 15 Apr 1997 15:27:01.989, #-1
    Compressed PC sampling array (16-bit) 15, offset 1328 = 0x00000530 (size 320)
        compressed short array, dso index = 0, array size = 7168, 156 compressed
    Compressed PC sampling array (16-bit) 16, offset 1656 = 0x00000678 (size 16)
        compressed short array, dso index = 1, array size = 2048, 4 compressed
    Compressed PC sampling array (16-bit) 17, offset 1680 = 0x00000690 (size 40)
        compressed short array, dso index = 2, array size = 142336, 16 compressed
    Compressed PC sampling array (16-bit) 18, offset 1728 = 0x000006c0 (size 16)
        compressed short array, dso index = 3, array size = 40960, 4 compressed
    Compressed PC sampling array (16-bit) 19, offset 1752 = 0x000006d8 (size 64)
        compressed short array, dso index = 4, array size = 270336, 28 compressed
    Compressed PC sampling array (16-bit) 20, offset 1824 = 0x00000720 (size 48)
        compressed short array, dso index = 5, array size = 1024, 20 compressed
    PC sampling array (16-bit) 21, offset 1880 = 0x00000758 (size 16)
        short array, dso index = -1, array size = 1
    Resource usage 22, offset 1904 = 0x00000770 (size 680)
    Sample data end marker 23, offset 2592 = 0x00000a20 (size 40)
    Target termination 24, offset 2640 = 0x00000a50 (size 40)
        process # -1, pid = 847, event # 0
        event type = 0,0 (normal termination, exit status 0)
        at time = Tue 15 Apr 1997 15:27:02.231
    ** End-of-File 25, offset 2688 = 0x00000a80 (size 0)
    **** End of experiment record file "generic.pcsamp.m847"
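The offsets and sizes in this output can be cross-checked against the record layout described earlier. The Python check below uses the (record number, offset, size) values copied from the output above. It assumes each record carries an 8-byte header (two 32-bit fields) and that the printed size is the body length already rounded to a multiple of 8; both points are inferred from the numbers, not stated by ssdump. The second check, that each PC sampling array has one entry per 4 bytes of its DSO, is likewise an observation from this one file:

```python
HEADER_BYTES = 8  # inferred: 32-bit type + 32-bit byte count

# (record number, offset, size) copied from the ssdump output above.
records = [
    (1, 0, 32), (2, 40, 280), (3, 328, 8), (4, 344, 16), (5, 368, 16),
    (6, 392, 8), (7, 408, 8), (8, 424, 8), (9, 440, 32), (10, 480, 40),
    (11, 528, 312), (12, 848, 56), (13, 912, 360), (14, 1280, 40),
    (15, 1328, 320), (16, 1656, 16), (17, 1680, 40), (18, 1728, 16),
    (19, 1752, 64), (20, 1824, 48), (21, 1880, 16), (22, 1904, 680),
    (23, 2592, 40), (24, 2640, 40), (25, 2688, 0),
]

# Each record should end exactly where the next one begins.
for (n, offset, size), (_, next_offset, _) in zip(records, records[1:]):
    assert offset + HEADER_BYTES + size == next_offset, n

# (DSO size in bytes, PC sampling array size) for DSO indexes 0 to 5.
dsos = [(28672, 7168), (8192, 2048), (569344, 142336),
        (163840, 40960), (1081344, 270336), (4096, 1024)]
assert all(size // 4 == entries for size, entries in dsos)

print("offsets and array sizes are consistent")
```

The End-of-File record sits at offset 2688, which equals the file size reported on the first line of the output.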
The fbdump command prints the compiler feedback files generated by running prof -feedback. For more information on using compiler feedback files, see the cord or cc reference pages.
    fbdump options filename
options     Zero or more of the options described in Table 9-1.
filename    The feedback filename. This file has a .fb extension.