KCRASH PART 1 :
A User's Guide to Dump Analysis

Appendices

SECTION 2

Pre-analysis setup:

Prior to starting a kcrash session, there are a few more steps that should be taken.

Memory Image and Symbol File:

Before running kcrash, check the size of the crash file using ls -l. The size should be equal to the system memory size. If the system has 128 megabytes of memory and the crash dump file is only 64 megabytes, the image is incomplete, and you may or may not be able to obtain information from it. To analyze a dump, you must also have a symbol file. Usually, the sym.mmdd file is the best choice, but you may use '/unix' or '/stand/unix' from the same system that the dump was taken from. If, for some reason, you do not have a symbol file from that system, try using the '/unix' on your own system. You may at least be able to read the console buffer, but you will most likely be unable to use the dump for a complete analysis.

Macro Load Script:

The kcrash macros provide critical command extensions to get useful information from kcrash. There are quite a few macro files and loading them individually requires a lot of time and keystrokes. To simplify things, create a script in /crash/macros to load all of your favorite macros. Then, you can issue one command in kcrash and it will load all the macros. For example, create a file called /crash/macros/loadmacs with the contents:

< /crash/macros/trace.k
< /crash/macros/stat.k
< /crash/macros/stream.k
< /crash/macros/info.k
< /crash/macros/proc.k
< /crash/macros/user.k
< /crash/macros/vm.k
< /crash/macros/vnode.k

This list contains all the files necessary to run the examples discussed in this section. You may want to create a file that contains all of the '.k' files in /crash/macros.
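One way to build such a file is to generate it from a directory listing. The sketch below assumes that every line of the load script has the form "< path", as in the loadmacs example above, and that the load order of the macro files does not matter. The function name and the sample file names are illustrative only.

```python
def build_loadmacs(macro_files):
    """Return loadmacs text: one '< file' line per .k macro file."""
    lines = ["< " + name for name in sorted(macro_files) if name.endswith(".k")]
    return "\n".join(lines) + "\n"

# Illustrative input; on a real system the list could come from
# glob.glob("/crash/macros/*.k").
sample = [
    "/crash/macros/trace.k",
    "/crash/macros/stat.k",
    "/crash/macros/README",   # skipped: not a .k file
]
print(build_loadmacs(sample), end="")
```

The returned text can be written to /crash/macros/loadmacs and loaded from the kcrash prompt in the usual way.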

Command, Macro and Image Revision:

Use the kcrash executable and macros from either 1.3.1 or the same operating system version that the crash dump was taken on. The 1.2 kcrash and macros work on 1.2 crash dumps; 1.3 kcrash may be used on 1.2 and 1.3 crash dumps; 1.3.1 kcrash may be used on 1.3.1, 1.3 and 1.2 crash dumps.

Get the best kcrash:

See Appendix C for the latest kcrash macros.

Starting the Analysis:

First, it is easiest to analyze a crash dump if you have either an X terminal or some other X Window package. By using multiple windows you can save useful information like the process and putbuf listings to a file in one window and utilize that information in the kcrash window.

The command line for kcrash is:

kcrash [-w] [-k] < crash image > < symbol file >

The -w option allows writing to the crash image file and is not generally useful. The -k option allows use of kcrash on a live system, where the crash image becomes '/dev/mem'. This option can be very useful when using kcrash for performance analysis. Unlike kdb, kcrash does not interrupt system operation. On a running system the normal kcrash command line would be:

# kcrash -k /dev/mem /unix

On a live system, all information is dynamic so not everything you can do on a dump image will work.

To run kcrash on a dump file, use the command:

# kcrash crash.mmdd sym.mmdd

Always specify a symbol file on the kcrash command line. The default symbol file is /stand/unix, and unpacking symbols from the bfs file system takes a lot of time.

For more information on the kcrash command, see Appendix B.

Setting kcrash Environment:

Once you have started kcrash using:

# kcrash crash.mmdd sym.mmdd

You will see :

UNISYS Crash Dump Analyzer Release 4 Version 4.00
(C) Copyright 1988,1994 UNISYS. All rights reserved.
Time: a b d HH:MM:SS yyyy
dumpfile = crash.mmdd
namelist = sym.mmdd
Physbase: 00000000 Dirbase: 00002000
Warning: registers are inaccessible; use "rg"
S>

At this point, you should set up the kcrash environment using:

S> rg panicregs
S> < /crash/macros/loadmacs
S> more 14

The rg command is used to initialize the registers for kcrash. Without a valid set of register values loaded, some of the kcrash commands cannot provide correct information. During panic processing, the registers for each processor are saved at panicregs[CPUID]. Each register set is size 0x6C, so cpu 0 registers are at panicregs, cpu 1 at panicregs + 0x6C, and so on.
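The layout of the saved register sets described above amounts to simple array indexing. The sketch below only computes the byte offset from the panicregs symbol for a given CPU; the function name is illustrative, and how an offset address is passed to rg is not covered here.

```python
REGSET_SIZE = 0x6C  # size of one saved register set, as stated in the text

def panicregs_offset(cpu_id):
    """Byte offset of a CPU's saved registers from the panicregs symbol."""
    if cpu_id < 0:
        raise ValueError("cpu_id must be non-negative")
    return cpu_id * REGSET_SIZE

for cpu in range(3):
    print("cpu %d: panicregs + 0x%X" % (cpu, panicregs_offset(cpu)))
```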

The next step loads the macros you chose to put in your loadmacs script.

The more command turns on pagination. A value of 14 (hexadecimal) enables a 20 line screen, and a value of 30 enables a 48 line screen. To turn paging off, use 'more 0'.

Some macros generate displays that can go on for thousands of lines, notably "queues" and "streams". Usually, your intr character (^C) can interrupt these long displays. Interrupt is not effective while output is paused for pagination; it only works while the macro is actively displaying information. Sometimes it may take multiple attempts to stop the macro. If 'intr' does not work, you will have to resort to using the kill character (^|, typed with the pipe key). If you must use kill, kcrash will be aborted and your screen may be left in raw mode. If so, enter the command 'stty sane' to restore normal operation.

Basic Macros:

Dump information

First, let's determine the release that the dump image is from. We can do this by using the dumpinfo macro which is defined in /crash/macros/stat.k.

  S> dumpinfo
  release: 4.0 
  version: 2
  take:    18.7-7
  System name: Unisys
  Release 4 Version 4.00
  Physbase: 00000000
  dumpfile = crash.0911
  namelist = sym.0911
  panicstr:       Unrecoverable NMI in %s mode at EIP=0x%x

  D17AA800 05229 05202 01086 00102010 - - 0          ONPROC et486.224 
  D17A3C00 05196 04803 00000 00102010 - - 1          ONPROC in.rlogind

The take display shows 18.7-bb, where bb is the kernel build count. Each time an idbuild is performed, the build count is incremented by one. On a newly installed system, the build count is 1 because the kernel is rebuilt once during installation. This system has had 6 idbuilds performed since it was installed. During the development cycles, internal revision levels are identified as "takes". Take numbers are not directly related to release revision levels, but each release does have a characteristic take number.

 UNIX Revision     Take number
     1.2              11.1
     1.3              18.7
     1.3.1             4.0

If the system panicked, the panic string will be shown in the panicstr. Any processes that were "ONPROC" at the time of the system dump will be shown.

This dump shows that it was from a 1.3 release and 6 idbuilds have been performed since its initial installation.
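The reading of the take field can be sketched as follows. The release table is copied from the one above; the function name is illustrative, and the build count is assumed to be a decimal number.

```python
# Take number to UNIX revision, from the table above.
TAKE_TO_RELEASE = {"11.1": "1.2", "18.7": "1.3", "4.0": "1.3.1"}

def parse_take(take):
    """Split a take string such as '18.7-7' into (take number, build count)."""
    number, _, build = take.partition("-")
    return number, int(build)

number, build = parse_take("18.7-7")
release = TAKE_TO_RELEASE.get(number, "unknown")
# The build count starts at 1 on a newly installed system.
print("release %s, %d idbuild(s) since installation" % (release, build - 1))
```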

Viewing the console buffer:

Next, let's look at the console buffer for any status messages or error messages. We can do this by using the putbuf macro:

  S>putbuf
  Putbuf Data
  Total Real Memory        = 99872768 bytes
  Total Available Memory   = 94576640 bytes

The putbuf shows the memory status which gives us another way of checking the size of the crash.mmdd file and a way to verify how much memory the system is recognizing.

The next messages show the cpu and bus type, and copyright and boot information like the system does during boot:

  Pentium(TM) cpu 0, EISA bus

  UNIX System Laboratories, Inc. (USL)
  UNIX(TM) System V Release 4 Multi-Processor Version 2 Take 7.7-9

  Copyright (c) 1984, 1986, 1987, 1988, 1989, 1990 AT&T
  Copyright (c) 1990, 1991, 1992 UNIX System Laboratories, Inc.
  Copyright (c) 1987, 1988, 1989, 1990, 1991, 1992, 1993 Unisys Corp.
  Copyright (c) 1987, 1988 Microsoft Corp.
  Copyright (c) 1991 Intel Corp.
  All Rights Reserved

  adhost ctlrno=0 slotno=2 intvec=11 scsi-id=7 found
  aic70  ctlrno=0 slotno=9 intvec=10 scsi-id=7 found
  UNB EISA Ethernet unit 0 initialized

Note that this is a good way to find out how many processors are recognized by the system.

  Found 2 processor(s)

  CBTL=ControllerNo:BusNo:TargetID:LUN
  FBS =Fix Block Size
  VBS =Variable Block Size

  ucsd0 on aic70 CBTL=0:0:0:0 is 500 MB "UNISYS  U0531 ST3600N   8374" DISK
  ucsd1 on aic70 CBTL=0:0:1:0 is 1033 MB "UNISYS  U0805 M2694ES-51931F" DISK
  ucst0 on aic70 CBTL=0:0:3:0 is "ARCHIVE VIPER 150  21247-014"  FBS TAPE
  ucst1 on aic70 CBTL=0:0:4:0 is "UNISYS  M1017UD-4MM     316H"  FBS TAPE
  ucsd63 on aic70 CBTL=0:0:5:0 is 0 MB "UNISYS  CD-ROM          0005" RODISK
  ucsd2 on adhost CBTL=0:0:0:0 is 2040 MB "UNISYS  U1545 ST12550N  2810" DISK
  ucsd3 on adhost CBTL=0:0:1:0 is 2040 MB "UNISYS  U1545 ST12550N  2810" DISK
  ucsd4 on adhost CBTL=0:0:2:0 is 2040 MB "UNISYS  U1545 ST12550N  2810" DISK
  ucsd5 on adhost CBTL=0:0:3:0 is 2040 MB "UNISYS  U1545 ST12550N  2810" DISK
  ucsd6 on adhost CBTL=0:0:4:0 is 2040 MB "UNISYS  U1545 ST12550N  2810" DISK
  ucsd7 on adhost CBTL=0:0:5:0 is 2040 MB "UNISYS  U1545 ST12550N  2810" DISK
  ucsd8 on adhost CBTL=0:0:6:0 is 2040 MB "UNISYS  U1545 ST12550N  2810" DISK

After all the boot messages, you should see why the system panicked.

  NOTICE: ufs alloc: /data/db4 over 90% full.
          Only super user can allocate more space.

The above message indicates a potential cause of the problem. By examining the next message, we can determine that this is a known problem. It is fixed by installing patch 16241954 and then running fsck on the ufs filesystems with errors, in this case /data/db4.

  PANIC: free: freeing free block,dev = 0x000000C1, block = 3132, fs = /data/db4

Usually, when the system panics, the putbuf information will show which processor panicked and how many pages it is trying to dump.

  PANIC: Processor 00000001 panicked
  Trying to dump 24383 Pages
  ........................................
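The page count in the panic message can be cross-checked against the memory size reported at the top of the same putbuf listing. The sketch below assumes a page size of 4096 bytes; the names are illustrative.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

def pages_to_bytes(pages):
    """Convert a page count to bytes."""
    return pages * PAGE_SIZE

total_real_memory = 99872768   # "Total Real Memory" from putbuf
pages_dumped = 24383           # "Trying to dump ... Pages" from putbuf

dump_bytes = pages_to_bytes(pages_dumped)
print(dump_bytes)
print("matches real memory:", dump_bytes == total_real_memory)
```

The same figure is the size to expect from ls -l on a complete crash file for this system.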

Let's look at another putbuf example. This example shows that the putbuf buffer is circular and, if messages are scrolling on the console, putbuf may be overwritten by them.

  S> putbuf
  Putbuf Data
  t=014E2DA5

  NOTICE: (NMI): EISANMI detected on SID hand=00000002 lbolt=014E2DA5

  NOTICE: (NMI): EISANMI detected on SID hand=00000002 lbolt=014E2DA5

  NOTICE: (NMI): EISANMI detected on SID hand=00000002 lbolt=014E2DA5

  NOTICE: (NMI): EISANMI detected on SID hand=00000002 lbolt=014E2DA5

... several of the above messages deleted.

  PANIC: (NMI): Unrecoverable NMI

  Trying to dump 131071 Pages
  00004 0000000B 0000000A 0000674D 0000000
  NOTICE: (NMI): EISANMI detected on SID hand=00000002 lbolt=014E2DA5
  0 0000000A 00000002 00000000
  NFS write failed for server hu6000: RPC: Timed out
  NFS write error 145 on host hu6000 fh 00000004 0000000B 0000000A 0000674D 0000000
  0 0000000A 00000002 00000000

  WARNING: nfs_s>
  NOTICE: (NMI): EISANMI detected on SID hand=00000002 lbolt=014E2DA5

  NOTICE: (NMI): Processor 0x00000002 didnot come to sync point
  trategy: biod daemon not running, SSYS process going to sleep

  NOTICE: (NMI): EISANMI detected on SID hand=00000002 lbol

In this case, the putbuf buffer was overwritten by many NOTICE messages, and we cannot determine much from the information it was able to save.

In the next case, we see the same problem with scrolling WARNINGs as we did with the above NOTICE messages, but putbuf was able to record some other useful information:

  S>putbuf
  Putbuf Data
  ut table overflow

  WARNING: Timeout table overflow

  WARNING: Timeout table overflow

  WARNING: Timeout table overflow

.... several more WARNING messages deleted

  WARNING: Timeout table overflow
  PANIC:
  cr0 0x8000001B     cr2  0x6F633A64     cr3 0x00002000     tlb  0xFFFFF801
  ss  0x00000008     uesp 0xD021B2BC     efl 0x00010093     ipl  0x00000008
  cs  0x00000158     eip  0xD010AEFB     err 0xD0210000     trap 0x0000000E
  eax 0x6F633A64     ecx  0xD1145240     edx 0xD1528C80     ebx  0xD02167E0
  esp 0xE0004D20     ebp  0xE0004D3C     esi 0xD117D94C     edi  0xD1528C80
  ds  0x00000160     es   0x00000160     fs  0x00000000     gs   0x00000000

  PANIC: Kernel mode trap. Type 0x0000000E
  Trying to dump 15760 Pages
  ......................................RNING: Timeout table overflow

  WARNING: Timeout table overflow

... more of the "overflow" messages deleted

  WARNING: ldtermrsrv: out of blocks

  WARNING: ldtermrsrv: out of blocks

  WARNING: Timeout table overflow

... more of the "overflow" messages deleted

However, we also see the warning "ldtermrsrv: out of blocks" as well as the entire panic message. These are very helpful in our analysis. According to the Error Messages Guide, the ldtermrsrv message means that "A request for STREAMS by the terminal line discipline failed because memory was overcommitted". It recommends checking the values of the kernel parameters STRTHRESH and MAXDMAPAGE. STRTHRESH is usually set to between 1/8th and 1/4th of system memory. MAXDMAPAGE should be set to 0 (zero). We will now look at other macros that help us verify whether or not these values are incorrect.

Looking at streams and memory usage:

To verify if we are running out of streams memory, we can use the 'strstat' macro. The output looks like this:

  S> strstat
  [D0213118] [D022D5C4]
            inuse   (Strinfo)  total    max      fail
  stream    00000149(00000149) 000005FE 00000152 00000000
  queue     00000744(00000744) 000023E6 00000776 00000000
  msgblock  000009C4(00000A63) 00116A75 00000A63 00000000
  mdbblock  00001734(00001738) 008C8CB5 00001738 00000000
  linkblk   00000017(00000017) 00000017 00000017 00000000
  strevent  00000008(0000000A) 0000000C 0000000A 00000000
  Total byte count = 009A67A0
  strthresh = 00792200

This shows that the total byte count of streams waiting for processing is around 10M (0x009A67A0) and that STRTHRESH is less than 8M (0x00792200). So, it may be that STRTHRESH does need to be increased. But, there are no failures noted. It may be necessary to further analyze the streams that need processing. That type of analysis is discussed in KCRASH PART 2 of this document.
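The comparison above is easier to follow once the two hexadecimal values are converted to decimal. The sketch below is a small helper for that; the function name is illustrative, and the two input lines are copied from the strstat output.

```python
def parse_hex_field(line):
    """Return the hexadecimal value after '=' in a 'name = value' line."""
    return int(line.split("=")[1].strip(), 16)

total = parse_hex_field("Total byte count = 009A67A0")
thresh = parse_hex_field("strthresh = 00792200")

print("total     = %d bytes" % total)
print("strthresh = %d bytes" % thresh)
if total > thresh:
    print("total exceeds strthresh by %d bytes" % (total - thresh))
```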

To see if MAXDMAPAGE is set to 0, we use the 'tune' macro.

  S> tune
  t_gpgslo       0000002A
  t_gpgshi       00000000
  t_ageinterval  00000000
  t_fsflushr     00000001
  t_minarmem     00000034
  t_minasmem     00000023
  t_dmalimit     00001000
  t_flckrec      0000012C
  t_minakmem     00000010

MAXDMAPAGE is denoted by the value of t_dmalimit. If the dump you are analyzing is from a system other than a U6000/500 Model 50 or U6000/500 Model 80, a t_dmalimit of 0x1000 means that MAXDMAPAGE is set to 4096, which was the default prior to 1.2. In 1.2 and above, the default is 0 (zero).

If you have a U6000/500 Model 50 or Model 80 in 1.3 or 1.3.1, the tune macro will display :

 S> tune
  t_gpgslo       00000019
  t_gpgshi       00000000
  t_ageinterval  00000000
  t_fsflushr     00000001
  t_minarmem     00000023
  t_minasmem     00000023
  t_dmalimit     00040000
  t_flckrec      0000012C
  t_minakmem     00000010

tune.t_dmalimit is expressed relative to Physbase. Since Physbase starts at 1GB (0x40000000) on the Model 50/80, a value of 40000 is actually 0 (in clicks).
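The arithmetic behind this can be sketched as follows, assuming one click is a 4096-byte (0x1000) page. The function name is illustrative.

```python
CLICK_SIZE = 0x1000  # assumed bytes per click (one 4096-byte page)

def dmalimit_clicks(t_dmalimit, physbase):
    """t_dmalimit expressed in clicks above the physical base address."""
    return t_dmalimit - physbase // CLICK_SIZE

# Model 50/80: physical base at 1GB, t_dmalimit 0x40000
print(dmalimit_clicks(0x40000, 0x40000000))
# Other systems: physical base at 0, t_dmalimit 0x1000
print(dmalimit_clicks(0x1000, 0x00000000))
```

The first call gives 0 and the second gives 4096, matching the two tune listings.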

Another way to check whether MAXDMAPAGE is set to non-zero is to use the command :

  S> dl dma_check_on
  dma_check_on:  00000001 00000000 00000000 00000000  ................

If this returns a 1, then MAXDMAPAGE is non-zero; if it returns a zero, then MAXDMAPAGE is set to 0. In this case, it is on. If MAXDMAPAGE is non-zero, you can determine whether or not there is still dma-able memory available by using:

  S> dl dma_freemem
  dma_freemem:  00000000 00000000 D01D827C 00000000  ........|.......  .

In this case, there is no dma-able memory left. To further check the status of memory, there are two more macros that should be used. First, see if there are any kernel memory failures:

  S> kmeminfo
  km_mem[0]    00139400       <- Small buffer pool size
  km_mem[1]    006C8000       <- Large buffer pool size
  km_mem[2]    00000000       <- Always zero as oversize buffers have no pool
  km_alloc[0]  00134DF0       <- Number in use from small buffer pool
  km_alloc[1]  006C1E00       <- Number in use from large buffer pool
  km_alloc[2]  00183000       <- Number of oversize buffers in use
  km_fail[0]   00000ADB       <- allocation failures
  km_fail[1]   00000037
  km_fail[2]   00000000

The amount of kernel memory in use is the sum of km_mem[0], km_mem[1] and km_alloc[2]. In this case, that is equal to 0x984400 (9,978,880 bytes), or just under 10MB. There are several failures shown in km_fail[0] and km_fail[1], which should always be noted.
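The sum can be checked with a few lines of arithmetic. The values below are copied from the kmeminfo listing; the dictionary is only a convenient way to hold them.

```python
# Values from the kmeminfo output above.
kmeminfo = {
    "km_mem[0]": 0x00139400,    # small buffer pool size
    "km_mem[1]": 0x006C8000,    # large buffer pool size
    "km_alloc[2]": 0x00183000,  # oversize buffers in use
}

in_use = sum(kmeminfo.values())
print("0x%X = %d bytes" % (in_use, in_use))
```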

Now, run the strstat macro to see if there are any failures:

  S>strstat
  [D020C898] [D0226CF8]
            inuse   (Strinfo)  total    max      fail
  stream    00000130(00000130) 00000A95 00000147 00000000
  queue     000006E2(000006E2) 00003014 00000760 00000000
  msgblock  0000049A(000005E3) 00333C83 000005E3 00000000
  mdbblock  00000AC4(00000AC4) 017C2D19 00000AC4 0000086F
  linkblk   00000017(00000017) 00000017 00000017 00000000
  strevent  00000010(00000013) 0000045D 0000001D 00000129
  Total byte count = 0041ED3D
  strthresh = 00792200

This shows that there are failures. In this case, the system is hung because it was out of kernel memory buffers which are the only memory used for streams. Since MAXDMAPAGE is set to 4096 (16MB), all kernel memory buffers must fit in the first 16MB of memory. Although kernel memory can be exhausted because of legitimate processing needs on a large system, or because of an error that causes excessive memory usage, it is advisable to change MAXDMAPAGE to zero using:

# /etc/conf/bin/idtune MAXDMAPAGE 0
# /etc/conf/bin/idbuild
The UNIX Operating System will now be rebuilt.
This will take some time. Please wait.
Root for this system build is /
# cd /
# shutdown -i6 -g0 -y

If this does not resolve the problem, further study will have to be performed on the next dump. For more information regarding dma-able memory, see Appendix D.
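When a strstat listing has been saved to a file, the check of the fail column can be automated. The sketch below assumes the row layout shown in the strstat examples in this section (name, inuse, Strinfo in parentheses, total, max, fail); the function name and regular expression are illustrative.

```python
import re

STRSTAT = """\
stream    00000130(00000130) 00000A95 00000147 00000000
queue     000006E2(000006E2) 00003014 00000760 00000000
msgblock  0000049A(000005E3) 00333C83 000005E3 00000000
mdbblock  00000AC4(00000AC4) 017C2D19 00000AC4 0000086F
linkblk   00000017(00000017) 00000017 00000017 00000000
strevent  00000010(00000013) 0000045D 0000001D 00000129
"""

# name inuse(Strinfo) total max fail, all counts in hexadecimal
ROW = re.compile(
    r"^\s*(\w+)\s+[0-9A-F]+\([0-9A-F]+\)\s+[0-9A-F]+\s+[0-9A-F]+\s+([0-9A-F]+)\s*$"
)

def strstat_failures(text):
    """Return {resource name: failure count} for rows with a non-zero fail column."""
    failures = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m and int(m.group(2), 16) != 0:
            failures[m.group(1)] = int(m.group(2), 16)
    return failures

print(strstat_failures(STRSTAT))
```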

To see if the system is out of memory, use:

  S> dl freemem 
  freemem:  00000C18 00000000 00000022 00000000  ........".......  .

In this case, there are 3096 pages of memory available (about 12MB).
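Reading the first word of a dl line and converting it to pages and bytes can be sketched as follows. The page size of 4096 bytes is an assumption, and the function name is illustrative.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

def first_word(dl_line):
    """Return the first 32-bit word of a 'dl' output line as an integer."""
    _, _, rest = dl_line.partition(":")
    return int(rest.split()[0], 16)

line = 'freemem:  00000C18 00000000 00000022 00000000  ........".......  .'
pages = first_word(line)
print(pages, "pages =", pages * PAGE_SIZE, "bytes")
```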

Viewing the process table:

To view the process table, we use the macro 'ps' which is very similar to the ps command. (This is a sample of a process table)

 S> ps
 ADDRESS  PID   PPID  UID   FLAGS    K U R WCHAN    ST  COMMAND
 D1194C00 02889 02655 00000 00502010 - - - D025FA28 SLEEP /home/opt/SPO/splog/bi
 D141BE00 02861 02655 00108 00502018 - - - D01D874C SLEEP /home/opt/SPO/splog/bi
 D1196C00 02828 02655 00000 00502010 - - -          RUN /home/opt/SPO/pcam/bin/p
 D1196A00 02740 02655 00000 00502018 - - -          RUN /home/opt/SPO/opdesk/bin
 D13F7200 02680 00001 00000 01402010 - - - D025FA28 SLEEP /usr/X/bin/xdm
 D1565A00 25417 25405 00106 00502010 - - 0          ONPROC ./spo
 D13FE000 02679 02655 00000 00102010 - - -          RUN /home/opt/SPO/opdesk/bin
 D1162800 02525 00001 00000 02102018 0 - - D1161FC0 SLEEP /mls/bin/satsave 00001
 D1163C00 02479 00001 00000 00002010 0 - -          RUN /usr/lib/errdemon
 D1268800 00005 00000 00000 00002039 - - -          RUN kmdaemon
 D1268A00 00004 00000 00000 00002039 - - - D00523E0 SLEEP fsflush
 D1164800 00003 00000 00000 00002039 - - - D1164800 SLEEP pageout
 D1164A00 00002 00000 00000 00002031 0 - -          RUN idleproc00

To find out what the wait channel for a process is (i.e. what the process is sleeping on), you use:

 S> di D025FA28
 pollwait:  00 00                            addb   %al,(%eax)  .

The K, U and R fields show the kernel binding, the user binding, and the cpu on which the process is currently running. In the above example, we see that satsave, errdemon and idleproc00 are bound to cpu 0. If there is an X in field R, then the process is swapped out. You will then most likely see a ? in the COMMAND field. The ST field indicates the state of the process. RUN means that it is waiting to run and ONPROC means that it is the currently running process.

The ps ADDRESS is used by many kcrash macros as it is the process address.
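If the ps listing has been saved to a file in another window, it can be searched with a short script. The sketch below assumes the column layout of the sample above, with the WCHAN column either blank or an 8-digit hexadecimal address; the function name and dictionary keys are illustrative.

```python
import re

HEX8 = re.compile(r"^[0-9A-F]{8}$")

def parse_ps_line(line):
    """Parse one row of ps output; wchan is None when the column is blank."""
    t = line.split()
    row = {"address": t[0], "pid": int(t[1]), "ppid": int(t[2]),
           "uid": int(t[3]), "flags": t[4], "k": t[5], "u": t[6], "r": t[7]}
    rest = t[8:]
    # WCHAN is present only for some processes; states are never 8 hex digits.
    row["wchan"] = rest.pop(0) if HEX8.match(rest[0]) else None
    row["state"] = rest[0]
    row["command"] = " ".join(rest[1:])
    return row

SAMPLE = [
    "D1194C00 02889 02655 00000 00502010 - - - D025FA28 SLEEP /home/opt/SPO/splog/bi",
    "D1565A00 25417 25405 00106 00502010 - - 0          ONPROC ./spo",
    "D13F7200 02680 00001 00000 01402010 - - - D025FA28 SLEEP /usr/X/bin/xdm",
]
rows = [parse_ps_line(l) for l in SAMPLE]
onproc = [r["address"] for r in rows if r["state"] == "ONPROC"]
sleepers = [r["pid"] for r in rows if r["wchan"] == "D025FA28"]
print("ONPROC:", onproc)
print("sleeping on D025FA28:", sleepers)
```

The ONPROC addresses found this way are the ones to pass to macros that take a process address.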

Examining an individual process:

As stated above, there are several kcrash macros that use the process address as an argument. Probably one of the most important is the 'proc' macro. It displays much of the same information as ps, but for an individual process. It also gives us more information, like the priority of the process, the flags of the process, the size of the u-block, etc.

 S>proc D1565A00
 [D1565A00]   ONPROC   flag 502010: LOAD ULOAD EXECED INUSE
  pid 25417.   pri 3E   sig 00000000  ign 611E1000  kbd -  ubd -    uid 00106.
 ppid 25405.  csig 00  mask 00000000  cid 00000002  run 0          usiz 0002
 pgrp 25417.  cflt 00  hold 00000000   as D14A3740  mutx D1565B8C  segu F0FC0000
 p_exec D1223508 p_cred D14F2780 p_ubptbl C2059F00
 u.u_psargs = "./spo "

This shows us that the flags set are LOAD, ULOAD, EXECED and INUSE, which means that the process is loaded in core, the u-block is in core, the process was execed, and it is running. The "pri" field tells us that the priority of the process is 3E. There is no signal set, there is no masking, the scheduling class id is 2, the process is running on processor 0, and the size of the user block is 2 (*4096 bytes). The entire proc structure is defined in /usr/include/sys/proc.h. Only a small portion of the user structure is displayed by this macro.

Process address space:

The proc macro also gives us the address of a process' address space. By using this address, we can see the size of the address space, and the amount of physical memory the process is currently using. The macro we use is the 'as' macro.

 S>as D14A3740
 as [D14A3740]: keepcnt 000000 segs D156EDA0 seglast D156ECC0
 size 7CD000 rss 510 hat[D14FD140 D14FED20]

So, the size of this process' virtual address space is 7CD000 (approx. 8MB). Currently it is utilizing 1296 pages (around 5.3MB) of physical memory.
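The conversions used here are shown below as a short sketch. It assumes that size is in bytes, that rss is a page count, and that a page is 4096 bytes; the function name is illustrative.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

def as_summary(size_hex, rss_hex):
    """Return (address space bytes, resident pages, resident bytes)."""
    size_bytes = int(size_hex, 16)
    rss_pages = int(rss_hex, 16)
    return size_bytes, rss_pages, rss_pages * PAGE_SIZE

size_bytes, rss_pages, rss_bytes = as_summary("7CD000", "510")
print("size %d bytes, rss %d pages (%d bytes)" % (size_bytes, rss_pages, rss_bytes))
```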

The as macro provides us with the address which points to all the segments of the process. These segments can be viewed using the 'segn' macro:

S>segn D156EDA0
  addr     base      end    size    as      data   physical
D156EDA0 00000000 00000FFF 01000 D14A3740 D15176B4 12F4000
D15D1880 08040000 08047FFF 08000 D14A3740 D165E848
D151E680 08048000 08085FFF 3E000 D14A3740 D1546D68 2DE2000
D151E320 08086000 0809EFFF 19000 D14A3740 D154CD8C 16B8000
D151E340 0809F000 083C9FFF 32B000 D14A3740 D154CDB0 1C85000
D1585100 80000000 80036FFF 37000 D14A3740 D1517C48 131E000
D1510BA0 80037000 80038FFF 02000 D14A3740 D140726C 1CDE000
... (lines deleted for brevity)

The base address is the virtual address that the segment is mapped at. The size is the size of the segment. The data address contains information regarding the type of data contained in the segment and how it is mapped. The last field shows the physical address of the segment.
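The size column can be verified from the base and end columns, since the end address is inclusive. A minimal sketch, using three rows copied from the listing above (the function name is illustrative):

```python
def segment_size(base_hex, end_hex):
    """Size in bytes of a segment from its inclusive base and end addresses."""
    return int(end_hex, 16) - int(base_hex, 16) + 1

# (base, end, size) copied from the segn output above.
rows = [
    ("00000000", "00000FFF", "01000"),
    ("08048000", "08085FFF", "3E000"),
    ("0809F000", "083C9FFF", "32B000"),
]
for base, end, size in rows:
    computed = segment_size(base, end)
    print(base, end, "%X" % computed, computed == int(size, 16))
```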

A process' user information:

The proc structure also gives us the address of the user segment which is designated by 'segu'. This address is used by both the user and User macros. These macros display information about the user structure of a process. The User macro is available in 1.3. It displays the same information as the user macro did in 1.2.

For now, we are only going to discuss the information given by the User macro.

  S>User F0FC0000
  PER PROCESS USER AREA FOR PROCESS D1565A00
        command: spo psargs: ./spo
        proc slot: 058  start: Wed Jun 28 00:11:28 1995
        mem: 510,  type: exec su-user

Here, the mem value should be equal to the rss value given by the as macro. The value is again displayed in hexadecimal pages.

   proc/text lock:none

The process is not using any text locks at this time.

   error 0000    ticks 0006681502.
        syscall 0004  args are:

The system call is a write and the arguments passed were:

  00000007 080A8F34 000000EC 00000000 08047D64 00003000 00000000 00000000

See syscall(3) and /usr/include/sys/syscall.h for more information.
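Since the call is a write, the first three argument words can be read as the usual write arguments. The sketch below assumes the standard order (file descriptor, buffer address, byte count) and that the remaining words are not used by this call; the function name is illustrative.

```python
def decode_write_args(words):
    """Interpret the first three argument words as write(fd, buf, nbyte)."""
    fd, buf, nbyte = (int(w, 16) for w in words[:3])
    return {"fd": fd, "buf": "0x%08X" % buf, "nbyte": nbyte}

args = "00000007 080A8F34 000000EC 00000000 08047D64 00003000 00000000 00000000".split()
print(decode_write_args(args))
```

Under these assumptions the process was writing 236 bytes to file descriptor 7, which would correspond to entry [07] in the open files list displayed by the same macro.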

   vnode of current directory: D1215EA8

We use the vnode address to display more information about the process. However, the User macro only displays the vnode of the current directory.

  PROCESS MISC:
  OPEN FILES AND POFILE FLAGS:
        [00]: F D161DF00, 0      [01]: F D1529D80, 0
        [02]: F D14F8C80, 0      [03]: F D14F8C80, 0
        [04]: F D1529D80, 0      [05]: F D140AE00, 0
        [06]: F D15B7900, 0      [07]: F D1563980, 1
        [08]: F D1695480, 1      [09]: F D161C980, 2
        [10]: F D14C6B00, 0      [11]: F D142E780, 0
        [13]: F D159CE00, 2      [16]: F D15B7500, 2

Above are the addresses of all the open files. To better understand what this is showing us, let us look at one example:

  [08]: F D1695480, 1

The F is the protection flag which in this case is PROT_ALL. What this means is the pages of this file can be read, written, and executed and are user accessible. The next field is actually the address of the file and the last field is the type of sharing that is set for the file. In this case, the file has share type of MAP_SHARED which allows any changes to be shared. A value of 2 would mean that any changes to the file would be private. See /usr/include/sys/mman.h.

 FILE I/O:
        u_base: 00000000, file offset: 0, bytes: 0,
        segment: data, cmask: 0002

This information shows us that the segment is a data segment. The cmask is the file creation mask which is better known as the umask or permissions that are set for file creation.

  RESOURCE LIMITS:
        cpu time: unlimited/unlimited
        file size: 134217728/8388608
        swap size: 16777216/16777216
        stack size: 16777216/16777216
        coredump size: 16777216/16777216
        file descriptors: 64/128
        address space: 16777216/16777216
        file mode(s):

This shows the kernel-defined limits for the rlimit structure, i.e. parameters like SCPULIM, SFSZLIM, SDATLIM, etc. See the "Tuning Guide" for more information regarding the rlimit structure.

  SIGNAL DISPOSITION:
        00:  default  01:  default  02:  default  03:  default  
        04:  default  05:  default  06:  default  07:  default  
        08:  default  09:  default  10:  default  11:  default 
        12:  ignore   13:  default  14:  default  15:  default  
        16:  default  17:  default  18:  default  19:  default 
        20:  default  21:  default  22:  default  23:  default
        24:  default  25:  default  26:  default  27:  default  
        28:  default  29:  ignore   30:  ignore   31:  default

The last portion of the User macro displays the signal dispositions of the process. All of the signals are defined in /usr/include/signal.h.

So, the User macro is quite useful. We can see how many open files an individual process has, the resident set size of the process, etc. It is also useful because it gives us the vnode address and the address of the files that the process is using. These allow us to run the vnode and the File macros. Like the proc macro, the User macro does not display the entire user structure as defined in /usr/include/sys/user.h.

The virtual node:

A vnode is allocated for every active file, every in-use directory, and each mounted file, including root. With this in mind, it is no wonder that the vnode is considered the focus of all file activity. The User macro provided us with a vnode address, so let us look at what the vnode macro displays.

 S>vnode D1215EA8
 [D1215EA8]  type 2: VDIR  flag 0000:
 count   00005.  vfsmnt 00000000   ops D019EE98  stream 00000000  locks 00000000
  rdev 00000000    vfsp D1352480  data D1215D10   pages 00000000

As shown by the User macro, the type of vnode in this example is "VDIR", or current directory. The vnode flags and the vnode structure are defined in /usr/include/sys/vnode.h. The ops address is a pointer to another structure called vnodeops. If there were a stream, a lock, a device (block or character), or pages associated with this vnode, we would see an address in these fields. We see that addresses are also given for vfsp, which points to the virtual file system, and data, which is a pointer to the private data.

Files

From the files list given by the User macro, let us look at file 13 using the File macro.

 S>File D159CE00
 f_next     D1641B00                <- pointer to next entry
 f_prev     D15B7500                <- pointer to previous entry
 f_flag     00000003  FREAD FWRITE
 f_count    00000004                <- reference count
 f_vnode    D1242598                <- pointer to vnode structure of the file
 f_offset   00000000                <- read/write character pointer
 f_cred     D14F2780                <- pointer to the cred structure
 f_aiof     D159CE18                <- aio file list forward link
 f_aiob     D159CE18                <- aio file list backward link
 f_off      D159CE20
 f_slnk     D159CE20
 f_lck      D159CE24                <- pointer to the mutex lock structure
 f_clcount  00000000
 f_clwant   00000000

The File macro not only displays a good portion of the file structure as defined in /usr/include/sys/file.h, but it also runs the vnode macro using the vnode of the file, the stream macro and the cred macro.

 vnode
 [D1242598]  type 1: VREG  flag 0000:
 count   00002.  vfsmnt 00000000   ops D019EE98  stream 00000000  locks D161CA00
  rdev 00000000    vfsp D1352480  data D1242400   pages 00000000

 stream

There was not a stream associated with this file.

 cred
 cr_uid   00106  cr_gid   00103  cr_ruid  00106  cr_rgid  00103
 cr_suid  00106  cr_sgid  00103
 cr_ref 09A  cr_ngroups  001  cr_groups   00000067

The cred structure is defined in /usr/include/sys/cred.h. It shows the credentials of the file. The most important fields here are the 'id' fields. For example, the cr_uid field shows the effective user id, cr_ruid is the real user id, and the cr_suid is the saved user id.

We have discussed how to create and restore dumps as well as how to perform some basic analysis in Part 1. For those of you wanting to delve further into UNIX internals, proceed to: KCRASH PART 2
