Release Found: Red Hat Enterprise Linux 4 Update 2
Solution:
Do not follow the README file in the diskdumputils-1.1.9 RPM when configuring a swap partition for diskdump. That README instructs users to format the swap partition, which corrupts it and leaves it disabled for swapping. Only dump devices other than swap partitions require service diskdump initialformat before they can be used as dedicated diskdump devices; swap partitions must not be formatted. For further information, see the "Setup" section in the latest README file.
The diskdumputils package is installed on the machine on which you wish to capture dumps in the event of a system panic. It loads and configures the diskdump kernel modules so that if the machine crashes, its memory is dumped to disk.
Diskdump is only supported with the following storage adapters:
aic7xxx aic79xx ipr megaraid mptfusion sym53c8xx sata_promise ata_piix
To find out whether your hardware is using any of the drivers listed above, use the command:
lsmod
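The check above can be scripted. The following is a minimal sketch, not part of diskdumputils: find_diskdump_drivers is a hypothetical helper that filters lsmod output against the adapter names listed above. It matches exact names only, which is a simplification, since a driver may load under a module name that differs from the name in the list.

```shell
# Storage drivers listed above as supported by diskdump.
SUPPORTED="aic7xxx aic79xx ipr megaraid mptfusion sym53c8xx sata_promise ata_piix"

# find_diskdump_drivers (hypothetical helper): reads lsmod-style output on
# stdin and prints every supported driver name that appears in column one.
find_diskdump_drivers() {
    awk -v list="$SUPPORTED" '
        BEGIN {
            n = split(list, names, " ")
            for (i = 1; i <= n; i++) ok[names[i]] = 1
        }
        # The first column of each lsmod line is the module name.
        ($1 in ok) { print $1 }
    '
}

# Typical use on a live system:
#   lsmod | find_diskdump_drivers
```

If nothing is printed, compare the lsmod output with the list by hand before concluding that the adapter is unsupported.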
Diskdump is supported in the following Red Hat kernels, where <kernel-version> is the version containing this diskdumputils package:
kernel-<kernel-version>.i686.rpm
kernel-smp-<kernel-version>.i686.rpm
kernel-hugemem-<kernel-version>.i686.rpm
kernel-<kernel-version>.ia64.rpm
kernel-<kernel-version>.ppc64.rpm
kernel-<kernel-version>.x86_64.rpm
kernel-smp-<kernel-version>.x86_64.rpm
Diskdump is supported from Red Hat Enterprise Linux 4 Update 2 and up.
Dump Device Selection
The first step in the configuration process is to designate a disk device to dump memory to in the event of a system crash. The dump device may be a dedicated partition, a whole disk, or a swap partition.
If you configure dumping to the swap partition, both /usr and /var must be mounted locally, for the reasons described in the next paragraph.
In the event of a system crash, the memory contents are written to the dump device. When the system reboots, the diskdumputils commands are run to preserve the saved memory as a vmcore file under /var/crash. The diskdump commands themselves are installed under /usr. This memory saving operation runs in the boot sequence before swap is enabled and before remote filesystems are mounted. If /usr and /var were mounted remotely, the diskdump service would fail, because remote filesystems are usually mounted later than the swap initialization in rc.sysinit.
The dump device should be large enough to hold the whole dump. The dump consists of all of physical memory plus a header field. To determine the exact size required, refer to the output of /proc/diskdump after the diskdump module is loaded:
# modprobe diskdump
# cat /proc/diskdump
# sample_rate: 8
# block_order: 2
# fallback_on_err: 1
# allow_risky_dumps: 1
# total_blocks: 262042
The total_blocks value is given in page-size units, so in this example the selected device must hold at least 262042 * 4096 = 1073324032 bytes on an i386 machine.
Note: During a diskdump operation, memory contents residing on the swap partition are not preserved. Therefore the required dump partition size corresponds to physical memory alone, not physical memory plus the size of the swap partition.
Next, based on the information above, decide which device or devices to use as dump devices, then register them as follows.
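The size calculation can be sketched in shell. required_dump_bytes is a hypothetical helper name; it assumes the total_blocks line has the form shown in the sample output above and takes the page size as an argument.

```shell
# required_dump_bytes PAGE_SIZE (hypothetical helper)
# Reads /proc/diskdump-style text on stdin and prints the minimum dump
# device size in bytes: total_blocks multiplied by the page size.
required_dump_bytes() {
    page_size=$1
    # Take the field that follows the "total_blocks:" label.
    blocks=$(awk '{ for (i = 1; i < NF; i++) if ($i == "total_blocks:") print $(i + 1) }')
    [ -n "$blocks" ] || return 1
    echo $(( blocks * page_size ))
}

# Typical use on a live system (getconf PAGESIZE reports the page size):
#   required_dump_bytes "$(getconf PAGESIZE)" < /proc/diskdump
```

With the sample output above and a 4096-byte page, this prints 1073324032, slightly under 1 GiB.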
Edit /etc/sysconfig/diskdump appropriately in the following format to register a dump device:
DEVICE=/dev/sde1
Multiple dump devices can be registered in a colon-separated format like:
DEVICE=/dev/sda2:/dev/sdb
The benefit of designating more than one dump device is redundancy. For example, if each dump device was controlled by a different driver, even if a system panic occurred in a driver that controls one of the registered devices, the memory could be dumped out using the other registered device. In this case it is required that each dump device be sufficiently large to store the full dump. Multiple dump devices are not supported if you are dumping to a swap device. Consequently, designating both a swap device and dedicated dump partition is not allowed.
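The restriction on combining swap with other dump devices can be checked before the service is started. The following is a sketch only: check_dump_devices is a hypothetical helper, and the caller is assumed to supply the list of active swap devices (for example, taken from /proc/swaps).

```shell
# check_dump_devices DEVICE_VALUE SWAP_DEVICES (hypothetical helper)
#   DEVICE_VALUE: colon-separated list as written in /etc/sysconfig/diskdump
#   SWAP_DEVICES: space-separated list of swap devices, supplied by the caller
# Returns 0 if the combination is allowed, 1 if a swap device is mixed
# with any other dump device.
check_dump_devices() {
    count=0
    has_swap=0
    old_ifs=$IFS
    IFS=:
    for dev in $1; do
        count=$(( count + 1 ))
        case " $2 " in
            *" $dev "*) has_swap=1 ;;
        esac
    done
    IFS=$old_ifs
    if [ "$has_swap" -eq 1 ] && [ "$count" -gt 1 ]; then
        echo "error: a swap device cannot be combined with other dump devices" >&2
        return 1
    fi
    return 0
}

# Example: check_dump_devices "/dev/sda2:/dev/sdb" "/dev/sdc1"
```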
Dump Device Formatting
Note: Skip this step if you are dumping to the swap partition.
The second step in the configuration process involves formatting the dump device.
Any registered dump device other than a swap partition must be specially formatted for diskdump before use. Consequently, the designated dump partition cannot also hold a conventional filesystem.
The system administrator needs to format the dump device only once. This step must be skipped if you configured a swap partition as the dump device; otherwise the swap partition becomes unusable for swapping because it is formatted as a diskdump-dedicated device. To format the device, run:
# service diskdump initialformat
Enable Diskdump Service
Lastly, start the diskdump service:
# chkconfig diskdump on
# service diskdump start
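The formatting and enabling steps differ only in whether initialformat is run. As a dry-run sketch, the hypothetical helper diskdump_setup_steps below prints the commands for a single device without executing them; the list of swap devices is assumed to be supplied by the caller.

```shell
# diskdump_setup_steps DEVICE SWAP_DEVICES (hypothetical helper)
# Prints, but does not run, the commands needed after DEVICE has been
# registered in /etc/sysconfig/diskdump.
diskdump_setup_steps() {
    case " $2 " in
        # A swap partition must never be formatted for diskdump.
        *" $1 "*) echo "# $1 is a swap partition: skipping initialformat" ;;
        *)        echo "service diskdump initialformat" ;;
    esac
    echo "chkconfig diskdump on"
    echo "service diskdump start"
}

# Example: diskdump_setup_steps /dev/sde1 "/dev/sda2"
```

Printing the commands first makes it easy to confirm that a swap partition is not about to be formatted.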
The registered device or partition can be checked through the /proc/diskdump interface:
# cat /proc/diskdump
/dev/sde1 514080 1012095
If the registered dump device needs to be replaced, edit /etc/sysconfig/diskdump. Format the new dump device as described above. Then restart the diskdump service. To restart the service, run the command below.
# service diskdump restart
To test the diskdump functionality, use Alt-SysRq-C or echo c > /proc/sysrq-trigger, either of which deliberately crashes the system. After the dump completes, a vmcore file is created during the next reboot sequence and saved in a directory named in the following format:
/var/crash/127.0.0.1-<date>
The vmcore file's format is the same as that created by the netdump facility, so you can use the crash(8) command to analyze it.
After the initial configuration, no additional steps are required. Make sure the designated dump partition remains sufficiently large. If there is not enough space, the dump file is only partially saved, resulting in an incomplete dump file named vmcore-incomplete.
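A quick way to tell complete dumps from partial ones is to look for the two file names mentioned above. dump_status below is a hypothetical helper, shown as a sketch; it assumes each dump lives in its own directory under /var/crash.

```shell
# dump_status DIR (hypothetical helper)
# Prints "complete", "incomplete", or "missing" for one dump directory,
# based on the presence of vmcore or vmcore-incomplete.
dump_status() {
    if [ -f "$1/vmcore" ]; then
        echo complete
    elif [ -f "$1/vmcore-incomplete" ]; then
        echo incomplete
    else
        echo missing
    fi
}

# Typical use:
#   for d in /var/crash/*/; do printf '%s %s\n' "$d" "$(dump_status "$d")"; done
```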
Diskdump currently contains one customizable script file called diskdump-nospace. The diskdump-nospace script is called prior to the creation of the vmcore file if /var/crash does not have enough space to hold the complete dumpfile. The script may be customized to clean up enough space for the dump in question to proceed.
The diskdump module has the following module parameters:
block_order: Specifies the dump-time I/O block size. The default value is 2, which sets the I/O block size to page-size << 2, or 16 kbytes on an i386 machine. Larger values may give better performance, but occupy more module memory.
sample_rate: Determines how many blocks in the dump partition are verified before actual memory dumping begins. The default value is 8, which means one of every 1<<8 (256) blocks is verified. Specifying zero means all blocks in the partition are verified, and a negative value disables verification.
dump_level: A memory collection level that specifies which memory pages will be dumped. Default value of 0 dumps all pages of physical RAM into the vmcore file. To avoid excessively large vmcore files, page cache pages, zero-filled pages, free pages, and user application pages may be eliminated from the file. Specifying one of the dump_level values from 1 to 15 will skip one or more memory page type(s) if that page type is marked with an X in the following table:
dump    cache   zero    free    user    description
level   page    page    page    page
---------------------------------------------------------
  0                                     default
  1       X
  2               X
  3       X       X                     recommended
  4                       X
  5       X               X
  6               X       X
  7       X       X       X
  8                               X
  9       X                       X
 10               X               X
 11       X       X               X
 12                       X       X
 13       X               X       X
 14               X       X       X
 15       X       X       X       X     minimum dump size
This partial dump feature provides a memory collection level that selects how much of physical memory is dumped. All of physical memory is usually not required to investigate a kernel issue. Most of physical memory typically contains user application data, page cache memory (file data), free memory pages, and zero-filled pages. By skipping one or more of those page types when creating the vmcore file, the crash dump becomes significantly smaller and the dump procedure less time-consuming. While the actual vmcore file size varies with the state of the system and the dump_level specified, the minimum amount of data required to analyze the dump is always captured. However, since there may be circumstances in which all of physical memory must be captured, a dump partition smaller than the actual amount of physical memory is not recommended.
Note that the partial dump feature has some risks. There are memory management lists which are scanned for a page's memory attribute, so if the list has been corrupted, the scanning process may fail. For example, when specifying a dump_level from 4-7 or from 12-15, the kernel's free page linked lists are scanned; if the list is corrupt, diskdump may hang. Furthermore, it is possible that a page type that has been skipped may be necessary to fully investigate the cause of some issues. Therefore, a memory collection level should be selected to suit each situation. The recommended level is 3, because it is easiest to determine whether a page is zero-filled or if it is a page cache page, and because no page lists need to be traversed.
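The dump_level values behave like a bitmask over the four page types. The sketch below decodes a level on that basis; the bit assignments (cache=1, zero=2, free=4, user=8) are inferred from the table above and from the levels named in this section, not taken from the diskdump source, and skipped_pages is a hypothetical helper.

```shell
# skipped_pages LEVEL (hypothetical helper)
# Prints the page types skipped at a dump_level of 0 to 15, or "none".
# Bit values are an inference from the table: cache=1, zero=2, free=4, user=8.
skipped_pages() {
    level=$1
    out=""
    [ $(( level & 1 )) -ne 0 ] && out="$out cache"
    [ $(( level & 2 )) -ne 0 ] && out="$out zero"
    [ $(( level & 4 )) -ne 0 ] && out="$out free"
    [ $(( level & 8 )) -ne 0 ] && out="$out user"
    out=${out# }
    echo "${out:-none}"
}

# Example: skipped_pages 3
```

Any level whose output includes "free" is one of those that scans the kernel's free page lists, which is the case described above as carrying the risk of a hang.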
Example:
The following option sets the I/O block size to 32 kbytes and verifies every block in the partition. Cache pages and zero-filled pages are also skipped by the partial dump feature.
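As a hedged sketch of how these settings map onto the parameters described earlier: 32 kbytes on a 4096-byte page corresponds to block_order=3, verifying every block to sample_rate=0, and skipping cache and zero pages to dump_level=3. The hypothetical helper below derives block_order and prints a line in the standard modprobe "options" syntax. Passing the parameters this way, and the file that would hold the line (such as /etc/modprobe.conf), are assumptions.

```shell
# diskdump_options PAGE_SIZE BLOCK_BYTES SAMPLE_RATE DUMP_LEVEL (hypothetical helper)
# Prints a modprobe-style options line. block_order is derived so that
# PAGE_SIZE << block_order equals BLOCK_BYTES; returns 1 if no order fits.
diskdump_options() {
    page=$1
    want=$2
    order=0
    while [ $(( page << order )) -lt "$want" ]; do
        order=$(( order + 1 ))
    done
    [ $(( page << order )) -eq "$want" ] || return 1
    echo "options diskdump block_order=$order sample_rate=$3 dump_level=$4"
}

# Example for an i386 machine (4096-byte pages, 32768-byte I/O blocks):
#   diskdump_options 4096 32768 0 3
```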