Hex dump

In computing, a hex dump is a hexadecimal view (on screen or paper) of computer data, from RAM or from a computer file or storage device. Looking at a hex dump of data is usually done in the context of either debugging or reverse engineering.

A hex dump of the 318 byte Wikipedia favicon

In a hex dump, each byte (8-bits) is represented as a two-digit hexadecimal number. Hex dumps are commonly organized into rows of 8 or 16 bytes, sometimes separated by whitespaces. Some hex dumps have the hexadecimal memory address at the beginning and/or a checksum byte at the end of each line.

Although the name implies the use of base-16 output, some hex dumping software may have options for base-8 (octal) or base-10 (decimal) output. Some common names for this program function are hexdump, hd, od, xxd and simply dump or even D.

Samples

A sample partial hex dump of a program, as produced by the Unix program hexdump:

 00105e0 e6b0 343b 9c74 0804 e7bc 0804 e7d5 0804
 00105f0 e7e4 0804 e6b0 0804 e7f0 0804 e7ff 0804
 0010600 e80b 0804 e81a 0804 e6b0 0804 e6b0 0804

The leftmost column is the hexadecimal address for the 16-bit values of the following columns. Every row represents 16 bytes (hexadecimal 10).

The above example, however, represents an ambiguous form of hex dump, as the byte order may be uncertain. Such hex dumps are good only in the context of a well-known byte order standard or when values are intentionally given in their full form (and may result in variable number of bytes), such as:

 00105e0 e6 b008 04e79e08 04e7bc 08 04 e7 d50804

When an explicit byte sequence is required (e.g. for hex dump of machine code programs or ROM content) a byte-by-byte representation is favored, commonly organized in 16-byte rows with an optional divider between 8-byte groups:

 00105e0 e6 b0 08 04 e7 9e 08 04-e7 bc 08 04 e7 d5 08 04
 00105f0 e7 e4 08 04 e6 b0 08 04-e7 f0 08 04 e7 ff 08 04
 0010600 e8 0b 08 04 e8 1a 08 04-e6 b0 08 04 e6 b0 08 04

Rarely, a condensed form is also used, without whitespace characters between values:

 00105e0 e6b00804e79e0804e7bc0804e7d50804
 00105f0 e7e40804e6b00804e7f00804e7ff0804
 0010600 e80b0804e81a0804e6b00804e6b00804

A Unix default display of those same bytes as two-byte words on a modern x86 (little-endian) computer would usually look like this:

 00105e0 b0e6 0408 9ee7 0408 bce7 0408 d5e7 0408
 00105f0 e4e7 0408 b0e6 0408 f0e7 0408 ffe7 0408
 0010600 0be8 0408 1ae8 0408 b0e6 0408 b0e6 0408

Often an additional column shows the corresponding ASCII text translation (e.g. hexdump -C or hd):

00000000 57 69 6b 69 70 65 64 69  61 2c 20 74 68 65 20 66  Wikipedia, the f
00000010 72 65 65 20 65 6e 63 79  63 6c 6f 70 65 64 69 61  ree encyclopedia
00000020 20 74 68 61 74 20 61 6e  79 6f 6e 65 20 63 61 6e   that anyone can
00000030 20 65 64 69 74 0a                                  edit
00000036

Checksum

When hex dumps are intended to be manually entered into a computer, such as was the case with print magazine articles of the home computer era, a checksum byte (or two) would be added at the end of each row, commonly calculated as simple 256 modulo of sum of all values in the row or a more sophisticated CRC. This checksum would be used to determine whether users entered the row correctly or not.

A variety of hex dump file formats  including S-record, Intel HEX, and Tektronix extended HEX  have a similar checksum value at the end of each row.

Compression of duplicate lines

In the Unix programs od and hexdump, not all lines of display output that contain the same data as the previous line are shown; instead, a line containing just one asterisk is displayed. For example, a block of all zeros is printed as:

 0000000 0000 0000 0000 0000 0000 0000 0000 0000
 *
 0000030

This compression feature makes a useful tool for inspecting large files or complete devices for irregularities. In a modern Linux system, it is convenient to scan an entire hard drive to check if it is all blank:

 # hexdump /dev/sda (replace sda with the proper name for the device to be scanned)

The -v option causes hexdump and od to display all input data, explicitly:

 0000000 0000 0000 0000 0000 0000 0000 0000 0000
 0000010 0000 0000 0000 0000 0000 0000 0000 0000
 0000020 0000 0000 0000 0000 0000 0000 0000 0000

od and hexdump

On Unix/POSIX/GNU systems: "The utilities od and hexdump output octal, hex, or otherwise encoded bytes from a file or stream. Depending on your system type, either or both of these two utilities will be available -- Previously, BSD systems deprecated od for hexdump, GNU systems the reverse. The two utilities, however, have exactly the same purpose, just slightly different switches."[1] However, since POSIX, in 2002, both FreeBSD [2] and GNU [3] reversed that decision, and both od and hexdump are fully supported.

DUMP, DDT and DEBUG

In the CP/M 8-bit operating system used on early personal computers, the standard DUMP program would list a file 16 bytes per line with the hex offset at the start of the line and the ASCII equivalent of each byte at the end.[4] Bytes outside the standard range of printable ASCII characters (20 to 7E) would be displayed as a single period for visual alignment. This same format was used to display memory when invoking the D command in the standard CP/M debugger DDT.[5] Later incarnations of the format (e.g. in the DOS debugger DEBUG) changed the space between the 8th and 9th byte to a dash, without changing the overall width.

This notation has been retained in operating systems that were directly or indirectly derived from CP/M, including DR-DOS, MS-DOS, OS/2 and MS-Windows. On Linux systems, the command hexcat produces this classic output format too. The main reason for the design of this format is that it fits the maximum amount of data on a standard 80 character wide screen or printer, while still being very easy to read and skim visually.

1234:0000: 57 69 6B 69 70 65 64 69 61 2C 20 74 68 65 20 66  Wikipedia, the f
1234:0010: 72 65 65 20 65 6E 63 79 63 6C 6F 70 65 64 69 61  ree encyclopedia
1234:0020: 20 74 68 61 74 20 61 6E 79 6F 6E 65 20 63 61 6E   that anyone can
1234:0030: 20 65 64 69 74 00 00 00 00 00 00 00 00 00 00 00   edit...........

Here the leftmost column represents the address at which the bytes represented by the following columns are located. CP/M and various DOS systems ran in real mode on the x86 CPUs, where addresses are composed of two parts (base and offset).

In the above examples the final 00s are non-existent bytes beyond the end of the file. Some dump tools display other characters so that it is clear they are beyond the end of the file, typically using spaces or asterisks, e.g.:

1234:0000: 57 69 6B 69 70 65 64 69 61 2C 20 74 68 65 20 66  Wikipedia, the f
1234:0010: 72 65 65 20 65 6E 63 79 63 6C 6F 70 65 64 69 61  ree encyclopedia
1234:0020: 20 74 68 61 74 20 61 6E 79 6F 6E 65 20 63 61 6E   that anyone can
1234:0030: 20 65 64 69 74                                    edit

or

1234:0000: 57 69 6B 69 70 65 64 69 61 2C 20 74 68 65 20 66  Wikipedia, the f
1234:0010: 72 65 65 20 65 6E 63 79 63 6C 6F 70 65 64 69 61  ree encyclopedia
1234:0020: 20 74 68 61 74 20 61 6E 79 6F 6E 65 20 63 61 6E   that anyone can
1234:0030: 20 65 64 69 74 ** ** ** ** ** ** ** ** ** ** **   edit

See also

References

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.