SmarttBits

Better living through hexadecimal.

Reversing Bits and Bytes

| Comments

Introduction

Bits and Bytes need to be reversed quite often. Some computers store data in a little-endian format and some in a big-endian format. In order for communication between these two hosts to work, these conversions must be done with extreme precision. When coding at a high level, such as using the Socket object in a Java program, one doesn’t need to worry about these issues. As the data flows down the OSI model from the Application layer to the Physical layer, our operating system kernel will do all the endian conversions as necessary.

But this may not always be the case. And even when it is, having a good understanding of whats going on under the hood can be very beneficial for debugging. A few examples of which you may be interested in endianness and the flipping of bits/bytes are: - Raw Sockets in C - Writing exploits - Cryptography - Assembly Language

Understanding the difference

Distinguishing between flipping bits and flipping bytes is a must for hackers. Let’s take the example 0xdeadbeef. For the sake of simplicity, we’ll assume an atomic variable size of 2 bytes or 16 bits. This is standard on Linux kernel 2.6.39.* and Windows XP SP3.

Let’s first reverse bytes. There are two steps: 1. Split into single byte (8 bit) chunks:

1
2
0   1   2   3
de  ad  be  ef

reverse the order of the bytes:

1
2
3   2   1   0
ef  be  ad  de

As you can see, when we reverse bytes in hexadecimal notation, the characters are reordered, but not modified. This is not necessarily true with bit reversal.

Here’s how you can reverse the bits by hand. start by writing out each hexadecimal byte in binary (eg. 0xde becomes 1101 1110). On the next line, write the binary string backwards. My favorite way to do this is retype the binary string, but after entering each bit, use the left arrow key to reposition the cursor in front of what you just typed. You can either remove the spaces shown below separating each byte before this conversion or treat them as part of the string we are reversing. If you don’t believe me, try some trivial examples with some short english words. (This method works for the above byte reversal, but you need to type two characters at a time and press the left arrow key twice to reposition yourself infront of the byte you just typed.)

1
2
0xdeadbeef = 1101 1110 1010 1101 1011 1110 1110 1111
0xf77db57b = 1111 0111 0111 1101 1011 0101 0111 1011

Reversing bits is trivial, the above method can be used, but each chunk is a single bit rather than a byte. Let’s look at bit reversal in hexadecimal. In order to compute these results yourself, write out 0xdeadbeef in binary form, and then write it in reverse.

1
2
0-15  16-31  32-47  48-63
de    ad     be     ef

reverse the order of the bits:

1
2
0-15  16-31  32-47  48-63
f7    7d     b5     7b

Code

An efficient method of bit reversal is to exchange adjacent single bits, followed by exchanging adjacent 2-bit fields, followed by adjacent 4-bit fields, etc. For a 32-bit register, the final swap is on adjacent 16-bit fields, but more generally you want to stop at 2^(n-1). If you think about it, a swap any larger than this is superfluous.

Here is the code:

reverse bits example (reverse_bits.c) download
1
2
3
4
5
6
7
8
9
10
11
12
13
int main(int argc, char **argv)
{
  unsigned int x = 0x01234567;
  
  /* note: the order of these assignment statements is irrelevant. */
  x = (x & 0x55555555) << 1  | (x & 0xAAAAAAAA) >> 1;
  x = (x & 0x33333333) << 2  | (x & 0xCCCCCCCC) >> 2;
  x = (x & 0x0f0f0f0f) << 4  | (x & 0xf0f0f0f0) >> 4;
  x = (x & 0x00ff00ff) << 8  | (x & 0xff00ff00) >> 8;
  x = (x & 0x0000ffff) << 16 | (x & 0xffff0000) >> 16;
  
  return 0;
}

Furthermore, we can leverage the power of this algorithm by making each assignment dependent upon a conditional if statement. Using certain combinations of this assignment sequence we can do a range of tasks involved in bit fiddling. There are many many useful applications of this algorithm (for other values of k), and it is left as an exercise for the reader to find the rest of them.

Here is the improved code:

reverse bits extended (reverse_bits2.c) download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#define REVERSE_BITS 31
#define REVERSE_BYTES 24
#define REVERSE_BITS_INEACH_BYTE 7
#define REVERSE_HALFWORDS_INEACH_WORD 16 

int main(int argc, char **argv)
{
  unsigned int x = 0x01234567, k = REVERSE_BITS;
  
  /* note: the order of these assignment statements is irrelevant. */
  if(k & 1)  x = { (x & 0x55555555) << 1  | (x & 0xAAAAAAAA) >> 1; }
  if(k & 2)  x = { (x & 0x33333333) << 2  | (x & 0xCCCCCCCC) >> 2; }
  if(k & 4)  x = { (x & 0x0f0f0f0f) << 4  | (x & 0xf0f0f0f0) >> 4; }
  if(k & 8)  x = { (x & 0x00ff00ff) << 8  | (x & 0xff00ff00) >> 8; }
  if(k & 16) x = { (x &           ) << 16 | (x &           ) >> 16; }
  
  return 0;
}

Comments