reorder byte order in hex string (python)

python string hex swap python-2.x

33,179

Solution 1

array.arrays have a byteswap method:

import binascii
import struct
import array
x = binascii.unhexlify('b62e000052e366667a66408d')
y = array.array('h', x)  
y.byteswap()
s = struct.Struct('<Id')
print(s.unpack_from(y))

# (46638, 943.2999999994321)

The h in array.array('h', x) tells array.array to treat the data in x as an array of 2-byte shorts. What matters is that each item is regarded as 2 bytes long; H, which signifies a 2-byte unsigned short, works just as well.
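To illustrate the point about h versus H, here is a minimal sketch assuming Python 3 (where array.array has tobytes(); on Python 2 the equivalent call is tostring()). The helper name swap16 is hypothetical and only serves the example. It also checks itemsize, since the C short type is only guaranteed to be at least 2 bytes.

```python
import array
import binascii

raw = binascii.unhexlify('b62e000052e366667a66408d')

def swap16(data, typecode):
    # Hypothetical helper: view data as 2-byte items and swap the bytes of each item.
    a = array.array(typecode, data)
    if a.itemsize != 2:
        raise RuntimeError("typecode %r is not 2 bytes wide on this platform" % typecode)
    a.byteswap()
    return a.tobytes()

signed = swap16(raw, 'h')    # items treated as signed shorts
unsigned = swap16(raw, 'H')  # items treated as unsigned shorts

print(signed == unsigned)        # signedness does not affect the byte swap
print(binascii.hexlify(signed))  # b'2eb60000e3526666667a8d40'
```

The swap only rearranges bytes within each item, so the signedness of the typecode has no effect on the result.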

Solution 2

This should do exactly what unutbu's version does, but might be slightly easier to follow for some...

from binascii import unhexlify
from struct import pack, unpack
orig = unhexlify('b62e000052e366667a66408d')
swapped = pack('<6h', *unpack('>6h', orig))
print(unpack('<Id', swapped))

# (46638, 943.2999999994321)

Basically, unpack 6 shorts big-endian, repack as 6 shorts little-endian.

Again, same thing that unutbu's code does, and you should use his.
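The unpack/repack approach hard-codes six shorts. As a rough sketch (assuming Python 3, and with swap_words being a hypothetical name that is not part of the original answer), the count can be derived from the input length so that the same idea applies to lines of any even length:

```python
from binascii import unhexlify
from struct import pack, unpack

def swap_words(data):
    # Hypothetical helper: swap the two bytes of every 16-bit word in data.
    count, remainder = divmod(len(data), 2)
    if remainder:
        raise ValueError("input length must be a multiple of 2 bytes")
    fmt = '%dH' % count
    # Read the words as big-endian, write them back as little-endian.
    return pack('<' + fmt, *unpack('>' + fmt, data))

orig = unhexlify('b62e000052e366667a66408d')
swapped = swap_words(orig)
number, value = unpack('<Id', swapped)
print(number, value)
```

H is used here instead of h, which makes no difference to the swapped bytes.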

Edit: I just realized I get to use my favorite Python idiom for this... Don't do this either:

orig = 'b62e000052e366667a66408d'
swap = ''.join(sum([(c,d,a,b) for a,b,c,d in zip(*[iter(orig)]*4)], ()))
# '2eb60000e3526666667a8d40'
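For readers unfamiliar with the idiom, the following sketch (assuming Python 3) unrolls it into separate steps. The variable names are illustrative only. Passing the same iterator to zip four times makes zip pull four consecutive characters for each tuple.

```python
orig = 'b62e000052e366667a66408d'

# One shared iterator: each zip step consumes four characters in order.
it = iter(orig)
groups = list(zip(it, it, it, it))
print(groups[0])  # ('b', '6', '2', 'e')

# Within each group of four hex digits, move the second byte in front of the first.
swap = ''.join(c + d + a + b for a, b, c, d in groups)
print(swap)  # 2eb60000e3526666667a8d40
```

Note that zip silently drops a trailing group of fewer than four characters, so this only suits inputs whose length is a multiple of 4.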

Solution 3

The swap from 'data_string_in_orig' to 'data_string_in_swapped' may also be done with comprehensions without using any imports:

>>> d = 'b62e000052e366667a66408d'
>>> "".join([m[2:4]+m[0:2] for m in [d[i:i+4] for i in range(0,len(d),4)]])
'2eb60000e3526666667a8d40'

The comprehension swaps the byte order in hex strings representing 16-bit words. Modifying it for a different word length is trivial. A general function for swapping hex-digit order is also possible:

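def swap_order(d, wsz=4, gsz=2):
    # Split d into words of wsz nibbles, then reverse the gsz-nibble groups within each word.
    words = [d[i:i+wsz] for i in range(0, len(d), wsz)]
    return "".join("".join(m[i:i+gsz] for i in range(wsz-gsz, -gsz, -gsz)) for m in words)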

The input parameters are:

d: the input hex string

wsz: the word size in nibbles (e.g. for 16-bit words wsz=4, for 32-bit words wsz=8)

gsz: the number of nibbles that stay together (e.g. for reordering bytes gsz=2, for reordering 16-bit words gsz=4)
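A short usage sketch of the parameters described above, assuming Python 3. The function is repeated here only so the block runs on its own; the sample inputs other than the question's string are made up for illustration.

```python
def swap_order(d, wsz=4, gsz=2):
    return "".join(["".join([m[i:i+gsz] for i in range(wsz-gsz, -gsz, -gsz)])
                    for m in [d[i:i+wsz] for i in range(0, len(d), wsz)]])

# Default: swap the two bytes of every 16-bit word.
bytes_in_16 = swap_order('b62e000052e366667a66408d')
print(bytes_in_16)  # 2eb60000e3526666667a8d40

# Reverse all four bytes of a 32-bit word.
bytes_in_32 = swap_order('0000b62e', wsz=8, gsz=2)
print(bytes_in_32)  # 2eb60000

# Swap the two 16-bit halves of a 32-bit word, keeping each half intact.
halves_in_32 = swap_order('0000b62e', wsz=8, gsz=4)
print(halves_in_32)  # b62e0000
```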


Author by

Wolfgang R.

Updated on July 13, 2022

Comments

Wolfgang R. almost 2 years
I want to build a small formatter in Python that gives me back the numeric values embedded in lines of hex strings.

It is a central part of my formatter and should be reasonably fast, able to format more than 100 lines/sec (each line about 100 chars).

The code below should give an example where I'm currently blocked.

'data_string_in_orig' shows the given input format. It has to be byte swapped for each word. The swap from 'data_string_in_orig' to 'data_string_in_swapped' is needed. In the end I need the structure access as shown. The expected result is within the comment.

Thanks in advance Wolfgang R
```
#!/usr/bin/python

import binascii
import struct

## 'uint32 double'
data_string_in_orig    = 'b62e000052e366667a66408d'
data_string_in_swapped = '2eb60000e3526666667a8d40'
print data_string_in_orig

packed_data = binascii.unhexlify(data_string_in_swapped)
s = struct.Struct('<Id')
unpacked_data = s.unpack_from(packed_data, 0)  
print 'Unpacked Values:', unpacked_data

## Unpacked Values: (46638, 943.29999999943209)

exit(0)
```
Kenan Banks over 11 years

array.byteswap. Sweet. Guess I'll go ahead and not post the kludgy unpack big-endian / repack little-endian solution I had cooking...
unutbu over 11 years

Go ahead and post it! Having more than one way to solve a problem can be useful.
Wolfgang R. over 11 years

Thanks, this was fast and perfect for me. By the way 100k lines in 5 sec.
Kenan Banks over 11 years

@WolfgangR. - if this solution worked for you, you should accept the answer.