Best way to replace \x00 in python lists?

28,322

Solution 1

>>> L = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
>>> [[x[0]] for x in L]
[['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
>>> [[x[0].replace('\x00', '')] for x in L]
[['.text'], ['.data'], ['.rsrc']]

Or to modify the list in place instead of creating a new one:

for x in L:
    x[0] = x[0].replace('\x00', '')

Solution 2

lst = (i[0].rstrip('\x00') for i in List)
for j in lst:
    print j,
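
Note that rstrip('\x00') only strips null bytes from the right-hand end of the string, which is enough for padded PE section names, whereas replace() removes them everywhere. A minimal sketch of the difference, written so that it also runs on Python 3; the sample string `padded` is made up for illustration and is not from the question:

```python
# Hypothetical sample with an embedded NUL as well as trailing padding.
padded = 'a\x00b\x00\x00'

trailing_only = padded.rstrip('\x00')    # strips NULs from the right end only
everywhere = padded.replace('\x00', '')  # strips every NUL in the string

print(repr(trailing_only))
print(repr(everywhere))
```

If the names can only ever be padded at the end, either call should do; replace() is the safer choice when a null byte might appear in the middle.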

Solution 3

Try a unicode pattern, like this:

re.sub(u'\x00', '', s)

It should give the following results:

import re

l = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
for x in l:
    for s in x:
        print re.sub(u'\x00', '', s)

.text
.data
.rsrc

Or, using list comprehensions:

[[re.sub(u'\x00', '', s) for s in x] for x in l]

Actually, it should work without the 'u' in front of the string as well. Just remove the first three backslashes from your original pattern ('\\\\x00'), and use this as your regex pattern:

'\x00'
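
To illustrate why the number of backslashes matters, here is a small sketch (assuming a single section name `s` taken from the question's data) comparing the two patterns. It is only meant to show the difference in what each pattern matches:

```python
import re

s = '.text\x00\x00\x00'

# '\x00' in a normal string literal is the single NUL character,
# so this pattern matches the padding bytes themselves.
cleaned = re.sub('\x00', '', s)

# '\\\\x00' is the regex for a literal backslash followed by "x00".
# That text never occurs in s, so nothing is replaced.
unchanged = re.sub('\\\\x00', '', s)

print(repr(cleaned))
print(unchanged == s)
```

The four-backslash pattern only finds something when it is run against the repr of the string (for example the result of str() on the inner list), where the null bytes have been turned into the four visible characters \x00.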

Solution 4

What you really want to do is remove the '\x00' characters from the strings in a list.

Towards that goal, people often overlook the fact that in Python 2 the translate() method of non-Unicode strings can also delete characters through its optional second argument, and with None as the table it only deletes, as illustrated below. (str.translate() doesn't accept this argument in Python 3, because strings are Unicode objects by default.)

Your List data structure seems a little odd, since it's a list of one-element lists consisting of just single strings. Regardless, in the code below I've renamed it sections since Capitalized words should only be used for the names of classes according to PEP 8 -- Style Guide for Python Code.

sections = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]

for section in sections:
    test = section[0].translate(None, '\x00')
    print test

Output:

.text
.data
.rsrc
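
For readers on Python 3, a rough equivalent can be sketched with a translation table that maps the NUL code point to None. This is an adaptation rather than part of the original answer, and the names `delete_nul`, `names` and `raw` are illustrative only:

```python
sections = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]

# Python 3: str.translate() takes a single table; mapping a code point
# to None deletes that character.
delete_nul = {0: None}  # equivalently: str.maketrans('', '', '\x00')

names = [section[0].translate(delete_nul) for section in sections]
print(names)

# bytes objects still accept a delete argument in Python 3.
raw = b'.text\x00\x00\x00'
print(raw.translate(None, b'\x00'))
```

If the PE data is read as bytes, the second form is closest to the Python 2 call shown above.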

Solution 5

I think a better way to take care of this particular problem is to use the following function:

import string

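for item in List:
    # Filter the string itself: str(item) would give the repr of the inner
    # list, in which the null bytes appear as printable "\x00" text.
    item[0] = ''.join(filter(lambda x: x in string.printable, item[0]))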

This will get rid of not just \x00 but any other non-printable characters, wherever they appear in your string.
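
One caveat worth spelling out: string.printable covers ASCII only, so this approach also drops accented or other non-ASCII characters. A minimal sketch of the idea as a reusable helper (the name `keep_printable` is hypothetical, and the sample inputs are made up):

```python
import string

def keep_printable(text):
    """Drop every character that is not in string.printable (ASCII only)."""
    return ''.join(ch for ch in text if ch in string.printable)

print(keep_printable('.text\x00\x00\x00'))
print(keep_printable('a\x01b\x7fc'))
```

Because the filter is this broad, plain replace('\x00', '') may be preferable when only null padding needs to go.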

Author by

user2292661

Updated on July 19, 2022

Comments

  • user2292661
    user2292661 almost 2 years

    I have a list of values from a parsed PE file that include \x00 null bytes at the end of each section name. I want to be able to remove the \x00 bytes from the strings without removing all "x"s from them. I have tried using .replace and re.sub, but without much success.

    Using Python 2.6.6

    Example.

    import re
    
    List = [['.text\x00\x00\x00'], ['.data\x00\x00\x00'], ['.rsrc\x00\x00\x00']]
    
    count = 0
    while count < len(List):
        test = re.sub('\\\\x00', '', str(List[count]))
        print test
        count += 1
    
    >>>test  (removes x, but I want to keep it) #changed from tet to test
    >>>data
    >>>rsrc
    

    I want to get the following output

    text data rsrc

    Any ideas on the best way of going about this?

  • Luka Rahne
    Luka Rahne about 11 years
    You don't need to make new lists or make replacements when you can use iterators. They are free to create; they are literally transformation expressions.
  • jamylak
    jamylak about 11 years
    @LukaRahne Are you talking about generator expressions? Anyway this is just a small example, depending on the OP's needs he can do that if he wants
  • user2292661
    user2292661 about 11 years
    Is there a way to get rid of the brackets in the list to get just the data values? For example, with [['.text'], ['.data']], I want to loop over the list (say, for section in sectionlist) and on the next line use section[0]; that gives me ['.text'], but I just want .text. How can you do that?