How to decode an encoded Excel file using Python

Solution 1

You're almost there. Since the decrypted object is a bytes object, why not wrap it in a BytesIO buffer so pandas can read it like a file?

import io
import pandas as pd

toread = io.BytesIO()
toread.write(decrypted)  # pass your `decrypted` bytes as the argument here
toread.seek(0)  # reset the pointer

df = pd.read_excel(toread)  # now read to dataframe
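
Before handing the buffer to pandas, it can help to confirm that the decoded bytes really look like a workbook. The PK\x03\x04 prefix in the question's output is the ZIP signature, and an .xlsx file is a ZIP archive that contains [Content_Types].xml. Here is a minimal sketch using only the standard library. `looks_like_xlsx` and the stand-in archive are hypothetical names made up for illustration; they are not part of pandas or of the original answer:

```python
import base64
import io
import zipfile


def looks_like_xlsx(b64_payload):
    """Return True if the base64 payload decodes to a ZIP archive
    that contains the [Content_Types].xml entry found in .xlsx files."""
    raw = base64.b64decode(b64_payload)
    if not raw.startswith(b"PK\x03\x04"):  # ZIP local file header signature
        return False
    buffer = io.BytesIO(raw)
    if not zipfile.is_zipfile(buffer):
        return False
    with zipfile.ZipFile(buffer) as archive:
        return "[Content_Types].xml" in archive.namelist()


# Stand-in payload: a tiny ZIP built in memory, NOT a real workbook.
stand_in = io.BytesIO()
with zipfile.ZipFile(stand_in, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
payload = base64.b64encode(stand_in.getvalue())

print(looks_like_xlsx(payload))
print(looks_like_xlsx(base64.b64encode(b"plain text")))
```

This only rules out obviously wrong input: other ZIP-based Office formats carry the same entry, so a passing check does not guarantee that pd.read_excel will succeed.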

Answering your question from your comment: How to convert a df to a binary encoded object?

Well, if you want pandas to write the dataframe out as an Excel file and then convert that back to a base64-encoded object, then:

import base64

towrite = io.BytesIO()
df.to_excel(towrite)  # write to BytesIO buffer
towrite.seek(0)  # reset pointer
encoded = base64.b64encode(towrite.read())  # encoded object

To write the encoded object to a file (just to close the loop :P):

with open("file.txt", "wb") as f:
    f.write(encoded)
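
To check that nothing is lost along the way, the encoded file can be read back and decoded, then compared with the original bytes. This is a rough sketch with made-up stand-in bytes in place of a real workbook; `save_encoded` and `load_decoded` are hypothetical helper names, and a temporary directory is used so no real file is overwritten:

```python
import base64
import os
import tempfile


def save_encoded(raw, path):
    """Base64-encode raw bytes and write the encoded text to path."""
    with open(path, "wb") as f:
        f.write(base64.b64encode(raw))


def load_decoded(path):
    """Read base64 text from path and return the decoded bytes."""
    with open(path, "rb") as f:
        return base64.b64decode(f.read())


# Stand-in for the bytes held in the `towrite` buffer; not a real workbook.
workbook_bytes = b"PK\x03\x04 stand-in for workbook bytes"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.txt")
    save_encoded(workbook_bytes, path)
    restored = load_decoded(path)

print(restored == workbook_bytes)
```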

Solution 2

You can also do this with the openpyxl module. Here is the modified code:

import base64
import io
import openpyxl

with open('encoded_data.txt', 'rb') as d:
    data = d.read()
print(data)
decrypted = base64.b64decode(data)
print(decrypted)

# Wrap the decoded bytes in a file-like object for openpyxl
xls_filelike = io.BytesIO(decrypted)
workbook = openpyxl.load_workbook(xls_filelike)
sheet_obj = workbook.active
max_col = sheet_obj.max_column
max_row = sheet_obj.max_row

# Print all the row values, separating the cells with commas for readability
for i in range(1, max_row + 1):
    for j in range(1, max_col + 1):
        cell_obj = sheet_obj.cell(row=i, column=j)
        print(cell_obj.value, end=",")
    print()
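
A note on the decoding step: base64 text produced by MIME-style encoders is often wrapped into lines, and that may be the case for the output of the Java encoder named in the question (an assumption, since the question does not show the raw file). By default base64.b64decode discards characters outside the base64 alphabet, so wrapped input decodes to the same bytes as unwrapped input. Here is a small sketch that uses base64.encodebytes to produce wrapped text from made-up stand-in bytes:

```python
import base64

# Stand-in bytes; a real workbook would be much larger.
raw = bytes(range(256)) * 2

wrapped = base64.encodebytes(raw)  # newline after every 76 output characters
unwrapped = base64.b64encode(raw)  # one long line, no newlines

# Default (validate=False) decoding skips the newlines in the wrapped text.
decoded_wrapped = base64.b64decode(wrapped)
decoded_unwrapped = base64.b64decode(unwrapped)

print(decoded_wrapped == decoded_unwrapped == raw)
```

Passing validate=True to b64decode would instead raise binascii.Error on the newlines, so the default is the safer choice for files that may be line-wrapped.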
pyd
Author by

pyd

Just a Tech.

Updated on June 04, 2022

Comments

  • pyd
    pyd almost 2 years

    My Java programmer converted an Excel file to binary and is sending the binary content to me.

    He used sun.misc.BASE64Encoder and sun.misc.BASE64Decoder() for encoding.

    I need to convert that binary data to a data frame using Python.

    The data looks like:

    UEsDBBQABgAIAAAAIQBi7p1oXgEAAJAEAAATAAgCW0NvbnRlbnRfVHl........

    I tried the base64 decoder, but it did not help.

    my code:

    import base64
    with open('encoded_data.txt','rb') as d:
        data=d.read()
    print(data)
    `UEsDBBQABgAIAAAAIQBi7p1oXgEAAJAEAAATAAgCW0NvbnRlbnRfVHl........`
    decrypted=base64.b64decode(data)
    print(decrypted)
      'PK\x03\x04\x14\x00\x06\x00\x08\x00\x00\x00!\x00b\xee\x9dh^\x01\x00\x00\x90\x04\x00\x00\x13\x00\x08\x02[Content_Types].xml \xa2\x04\x02(\xa0\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
    

    Please help me to convert this binary data to a pandas dataframe.

  • Pyd
    Pyd over 5 years
    Thank you so much, worked perfectly. I have not used io module
  • Scratch'N'Purr
    Scratch'N'Purr over 5 years
    @pyd Updated my answer to address your comment :)
  • Pyd
    Pyd over 5 years
    AttributeError: '_io.BytesIO' object has no attribute 'write_cells' getting this error on df.to_excel(towrite)
  • Scratch'N'Purr
    Scratch'N'Purr over 5 years
    You are probably using a different excel writer engine than me. This post should help: stackoverflow.com/questions/28058563/…
  • Pyd
    Pyd over 5 years
    Ok I'll try that one
  • Yamur
    Yamur about 3 years
    What is this? xls_filelike = io.BytesIO(decoded_data)? where is decoded_data?