Converting Excel into JSON using Python

34,389

Solution 1

Went through a couple of solutions, this is the one that worked best for me. Hope this can save someone else some time.

import pandas
import json

# Read excel document
excel_data_df = pandas.read_excel('data.xlsx', sheet_name='sheet1')

# Convert excel to string 
# (define orientation of document in this case from up to down)
thisisjson = excel_data_df.to_json(orient='records')

# Print out the result
print('Excel Sheet to JSON:\n', thisisjson)

# Make the string into a list to be able to input in to a JSON-file
thisisjson_dict = json.loads(thisisjson)

# Define file to write to and 'w' for write option -> json.dump() 
# defining the list to write from and file to write to
with open('data.json', 'w') as json_file:
    json.dump(thisisjson_dict, json_file)

Solution 2

I prefer using xlrd to convert the Excel rows into a JSON format.

import xlrd
from collections import OrderedDict
import json

Open the workbook and select the first worksheet

wb = xlrd.open_workbook("Excel-sheet location here")
sh = wb.sheet_by_index(0)

Create a list to hold dictionaries

data_list = []

Iterate through each row in worksheet and fetch values into dict

for rownum in range(1, sh.nrows):
     data = OrderedDict()

row_values = sh.row_values(rownum)
data['<Column Name1>'] = row_values[0]
data['<Column Name2>'] = row_values[1]
data_list.append(data)

Write to file:

 with open("RulesJson.json", "w", encoding="utf-8") as writeJsonfile:
      json.dump(data_list, writeJsonfile, indent=4,default=str) 

Solution 3

There are 2 approaches for achieving the result:

  1. Using excel2json. It's quite a simple tool but might be helpful for you.
    • First, install the package excel2json-3 using pip.
    • Then, running this code block should output a JSON file for each sheet in the file:
import excel2json

excel2json.convert_from_file('excel_form.xlsx')

  1. Using pandas. If you are looking for a more comprehensive solution, you might as well find pandas useful. It is a library made for data manipulation and has many more features.
    • First install pandas via pip.
    • Then, running this code should result with printing a JSON string describing the excel sheet called Sheet.
import pandas

excel_data_fragment = pandas.read_excel('excel_form.xlsx', sheet_name='Sheet')

json_str = excel_data_fragment.to_json()

print('Excel Sheet to JSON:\n', json_str)

Link to the source

Share:
34,389

Related videos on Youtube

NW_92
Author by

NW_92

Updated on February 26, 2022

Comments

  • NW_92
    NW_92 about 2 years

    I have an excel file and I want to convert it into a JSON file. So the excel is something like this:

    -------------------------
    | Col A | Col C | Col F |
    --------+-------+--------
    |   1   |   A   |   EE  |
    |   2   |   B   |   FF  |
    |   4   |   C   |   FF  |
    |   5   |   D   |   HH  |
    |   6   |   D   |   HH  |
    |   7   |   A   |   EE  |
    |   8   |   E   |   EE  |
    --------------------------
    

    I would like the JSON to follow this format:

    {
    "EE": {
        "A": {
          "Col A key": "1",
          "Col A key": "7"
        },
        "E": {
          "Col A key": "8"
        },
      },
    
    "FF": {
        "B": {
          "Col A key": "2"
        },
        "C": {
          "Col A key": "4"
        }
      },
    
    "HH": {
        "D": {
          "Col A key": "5",
          "Col A key": "6"
        }
      }
    
    }
    

    Can anyone help me implement this using python? I've tried various ways but without success. This is what I have done so far:

    
    import openpyxl, pprint, json
    print('Opening workbook...')
    wb = openpyxl.load_workbook('excel_form.xlsx')
    sheet = wb.get_sheet_by_name('Sheet')
    
    
    excel_data = {}
    print('Reading rows...')
    for row in range(2, sheet.max_row + 1):
        Col F  = sheet['F' + str(row)].value
        Col C = sheet['C' + str(row)].value
        Col A = sheet['A' + str(row)].value
    
        excel_data.setdefault(Col F, {})
        excel_data[Col F].setdefault(Col C, {'Col A': Col A})
    
    
    # Open a new text file and write the contents of excel_data to it.
    print('Writing results...')
    with open('DATA.json', 'w') as resultFile:
        json.dump(Matrix, resultFile)
    print('Done.')
    

    Thanks in advance

    • Scott Hunter
      Scott Hunter over 4 years
      What is the specific problem with the posted code?
    • Plato77
      Plato77 over 4 years
      How about this previous answer: stackoverflow.com/questions/53940031/…
    • Massimo Di Saggio
      Massimo Di Saggio over 4 years
      Imo you should consider using the simplejson package instead of trying to reinvent the wheel.
  • muthu
    muthu almost 4 years
    It thows AttributeError: module 'excel2json' has no attribute 'convert_from_file' error
  • aydow
    aydow about 3 years
    pandas is a bit of overkill if you just want to read in the data. if you want to transform the data in some way that leverages pandas, then that would be more ideal.
  • s.bramblet
    s.bramblet about 2 years
    NameError: name 'json' is not defined
  • Michael Behrens
    Michael Behrens almost 2 years
    Worked for me after I changed 'sheet1' to 'Sheet1', as in my case, the default sheet name had a capital S. I also had to install two packages with pip3: pandas and openpyxl.