Python: Converting excel file to JSON format

14,824

Solution 1

That should do it:

import xlrd
from collections import OrderedDict
import simplejson as json
# Open the workbook and select the first worksheet
wb = xlrd.open_workbook('D:\\android\\testdata2.xlsx')
sh = wb.sheet_by_index(0)
# List to hold dictionaries
data_list = []
# Iterate through each row in worksheet and fetch values into dict
for rownum in range(1, sh.nrows):
    data = OrderedDict()
    row_values = sh.row_values(rownum)
    data['pattern'] = row_values[0]
    data['response'] = row_values[1]
    data_list.append(data)
data_list = {'intents': data_list} # Added line
# Serialize the list of dicts to JSON
j = json.dumps(data_list)
# Write to file
with open('data1.json', 'w') as f:
    f.write(j)

Note the added data_list = {'intents': data_list}.

Solution 2

Give a shot to pyexcel_xlsx library in python. I have used this for converting xlsx to json. Sweet and simple one. And fast also as compared to other python libraries.

Sample code:

from pyexcel_xlsx import get_data;
import time;
import json;

data = get_data("D:\\android\\testdata2.xlsx")
sheetName = "Table A";

data_list = []
# Iterate through each row and append in above list
for i in range(0, len(data[sheetName])):
    data_list.append({
        'pattern' : data[sheetName][i][0],
        'response' : data[sheetName][i][1]
    })
data_list = {'intents': data_list} # Converting to required object
j = json.dumps(data_list)
# Write to file
with open('data1.json', 'w') as f:
    f.write(j)
Share:
14,824
Pavan Rajput
Author by

Pavan Rajput

Updated on June 14, 2022

Comments

  • Pavan Rajput
    Pavan Rajput almost 2 years

    I am creating a ML model that will use a JSON file to understand the pattern and response format. As I have my data in excel format I converted it to JSON in python.

    Here is the code:

    import xlrd
    from collections import OrderedDict
    import simplejson as json
    # Open the workbook and select the first worksheet
    wb = xlrd.open_workbook('D:\\android\\testdata2.xlsx')
    sh = wb.sheet_by_index(0)
    # List to hold dictionaries
    data_list = []
    # Iterate through each row in worksheet and fetch values into dict
    for rownum in range(1, sh.nrows):
        data = OrderedDict()
        row_values = sh.row_values(rownum)
        data['pattern'] = row_values[0]
        data['response'] = row_values[1]
        data_list.append(data)
    # Serialize the list of dicts to JSON
    j = json.dumps(data_list)
    # Write to file
    with open('data1.json', 'w') as f:
        f.write(j)
    

    I am the getting the output as:

    [{
        "pattern": "WALLSTENT NON COUVERTE ",
        "response": "ENDOPROTHESE STENT  VASCULAIRE "
    }, {
        "pattern": "PRIMEADVANCED SURSCAN MRI ",
        "response": "NEUROSTIMULATEUR NERF VAGUE GAUCHE "
    }, {
        "pattern": "AVASTIN  FLACON DE",
        "response": "BEVACIZUMAB"
    }, {
        "pattern": "PERJETA SOLUTION A DILUER POUR PERFUSION",
        "response": "BRENTUXIMAB VEDOTIN"
    }]
    

    The desired output I am looking for is like this:

    {
        "intents": [{
            "pattern": ["WALLSTENT, NON, COUVERTE "],
            "response": ["ENDOPROTHESE STENT  VASCULAIRE] "
        }, {
            "pattern": ["PRIMEADVANCED ,SURSCAN ,MRI"] ,
            "response": ["NEUROSTIMULATEUR NERF VAGUE GAUCHE "]
        }, {
            "pattern": ["AVASTIN , FLACON ,DE"],
            "response": ["BEVACIZUMAB"]
        }, {
            "pattern": ["PERJETA, SOLUTION, A, DILUER, POUR ,PERFUSION"],
            "response": ["BRENTUXIMAB VEDOTIN"]
        }]
    }
    

    What modification can I do in my function to get the output I am looking for.

  • Pavan Rajput
    Pavan Rajput over 5 years
    i added the line and it gives this error as i run the program. "list indices must be integers or slices, not str"
  • TheNavigat
    TheNavigat over 5 years
    Apologies, I fixed it.
  • TheDevOpsGuru
    TheDevOpsGuru almost 5 years
    Note that in the python shell there needs to be a new line between data_list.append(data) and data_list = {'intents': data_list}