Append multiple Excel files (xlsx) together in Python


Solution 1

Let all_data be a list.

import glob
import pandas as pd

all_data = []
for f in glob.glob("output/test/*.xlsx"):
    all_data.append(pd.read_excel(f))

Now, call pd.concat:

df = pd.concat(all_data, ignore_index=True)

Make sure all files have the same column names; if they don't, concat will still run, but any column missing from a file will be filled with NaN in the result.


You could also use a map version of the for loop above:

g = map(pd.read_excel, glob.glob("output/test/*.xlsx"))
df = pd.concat(list(g), ignore_index=True)
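
Note that pd.concat accepts any iterable of DataFrames, so wrapping the map object in list() is optional; a sketch of the same thing without it:

df = pd.concat(map(pd.read_excel, glob.glob("output/test/*.xlsx")), ignore_index=True)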

Or the list comprehension method shown in Solution 2 below.

Solution 2

Use list comprehension + concat:

all_data = [pd.read_excel(f) for f in glob.glob("output/test/*.xlsx")]
df = pd.concat(all_data, ignore_index=True)
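
To write the combined frame out as a single workbook (as asked in the comments below), call to_excel on the result. A minimal sketch; the file name combined.xlsx is just a placeholder:

df.to_excel("combined.xlsx", index=False)  # index=False omits pandas' row index column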

Comments

  • user3821872
    user3821872 almost 2 years
    import pandas as pd
    import os
    import glob
    
    
    all_data = pd.DataFrame()
    for f in glob.glob("output/test*.xlsx"):
        df = pd.read_excel(f)
        all_data = all_data.append(df, ignore_index=True)
    

    I want to put multiple xlsx files into one xlsx. The Excel files are in the output/test folder. The columns are the same in all of them, but I want to concatenate the rows. The code above doesn't seem to work.

  • user3821872
    user3821872 over 6 years
    I want to fetch all the files in the 'test' folder. The test folder contains the Excel files.
  • cs95
    cs95 over 6 years
    @user3821872 use "output/test/*.xlsx". See the edit, buddy.
  • user3821872
    user3821872 over 6 years
    I did that and it works, but I want the output in an Excel file. Is there a way to do that? Is there a way to see the output of the concat?
  • cs95
    cs95 over 6 years
    @user3821872 Call df.to_excel('file.xlsx'). Further questions go in a new question post.
  • pylearner
    pylearner about 2 years
    I'm guessing this will cause memory issues when we have to run over 2000 files. Is there a way to reduce the memory use while loading and writing them?
  • jezrael
    jezrael about 2 years
    @pylearner - simplest is to process the first 1000 files and then the next 1000; it depends on your RAM size.
  • jezrael
    jezrael about 2 years
    and then concat both
  • pylearner
    pylearner about 2 years
    So you mean it won't fail for 1000 files either? If there are very big files among those 1000, it may still lead to memory issues. How do I handle this?
  • jezrael
    jezrael about 2 years
    @pylearner - how many files succeeded before it failed?
  • pylearner
    pylearner about 2 years
    680... file 681 threw a memory error.
  • jezrael
    jezrael about 2 years
    @pylearner - so concat each batch of 500 files and write the output to an intermediate file, then concat the 4 intermediate files at the end - 0-500, 501-1000, 1001-1500, 1501-2000 (see the sketch after this thread).
  • pylearner
    pylearner about 2 years
    Can you please reply in the room?
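
Putting jezrael's batching advice into code: below is a minimal sketch that concatenates the files in batches, spills each batch to an intermediate workbook so only one batch of frames is in memory at a time, and then concatenates the intermediate files. The batch size and file names are assumptions; tune them to your RAM.

import glob
import os

import pandas as pd

files = sorted(glob.glob("output/test/*.xlsx"))
BATCH_SIZE = 500  # assumption: adjust to your available RAM

# Pass 1: concat each batch and write it to an intermediate workbook,
# so only BATCH_SIZE frames are held in memory at once.
batch_paths = []
for i in range(0, len(files), BATCH_SIZE):
    batch = [pd.read_excel(f) for f in files[i:i + BATCH_SIZE]]
    path = f"batch_{i // BATCH_SIZE}.xlsx"  # hypothetical intermediate name
    pd.concat(batch, ignore_index=True).to_excel(path, index=False)
    batch_paths.append(path)

# Pass 2: concat the few intermediate workbooks into the final file.
df = pd.concat((pd.read_excel(p) for p in batch_paths), ignore_index=True)
df.to_excel("combined.xlsx", index=False)

# Clean up the intermediate files.
for p in batch_paths:
    os.remove(p)

The two-pass spill keeps peak memory at roughly one batch of frames plus the final concat of a handful of intermediate files, which is the trade jezrael describes above.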