How to copy worksheet from one workbook to another one using openpyxl?

32,626

Solution 1

I had the same problem. For me style, format, and layout were very important. Moreover, I did not want to copy formulas but only the value (of the formulas). After a lot of trail, error, and stackoverflow I came up with the following functions. It may look a bit intimidating but the code copies a sheet from one Excel file to another (possibly existing file) while preserving:

  1. font and color of text
  2. filled color of cells
  3. merged cells
  4. comment and hyperlinks
  5. format of the cell value
  6. the width of every row and column
  7. whether or not row and column are hidden
  8. frozen rows

It is useful when you want to gather sheets from many workbooks and bind them into one workbook. I copied most attributes but there might be a few more. In that case you can use this script as a jumping off point to add more.

###############
## Copy a sheet with style, format, layout, ect. from one Excel file to another Excel file
## Please add the ..path\\+\\file..  and  ..sheet_name.. according to your desire.

import openpyxl
from copy import copy

def copy_sheet(source_sheet, target_sheet):
    copy_cells(source_sheet, target_sheet)  # copy all the cel values and styles
    copy_sheet_attributes(source_sheet, target_sheet)


def copy_sheet_attributes(source_sheet, target_sheet):
    target_sheet.sheet_format = copy(source_sheet.sheet_format)
    target_sheet.sheet_properties = copy(source_sheet.sheet_properties)
    target_sheet.merged_cells = copy(source_sheet.merged_cells)
    target_sheet.page_margins = copy(source_sheet.page_margins)
    target_sheet.freeze_panes = copy(source_sheet.freeze_panes)

    # set row dimensions
    # So you cannot copy the row_dimensions attribute. Does not work (because of meta data in the attribute I think). So we copy every row's row_dimensions. That seems to work.
    for rn in range(len(source_sheet.row_dimensions)):
        target_sheet.row_dimensions[rn] = copy(source_sheet.row_dimensions[rn])

    if source_sheet.sheet_format.defaultColWidth is None:
        print('Unable to copy default column wide')
    else:
        target_sheet.sheet_format.defaultColWidth = copy(source_sheet.sheet_format.defaultColWidth)

    # set specific column width and hidden property
    # we cannot copy the entire column_dimensions attribute so we copy selected attributes
    for key, value in source_sheet.column_dimensions.items():
        target_sheet.column_dimensions[key].min = copy(source_sheet.column_dimensions[key].min)   # Excel actually groups multiple columns under 1 key. Use the min max attribute to also group the columns in the targetSheet
        target_sheet.column_dimensions[key].max = copy(source_sheet.column_dimensions[key].max)  # https://stackoverflow.com/questions/36417278/openpyxl-can-not-read-consecutive-hidden-columns discussed the issue. Note that this is also the case for the width, not onl;y the hidden property
        target_sheet.column_dimensions[key].width = copy(source_sheet.column_dimensions[key].width) # set width for every column
        target_sheet.column_dimensions[key].hidden = copy(source_sheet.column_dimensions[key].hidden)


def copy_cells(source_sheet, target_sheet):
    for (row, col), source_cell in source_sheet._cells.items():
        target_cell = target_sheet.cell(column=col, row=row)

        target_cell._value = source_cell._value
        target_cell.data_type = source_cell.data_type

        if source_cell.has_style:
            target_cell.font = copy(source_cell.font)
            target_cell.border = copy(source_cell.border)
            target_cell.fill = copy(source_cell.fill)
            target_cell.number_format = copy(source_cell.number_format)
            target_cell.protection = copy(source_cell.protection)
            target_cell.alignment = copy(source_cell.alignment)

        if source_cell.hyperlink:
            target_cell._hyperlink = copy(source_cell.hyperlink)

        if source_cell.comment:
            target_cell.comment = copy(source_cell.comment)


wb_target = openpyxl.Workbook()
target_sheet = wb_target.create_sheet(..sheet_name..)

wb_source = openpyxl.load_workbook(..path\\+\\file_name.., data_only=True)
source_sheet = wb_source[..sheet_name..]

copy_sheet(source_sheet, target_sheet)

if 'Sheet' in wb_target.sheetnames:  # remove default sheet
    wb_target.remove(wb_target['Sheet'])

wb_target.save('out.xlsx')

Solution 2

i found a way playing around with it

import openpyxl

xl1 = openpyxl.load_workbook('workbook1.xlsx')
# sheet you want to copy
s = openpyxl.load_workbook('workbook2.xlsx').active
s._parent = xl1
xl1._add_sheet(s)
xl1.save('some_path/name.xlsx')

Solution 3

You cannot use copy_worksheet() to copy between workbooks because it depends on global constants that may vary between workbooks. The only safe and reliable way to proceed is to go row-by-row and cell-by-cell.

You might want to read the discussions about this feature

Solution 4

My workaround goes like this:

You have a template file let's say it's "template.xlsx". You open it, make changes to it as needed, save it as a new file, close the file. Repeat as needed. Just make sure to keep a copy of the original template while testing/messing around.

Solution 5

I had a similar requirement to collate data from multiple workbooks into one workbook. As there are no inbuilt methods available in openpyxl.

I created the below script to do the job for me.

Note: In my usecase all worbooks contain data in same format.

from openpyxl import load_workbook
import os


# The below method is used to read data from an active worksheet and store it in memory.
def reader(file):
    global path
    abs_file = os.path.join(path, file)
    wb_sheet = load_workbook(abs_file).active
    rows = []
    # min_row is set to 2, to ignore the first row which contains the headers
    for row in wb_sheet.iter_rows(min_row=2):
        row_data = []
        for cell in row:
            row_data.append(cell.value)
        # custom column data I am adding, not needed for typical use cases
        row_data.append(file[17:-6])
        # Creating a list of lists, where each list contain a typical row's data
        rows.append(row_data)
    return rows


if __name__ == '__main__':
    # Folder in which my source excel sheets are present
    path = r'C:\Users\tom\Desktop\Qt'
    # To get the list of excel files
    files = os.listdir(path)
    for file in files:
        rows = reader(file)
        # below mentioned file name should be already created
        book = load_workbook('new.xlsx')
        sheet = book.active
        for row in rows:
            sheet.append(row)
        book.save('new.xlsx')
Share:
32,626

Related videos on Youtube

idillus
Author by

idillus

Updated on July 09, 2022

Comments

  • idillus
    idillus almost 2 years

    I have a large amount of EXCEL files (i.e. 200) I would like to copy one specific worksheet from one workbook to another one. I have done some investigations and I couldn't find a way of doing it with Openpyxl

    This is the code I have developed so far

    def copy_sheet_to_different_EXCEL(path_EXCEL_read,Sheet_name_to_copy,path_EXCEL_Save,Sheet_new_name):
    ''' Function used to copy one EXCEL sheet into another file.
        
        def path_EXCEL_read,Sheet_name_to_copy,path_EXCEL_Save,Sheet_new_name
        
    Input data:
        1.) path_EXCEL_read: the location of the EXCEL file along with the name where the information is going to be saved
        2.) Sheet_name_to_copy= The name of the EXCEL sheet to copy
        3.) path_EXCEL_Save: The path of the EXCEL file where the sheet is going to be copied
        3.) Sheet_new_name: The name of the new EXCEL sheet
        
    Output data:
        1.) Status= If 0, everything went OK. If 1, one error occurred.
    
    Version History:
    1.0 (2017-02-20): Initial version.
    
    '''
    status=0
    
    if(path_EXCEL_read.endswith('.xls')==1): 
        print('ERROR - EXCEL xls file format is not supported by openpyxl. Please, convert the file to an XLSX format')
        status=1
        return status
        
    try:
       wb = openpyxl.load_workbook(path_EXCEL_read,read_only=True)
    except:
        print('ERROR - EXCEL file does not exist in the following location:\n  {0}'.format(path_EXCEL_read))
        status=1
        return status
    
    Sheet_names=wb.get_sheet_names()    # We copare against the sheet name we would like to cpy
    
    if ((Sheet_name_to_copy in Sheet_names)==0):
        print('ERROR - EXCEL sheet does not exist'.format(Sheet_name_to_copy))
        status=1
        return status   
    
    # We checking if the destination file exists
    
    if (os.path.exists(path_EXCEL_Save)==1):
        #If true, file exist so we open it
        
        if(path_EXCEL_Save.endswith('.xls')==1): 
            print('ERROR - Destination EXCEL xls file format is not supported by openpyxl. Please, convert the file to an XLSX format')
            status=1
        return status
        
        try:
            wdestiny = openpyxl.load_workbook(path_EXCEL_Save)
        except:
            print('ERROR - Destination EXCEL file does not exist in the following location:\n  {0}'.format(path_EXCEL_read))
            status=1
        return status
    
        #we check if the destination sheet exists. If so, we will delete it
        
        destination_list_sheets = wdestiny.get_sheet_names()
        
        if((Sheet_new_name in destination_list_sheets) ==True):
            print('WARNING - Sheet "{0}" exists in: {1}. It will be deleted!'.format(Sheet_new_name,path_EXCEL_Save))
            wdestiny.remove_sheet(Sheet_new_name) 
    
    else:
        wdestiny=openpyxl.Workbook()
    # We copy the Excel sheet
        
    try:
        sheet_to_copy = wb.get_sheet_by_name(Sheet_name_to_copy) 
        target = wdestiny.copy_worksheet(sheet_to_copy)
        target.title=Sheet_new_name
    except:
        print('ERROR - Could not copy the EXCEL sheet. Check the file')
        status=1
        return status
    
    try:
        wdestiny.save(path_EXCEL_Save)
    except:
        print('ERROR - Could not save the EXCEL sheet. Check the file permissions')
        status=1
        return status
    
    #Program finishes
    return status
    
  • Vineesh TP
    Vineesh TP over 4 years
    Missing cell formats, if cell have muliple formats for text(Richtext)
  • Filippos Zofakis
    Filippos Zofakis over 4 years
    This answer just helps with removing the old worksheet, not copying over the new one
  • C. S. F. Junior
    C. S. F. Junior almost 4 years
    This answer just modify the original and creates a copy with another name/path. What if I don't want to update the original?
  • Ajay Kumar
    Ajay Kumar over 2 years
    This works for me as my expectation...thank you so much
  • Lead Developer
    Lead Developer over 2 years
    Great. it's working like a charm.
  • z33k
    z33k over 2 years
    Didn't work for me. I get write_stylesheet; xf.alignment = wb._alignments[style.alignmentId]; IndexError: list index out of range on save.
  • z33k
    z33k over 2 years
    Works. If only it also handled images...
  • z33k
    z33k over 2 years
    Update: The only thing you have to do for this magic to handle images is: add line target_sheet._images = copy(source_sheet._images) after target_sheet.freeze_panes = copy(source_sheet.freeze_panes) line in copy_sheet_attributes. Splendid! BTW, this is the second case where I wonder on the obscure reasons (see, I restrained myself and hasn't said dumb) openpyxl devs are hiding features before users. For the first case look here.
  • SurpriseDog
    SurpriseDog over 2 years
    Thanks! It works well in excel, but when I open the document in LibreOffice I get the error: "The data could not be loaded completely because the maximum number of rows per sheet was exceeded." - but then it proceeds to load the output file just fine so ¯_(ツ)_/¯
  • SurpriseDog
    SurpriseDog over 2 years
    Can you expand this answer with the source code you used?
  • tpk
    tpk about 2 years
    Thank you!!!!!!!
  • Michael Dausmann
    Michael Dausmann about 2 years
    not a bad try but my copied sheet comes out pretty wierd, formatting etc seems random. I am instead going with the solution below from Oscar
  • Corvax
    Corvax almost 2 years
    To copy auto_filter parameters you need to add target_sheet.auto_filter = copy(source_sheet.auto_filter) to copy_sheet_attributes
  • Corvax
    Corvax almost 2 years
    To optimize memory using better to use source_sheet._cells.iteritems() rather than source_sheet._cells.items().