How to keep style format unchanged after writing data using OpenPyXL package in Python?

16,128

Solution 1

According to the answer to this question, you can format cells in Excel using openpyxl.

The answer given there only changes the target cell to bold, but maybe you can change the font face back to lemons1.

from openpyxl.workbook import Workbook
from openpyxl.styles import Font
wb = Workbook()
ws = wb.active
ws['B3'] = "Hello"
ws['B3'].font =  Font(name='lemons1', size=14)
wb.save("FontDemo.xlsx")

However, according to the documentation, you can only apply styles to whole cells, not to part of a cell. So you would need to put the Khmer characters in one cell and the English characters in another cell.

Solution 2

Excel file (with .xlsx extension) is actually zip archive. (You can actually open excel file with 7-zip or some similar program.) So excel file contains bunch of xml files with data stored in them. What openpyxl does, is reading data from these xml files when opening excel file and creating zip archive with xml files when saving excel file. Simply sad, openpyxl reads some xml files, then it parses that data, than you can use funtions in openpyxl library to change and add data, and finally when you save your workbook, openpyxl will create xml files, write data to them, and save them as zip archive (which is excel file). These xml files contain all data stored in excel file, (one xml file contains formulas from excel file, other will contain styles, in other there will be data about excel theme and so on). We only care about strings in excel file which are stored in two xml file:

  • sharedStrings.xml

    This file contains all strings in excel file and formatting of those strings, here is an example:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
     <sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="1" uniqueCount="1">
         <si>
             <r>
                 <rPr>
                     <b/>
                     <sz val="22"/>
                     <color theme="1"/>
                     <rFont val="Calibri"/>
                     <family val="2"/>
                     <scheme val="minor"/>
                 </rPr>
                 <t>Hello</t>
             </r>
             <r>
                 <rPr>
                     <sz val="22"/>
                     <color theme="1"/>
                     <rFont val="Calibri"/>
                     <family val="2"/>
                     <scheme val="minor"/>
                 </rPr>
                 <t xml:space="preserve"> ត</t>
             </r>
         </si>
     </sst>
    
  • sheet1.xml

    This file contains position of your strings (which cell contains which string). (There will be one file for each sheet in you excel file, but let's say you have only one sheet in your file for purpose of this example.) Here is an example:

     <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
     <worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac xr xr2 xr3" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" xmlns:xr2="http://schemas.microsoft.com/office/spreadsheetml/2015/revision2" xmlns:xr3="http://schemas.microsoft.com/office/spreadsheetml/2016/revision3" xr:uid="{00000000-0001-0000-0000-000000000000}">
         <dimension ref="A1:C3"/>
         <sheetViews>
             <sheetView tabSelected="1" zoomScaleNormal="100" workbookViewId="0">
                 <selection activeCell="A3" sqref="A3"/>
             </sheetView>
         </sheetViews>
         <sheetFormatPr defaultRowHeight="15" x14ac:dyDescent="0.25"/>
         <cols>
             <col min="1" max="1" width="20.140625" customWidth="1"/>
             <col min="2" max="2" width="10.7109375" customWidth="1"/>
         </cols>
         <sheetData>
             <row r="1" spans="1:3" ht="60.75" customHeight="1" x14ac:dyDescent="0.45">
                 <c r="A1" s="4" t="s">
                     <v>0</v>
                 </c>
             </row>
             <row r="2" spans="1:3" ht="19.5" customHeight="1" x14ac:dyDescent="0.35">
                 <c r="A2" s="1"/>
                 <c r="B2" s="3"/>
             </row>
             <row r="3" spans="1:3" ht="62.25" customHeight="1" x14ac:dyDescent="0.5">
                 <c r="A3" s="5" t="s">
                     <v>1</v>
                 </c>
                 <c r="C3" s="2"/>
             </row>
         </sheetData>
         <pageMargins left="0.75" right="0.75" top="1" bottom="1" header="0.5" footer="0.5"/>
         <pageSetup paperSize="9" orientation="portrait" r:id="rId1"/>
     </worksheet>
    

IF you open this excel with openpyxl and than save it (without changing any data), this is what sharedStrings.xml will look like:

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" uniqueCount="2">
    <si>
        <t>Hello &#6031;</t>
    </si>
    <si>
        <t>&#6030; sike</t>
    </si>
</sst>

As you can see you will lose all cell (strings) original formatting and instead you will get some sort of merged formatting for your cells (So if some characters in cell are bold and some are not, then when you save file, either whole cell will be bold or whole cell will be normal). Now people had asked developers to implement this rich text option (link1, link2) but they sad that it would be complicated to implement something like this. I agree that this wouldn't be easy to do, but we can do something simpler: We can get data from sharedStrings.xml when we are opening excel file, and than use that xml code when we want to save excel file but only for cells that existed when we were opening file. This is probably not easy to understand so lets see following example:

Let's say you have excel file like this: enter image description here

For this excel file, sharedStrings.xml will be this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="1" uniqueCount="1">
    <si>
        <r>
            <rPr>
                <b/>
                <sz val="22"/>
                <color theme="1"/>
                <rFont val="Calibri"/>
                <family val="2"/>
                <scheme val="minor"/>
            </rPr>
            <t>Hello</t>
        </r>
        <r>
            <rPr>
                <sz val="22"/>
                <color theme="1"/>
                <rFont val="Calibri"/>
                <family val="2"/>
                <scheme val="minor"/>
            </rPr>
            <t xml:space="preserve"> ត</t>
        </r>
    </si>
</sst>

If you run this python code:

from openpyxl import load_workbook
workbook = load_workbook(FILENAME1, keep_links=False)
sheet = workbook.active
sheet['A2'] = 'It is me'
workbook.save('out.xlsx')

File out.xlsx will look like this:

enter image description here

For out.xlsx file, sharedStrings.xml will be this:

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" uniqueCount="2">
    <si>
        <t>Hello &#6031;</t>
    </si>
    <si>
        <t>It is me</t>
    </si>
</sst>

So what we want to do is use this xml code:

<si>
    <r>
        <rPr>
            <b/>
            <sz val="22"/>
            <color theme="1"/>
            <rFont val="Calibri"/>
            <family val="2"/>
            <scheme val="minor"/>
        </rPr>
        <t>Hello</t>
    </r>
    <r>
        <rPr>
            <sz val="22"/>
            <color theme="1"/>
            <rFont val="Calibri"/>
            <family val="2"/>
            <scheme val="minor"/>
        </rPr>
        <t xml:space="preserve"> ត</t>
    </r>
</si>

for old cell A1 which contains Hello ត and this xml code:

<si>
    <t>It is me</t>
</si>

for new cell A2 which contains It is me.

So we can combine this xml parts to get xml file like this:

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" uniqueCount="2">
    <si>
        <r>
            <rPr>
                <b/>
                <sz val="22"/>
                <color theme="1"/>
                <rFont val="Calibri"/>
                <family val="2"/>
                <scheme val="minor"/>
            </rPr>
            <t>Hello</t>
        </r>
        <r>
            <rPr>
                <sz val="22"/>
                <color theme="1"/>
                <rFont val="Calibri"/>
                <family val="2"/>
                <scheme val="minor"/>
            </rPr>
            <t xml:space="preserve"> ត</t>
        </r>
    </si>
    <si>
        <t>It is me</t>
    </si>
</sst>

I wrote some functions to be able to this. (There is quite a lot of code, but most of it is just copied from openpyxl. If you would change openpyxl library you could do this in 10 or 20 lines of code, but thats never good idea so I rather copied whole functions that i needed to change and then changed that small part.)

You can save following code in separate file extendedopenpyxl.py:

from openpyxl import load_workbook as openpyxlload_workbook
from openpyxl.reader.excel import _validate_archive, _find_workbook_part
from openpyxl.reader.worksheet import _get_xml_iter
from openpyxl.xml.functions import fromstring, iterparse, safe_iterator, tostring, Element, xmlfile, SubElement
from openpyxl.xml.constants import ARC_CONTENT_TYPES, SHEET_MAIN_NS, SHARED_STRINGS, ARC_ROOT_RELS, ARC_APP, ARC_CORE, ARC_THEME, ARC_SHARED_STRINGS, ARC_STYLE, ARC_WORKBOOK, ARC_WORKBOOK_RELS
from openpyxl.packaging.manifest import Manifest
from openpyxl.packaging.relationship import get_dependents, get_rels_path
from openpyxl.packaging.workbook import WorkbookParser
from openpyxl.packaging.extended import ExtendedProperties
from openpyxl.utils import coordinate_to_tuple
from openpyxl.cell.text import Text
from openpyxl.writer.excel import ExcelWriter as openpyxlExcelWriter
from openpyxl.writer.workbook import write_root_rels, write_workbook_rels, write_workbook
from openpyxl.writer.theme import write_theme
from openpyxl.writer.etree_worksheet import get_rows_to_write
from openpyxl.styles.stylesheet import write_stylesheet
from zipfile import ZipFile, ZIP_DEFLATED
from operator import itemgetter
from io import BytesIO
from xml.etree.ElementTree import tostring as xml_tostring
from xml.etree.ElementTree import register_namespace
from lxml.etree import fromstring as lxml_fromstring

register_namespace('', 'http://schemas.openxmlformats.org/spreadsheetml/2006/main')

def get_value_cells(workbook):
    value_cells = []
    for idx, worksheet in enumerate(workbook.worksheets, 1):
        all_rows = get_rows_to_write(worksheet)
        for row_idx, row in all_rows:
            row = sorted(row, key=itemgetter(0))
            for col, cell in row:
                if cell._value is not None:
                    if cell.data_type == 's':
                        value_cells.append((worksheet.title,(cell.row, cell.col_idx)))
    return value_cells

def check_if_lxml(element):
    if type(element).__module__ == 'xml.etree.ElementTree':
        string = xml_tostring(element)
        el = lxml_fromstring(string)
        return el
    return element

def write_string_table(workbook):
    string_table = workbook.shared_strings
    workbook_data = workbook.new_interal_value_workbook_data
    data_strings = workbook.new_interal_value_data_strings
    value_cells = get_value_cells(workbook)
    out = BytesIO()
    i = 0
    with xmlfile(out) as xf:
        with xf.element("sst", xmlns=SHEET_MAIN_NS, uniqueCount="%d" % len(string_table)):            
            for i, key in enumerate(string_table):
                sheetname, coordinates = value_cells[i]
                if coordinates in workbook_data[sheetname]:
                    value = workbook_data[sheetname][coordinates]
                    xml_el = data_strings[value]
                    el = check_if_lxml(xml_el)
                else:
                    el = Element('si')
                    text = SubElement(el, 't')
                    text.text = key
                    if key.strip() != key:
                        text.set(PRESERVE_SPACE, 'preserve')
                xf.write(el)

    return  out.getvalue()


class ExcelWriter(openpyxlExcelWriter):

    def write_data(self):
        """Write the various xml files into the zip archive."""
        # cleanup all worksheets
        archive = self._archive

        archive.writestr(ARC_ROOT_RELS, write_root_rels(self.workbook))
        props = ExtendedProperties()
        archive.writestr(ARC_APP, tostring(props.to_tree()))

        archive.writestr(ARC_CORE, tostring(self.workbook.properties.to_tree()))
        if self.workbook.loaded_theme:
            archive.writestr(ARC_THEME, self.workbook.loaded_theme)
        else:
            archive.writestr(ARC_THEME, write_theme())

        self._write_worksheets()
        self._write_chartsheets()
        self._write_images()
        self._write_charts()

        string_table_out = write_string_table(self.workbook)
        self._archive.writestr(ARC_SHARED_STRINGS, string_table_out)
        self._write_external_links()

        stylesheet = write_stylesheet(self.workbook)
        archive.writestr(ARC_STYLE, tostring(stylesheet))

        archive.writestr(ARC_WORKBOOK, write_workbook(self.workbook))
        archive.writestr(ARC_WORKBOOK_RELS, write_workbook_rels(self.workbook))

        self._merge_vba()

        self.manifest._write(archive, self.workbook)

        return

    def save(self, filename):
        self.write_data()
        self._archive.close()
        return
    

def get_coordinates(cell, row_count, col_count):
    coordinate = cell.get('r')
    if coordinate:
        row, column = coordinate_to_tuple(coordinate)
    else:
        row, column = row_count, col_count
    return row, column

def parse_cell(cell):
    VALUE_TAG = '{%s}v' % SHEET_MAIN_NS
    value = cell.find(VALUE_TAG)
    if value is not None:
        value = int(value.text)
    return value

def parse_row(row, row_count):
    CELL_TAG = '{%s}c' % SHEET_MAIN_NS
    if row.get('r'):
        row_count = int(row.get('r'))
    else:
        row_count += 1
    col_count = 0
    data = dict()
    for cell in safe_iterator(row, CELL_TAG):
        col_count += 1
        value = parse_cell(cell)
        if value is not None:
            coordinates = get_coordinates(cell, row_count, col_count)
            data[coordinates] = value
    return data


def parse_sheet(xml_source):
    dispatcher = ['{%s}mergeCells' % SHEET_MAIN_NS, '{%s}col' % SHEET_MAIN_NS, '{%s}row' % SHEET_MAIN_NS, '{%s}conditionalFormatting' % SHEET_MAIN_NS, '{%s}legacyDrawing' % SHEET_MAIN_NS, '{%s}sheetProtection' % SHEET_MAIN_NS, '{%s}extLst' % SHEET_MAIN_NS, '{%s}hyperlink' % SHEET_MAIN_NS, '{%s}tableParts' % SHEET_MAIN_NS]
    row_count = 0
    stream = _get_xml_iter(xml_source)
    it = iterparse(stream, tag=dispatcher)
    row_tag = '{%s}row' % SHEET_MAIN_NS
    data = dict()
    for _, element in it:
        tag_name = element.tag
        if tag_name == row_tag:
            row_data = parse_row(element, row_count)
            data.update(row_data)
            element.clear()
    return data


def get_workbook_parser(archive):
    src = archive.read(ARC_CONTENT_TYPES)
    root = fromstring(src)
    package = Manifest.from_tree(root)
    wb_part = _find_workbook_part(package)
    workbook_part_name = wb_part.PartName[1:]
    parser = WorkbookParser(archive, workbook_part_name)
    parser.parse()
    return parser, package

def get_data_strings(xml_source):
    STRING_TAG = '{%s}si' % SHEET_MAIN_NS
    strings = []
    src = _get_xml_iter(xml_source)
    for _, node in iterparse(src):
        if node.tag == STRING_TAG:
            strings.append(node)

    return strings

def load_workbook(filename, *args, **kwargs):
    workbook = openpyxlload_workbook(filename, *args, **kwargs)

    archive = _validate_archive(filename)
    parser, package = get_workbook_parser(archive)

    workbook_data = dict()
    for sheet, rel in parser.find_sheets():
        sheet_name = sheet.name
        worksheet_path = rel.target
        fh = archive.open(worksheet_path)
        sheet_data = parse_sheet(fh)
        workbook_data[sheet_name] = sheet_data

    data_strings = []
    ct = package.find(SHARED_STRINGS)
    if ct is not None:
        strings_path = ct.PartName[1:]
        strings_source = archive.read(strings_path)
        data_strings = get_data_strings(strings_source)

    workbook.new_interal_value_workbook_data = workbook_data
    workbook.new_interal_value_data_strings = data_strings
    return workbook

def save_workbook(workbook, filename,):
    archive = ZipFile(filename, 'w', ZIP_DEFLATED, allowZip64=True)
    writer = ExcelWriter(workbook, archive)
    writer.save(filename)
    return True

def save_virtual_workbook(workbook,):
    temp_buffer = BytesIO()
    archive = ZipFile(temp_buffer, 'w', ZIP_DEFLATED, allowZip64=True)
    writer = ExcelWriter(workbook, archive)
    try:
        writer.write_data()
    finally:
        archive.close()
    virtual_workbook = temp_buffer.getvalue()
    temp_buffer.close()
    return virtual_workbook

And now if your run this code:

from extendedopenpyxl import load_workbook, save_workbook

workbook = load_workbook(FILENAME1, keep_links=False)
sheet = workbook['Sheet']
sheet['A2'] = 'It is me'
save_workbook(workbook, 'out.xlsx')

When I run this code on excel file that I used in example above I got this result:

enter image description here

As you can see text in cell A1 is formatted as it was, (Hello is in bold, and is not).

Edit (after West's comment)

If you are using a newer version of openpyxl (higher than 2.5.14) then the code above won't work because openpyxl completely changed how it stores values in an excel file. I fixed parts of code inside extendedopenpyxl.py and following code should work with newer versions of openpyxl (I tested it on version 3.0.6):

from openpyxl.reader.excel import ExcelReader, _validate_archive
from openpyxl.xml.constants import SHEET_MAIN_NS, SHARED_STRINGS, ARC_SHARED_STRINGS, ARC_APP, ARC_CORE, ARC_THEME, ARC_STYLE, ARC_ROOT_RELS, ARC_WORKBOOK, ARC_WORKBOOK_RELS
from openpyxl.xml.functions import iterparse, xmlfile, tostring
from openpyxl.utils import coordinate_to_tuple
import openpyxl.cell._writer
from zipfile import ZipFile, ZIP_DEFLATED
from openpyxl.writer.excel import ExcelWriter
from io import BytesIO
from xml.etree.ElementTree import register_namespace
from xml.etree.ElementTree import tostring as xml_tostring
from lxml.etree import fromstring as lxml_fromstring
from openpyxl.worksheet._writer import WorksheetWriter
from openpyxl.workbook._writer import WorkbookWriter
from openpyxl.packaging.extended import ExtendedProperties
from openpyxl.styles.stylesheet import write_stylesheet
from openpyxl.packaging.relationship import Relationship
from openpyxl.cell._writer import write_cell
from openpyxl.drawing.spreadsheet_drawing import SpreadsheetDrawing
from openpyxl import LXML
from openpyxl.packaging.manifest import DEFAULT_OVERRIDE, Override, Manifest


DEFAULT_OVERRIDE.append(Override("/" + ARC_SHARED_STRINGS, SHARED_STRINGS))


def to_integer(value):
    if type(value) == int:
        return value
    if type(value) == str:
        try:
            num = int(value)
            return num
        except ValueError:
            num = float(value)
            if num.is_integer():
                return int(num)
    raise ValueError('Value {} is not an integer.'.format(value))
    return

def parse_cell(cell):
    VALUE_TAG = '{%s}v' % SHEET_MAIN_NS
    data_type = cell.get('t', 'n')
    value = None
    if data_type == 's':
        value = cell.findtext(VALUE_TAG, None) or None
        if value is not None:
            value = int(value)
    return value

def get_coordinates(cell, row_counter, col_counter):
    coordinate = cell.get('r')
    if coordinate:
        row, column = coordinate_to_tuple(coordinate)
    else:
        row, column = row_counter, col_counter
    return row, column

def parse_row(row, row_counter):
    row_counter = to_integer(row.get('r', row_counter + 1))
    col_counter = 0
    data = dict()
    for cell in row:
        col_counter += 1
        value = parse_cell(cell)
        if value is not None:
            coordinates = get_coordinates(cell, row_counter, col_counter)
            data[coordinates] = value
            col_counter = coordinates[1]
    return data, row_counter

def parse_sheet(xml_source):
    ROW_TAG = '{%s}row' % SHEET_MAIN_NS
    row_counter = 0
    it = iterparse(xml_source)
    data = dict()
    for _, element in it:
        tag_name = element.tag
        if tag_name == ROW_TAG:
            pass
            row_data, row_counter = parse_row(element, row_counter)
            data.update(row_data)
            element.clear()
    return data

def extended_archive_open(archive, name):
    with archive.open(name,) as src:
        namespaces = {node[0]: node[1] for _, node in
                      iterparse(src, events=['start-ns'])}
    for key, value in namespaces.items():
        register_namespace(key, value)
    return archive.open(name,)

def get_data_strings(xml_source):
    STRING_TAG = '{%s}si' % SHEET_MAIN_NS
    strings = []
    for _, node in iterparse(xml_source):
        if node.tag == STRING_TAG:
            strings.append(node)
    return strings
    
def load_workbook(filename, read_only=False, keep_vba=False,
                  data_only=False, keep_links=True):
    reader = ExcelReader(filename, read_only, keep_vba,
                        data_only, keep_links)
    reader.read()

    archive = _validate_archive(filename)
    
    workbook_data = dict()
    for sheet, rel in reader.parser.find_sheets():
        if rel.target not in reader.valid_files or "chartsheet" in rel.Type:
            continue
        fh = archive.open(rel.target)
        sheet_data = parse_sheet(fh)
        workbook_data[sheet.name] = sheet_data

    data_strings = []
    ct = reader.package.find(SHARED_STRINGS)
    if ct is not None:
        strings_path = ct.PartName[1:]
        with extended_archive_open(archive, strings_path) as src:
            data_strings = get_data_strings(src)

    archive.close()

    workbook = reader.wb
    workbook._extended_value_workbook_data = workbook_data
    workbook._extended_value_data_strings = data_strings
    return workbook

def check_if_lxml(element):
    if type(element).__module__ == 'xml.etree.ElementTree':
        string = xml_tostring(element)
        el = lxml_fromstring(string)
        return el
    return element

def write_string_table(workbook):
    workbook_data = workbook._extended_value_workbook_data
    data_strings = workbook._extended_value_data_strings
    out = BytesIO()
    with xmlfile(out) as xf:
        with xf.element("sst", xmlns=SHEET_MAIN_NS, uniqueCount="%d" % len(data_strings)):
            for sheet in workbook_data:
                for coordinates, value in workbook_data[sheet].items():
                    xml_el = data_strings[value]
                    el = check_if_lxml(xml_el)
                    xf.write(el)
    return out.getvalue()

def check_cell(cell):
    if cell.data_type != 's':
        return False
    if cell._comment is not None:
        return False
    if cell.hyperlink:
        return False
    return True

def extended_write_cell(xf, worksheet, cell, styled=None):
        workbook_data = worksheet.parent._extended_value_workbook_data
        for sheet in workbook_data.values():
            if (cell.row, cell.column) in sheet and check_cell(cell):
                attributes = {'r': cell.coordinate, 't': cell.data_type}
                if styled:
                    attributes['s'] = '%d' % cell.style_id
                if LXML:
                    with xf.element('c', attributes):
                        with xf.element('v'):
                            xf.write('%.16g' % sheet[(cell.row, cell.column)])                    
                else:
                    el = Element('c', attributes)
                    cell_content = SubElement(el, 'v')
                    cell_content.text = '%.16g' % sheet[(cell.row, cell.column)]
                    xf.write(el)
                break
        else:
            write_cell(xf, worksheet, cell, styled)
        return

class ExtendedWorksheetWriter(WorksheetWriter):

    def write_row(self, xf, row, row_idx):
        attrs = {'r': f"{row_idx}"}
        dims = self.ws.row_dimensions
        attrs.update(dims.get(row_idx, {}))

        with xf.element("row", attrs):

            for cell in row:
                if cell._comment is not None:
                    comment = CommentRecord.from_cell(cell)
                    self.ws._comments.append(comment)
                if (
                    cell._value is None
                    and not cell.has_style
                    and not cell._comment
                    ):
                    continue
                extended_write_cell(xf, self.ws, cell, cell.has_style)
        return


class ExtendedWorkbookWriter(WorkbookWriter):

    def write_rels(self, *args, **kwargs):
        styles =  Relationship(type='sharedStrings', Target='sharedStrings.xml')
        self.rels.append(styles)
        return super().write_rels(*args, **kwargs) 

class ExtendedExcelWriter(ExcelWriter):

    def __init__(self, workbook, archive):
        self._archive = archive
        self.workbook = workbook
        self.manifest = Manifest(Override = DEFAULT_OVERRIDE)
        self.vba_modified = set()
        self._tables = []
        self._charts = []
        self._images = []
        self._drawings = []
        self._comments = []
        self._pivots = []
        return

    def write_data(self):
        archive = self._archive
        props = ExtendedProperties()
        archive.writestr(ARC_APP, tostring(props.to_tree()))
        archive.writestr(ARC_CORE, tostring(self.workbook.properties.to_tree()))
        if self.workbook.loaded_theme:
            archive.writestr(ARC_THEME, self.workbook.loaded_theme)
        else:
            archive.writestr(ARC_THEME, theme_xml)
        self._write_worksheets()
        self._write_chartsheets()
        self._write_images()
        self._write_charts()

        if self.workbook._extended_value_workbook_data \
                and self.workbook._extended_value_data_strings:
            string_table_out = write_string_table(self.workbook)
            self._archive.writestr(ARC_SHARED_STRINGS, string_table_out)
        
        self._write_external_links()
        stylesheet = write_stylesheet(self.workbook)
        archive.writestr(ARC_STYLE, tostring(stylesheet))
        
        writer = ExtendedWorkbookWriter(self.workbook)
        
        archive.writestr(ARC_ROOT_RELS, writer.write_root_rels())
        archive.writestr(ARC_WORKBOOK, writer.write())
        archive.writestr(ARC_WORKBOOK_RELS, writer.write_rels())
        self._merge_vba()
        self.manifest._write(archive, self.workbook)
        return

    def write_worksheet(self, ws):
        ws._drawing = SpreadsheetDrawing()
        ws._drawing.charts = ws._charts
        ws._drawing.images = ws._images
        if self.workbook.write_only:
            if not ws.closed:
                ws.close()
            writer = ws._writer
        else:
            writer = ExtendedWorksheetWriter(ws)
            writer.write()

        ws._rels = writer._rels
        self._archive.write(writer.out, ws.path[1:])
        self.manifest.append(ws)
        writer.cleanup()
        return

def save_workbook(workbook, filename):
    archive = ZipFile(filename, 'w', ZIP_DEFLATED, allowZip64=True)
    writer = ExtendedExcelWriter(workbook, archive)
    writer.save()
    return True
Share:
16,128

Related videos on Youtube

Houy Narun
Author by

Houy Narun

I'm a programmer and developer using languages such C, VB, HTML, CSS, Jascript and python.

Updated on June 04, 2022

Comments

  • Houy Narun
    Houy Narun almost 2 years

    I'm using openpyxl library package to read and write some data to an existing excel file test.xlsx.

    Before writing some data to it, the content of file look like this:

    enter image description here

    • cell A1 is contain Khmer Unicode character, and English character is in Bold style.

    • cell A3 used font lemons1 font-face, and English character is in Italic style.

    I was using the script below to read and write data "It is me" to cell B2 of this excel file:

    from openpyxl import load_workbook
    import os
    FILENAME1 = os.path.dirname(__file__)+'/test.xlsx'
    from flask import make_response
    from openpyxl.writer.excel import save_virtual_workbook
    from app import app
    
    @app.route('/testexel', methods=['GET'])
    def testexel():
        with app.app_context():
            try:
                filename = 'test'
                workbook = load_workbook(FILENAME1, keep_links=False)
                sheet = workbook['Sheet1']
                sheet['B2']='It is me'
    
                response = make_response(save_virtual_workbook(workbook))
                response.headers['Cache-Control'] = 'no-cache'
                response.headers["Content-Disposition"] = "attachment; filename=%s.xlsx" % filename
                response.headers["Content-type"] = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet; charset=utf-8"
    
                return response
    
            except Exception as e:
    
                raise
    

    Then format of resulted excel file was modified as this, which I've never wanted it to be like this :

    enter image description here

    The formatting style is quite different from the original file before writing data to it:

    • cell A1 all data is all bold taking style format from English character

    • cell B3 English character became a normal style, and font was change to font-face limons1 taking from that of khmer character in front of it.

    What I am trying to accomplish is to keep existing content of file in the same format (style and font-face) as it was, while writing additional data to it.

    Please kindly advise what is wrong with my script and how can I do to keep existing style and font-face unchanged after running above script? Thanks.

    • Charlie Clark
      Charlie Clark over 5 years
    • Houy Narun
      Houy Narun over 5 years
      @CharlieClark why duplicate? if the provide link does seem exactly solve the problem yet.
    • Charlie Clark
      Charlie Clark over 5 years
      The question is duplicate and the answer hasn't changed.
    • Houy Narun
      Houy Narun over 5 years
      @CharlieClark the answer seem not address my problem, at least. I’ve searched around some similar questioned, yet none suggested an answer with code to achieve this desire. Therefore I’m looking for something that workaround, at least a reason why it is not possible and which is alternative, avoiding one sentence answer “openpyxls does not support multiple format in one cell” it’s helpless, thanks :)
    • Charlie Clark
      Charlie Clark over 5 years
      What you want cannot be done with openpyxl. That is the answer.
    • Houy Narun
      Houy Narun over 5 years
      @CharlieClark if it is not possible, could you provide me an alternative solution, can be another package? Really appreciate your solution, Thanks
    • Yvonne Aburrow
      Yvonne Aburrow over 5 years
      @HouyNarun what happens if you write the new data to a cell after the existing content, such as cell B4 ?
    • Houy Narun
      Houy Narun over 5 years
      @YvonneAburrow in reality I am given an excel file that has already formatted cell style and different font only some cells that I have to write data to it, so after data to cells I am supposed to write to, the other cell that I am not supposed to touch they also get affected, this is my problem, thanks
    • Yvonne Aburrow
      Yvonne Aburrow over 5 years
      It looks like they might need to give you a different Excel file then. :(
    • Houy Narun
      Houy Narun over 5 years
      @YvonneAburrow Yes, my task is to write data to the given Excel file while keeping other things untouched.
    • Houy Narun
      Houy Narun over 5 years
      @YvonneAburrow I tried to search around and I found some other people facing the same problem like me with OpenPyXl but I did not found answer to their problem.
  • Houy Narun
    Houy Narun over 5 years
    Beside would you like to recommend anything I can try as an alternative solution, something like another package, library that I can achieve above result ? Really appreciated thanks
  • Yvonne Aburrow
    Yvonne Aburrow over 5 years
    sorry, I had never even heard of this package till I saw your post. You might like to investigate the APIs for interacting with Google Spreadsheets?
  • Admin
    Admin over 4 years
    @Yvonne can you help here stackoverflow.com/questions/59888367/…
  • Yvonne Aburrow
    Yvonne Aburrow over 4 years
    I had a look, but the one answer that was there seemed to make a lot of sense.
  • West
    West over 3 years
    I got an error No module named 'openpyxl.reader.worksheet
  • ands
    ands over 3 years
    @West, I have edited my answer, if you use new code for the extendedopenpyxl.py file, your program should work.
  • West
    West over 3 years
    I'm on python 3.6.5 and there was a couple of errors with the new code, one being ValueError: invalid literal for int() with base 10 with line 46, in parse_cell value = int(value) and also I'm not not able to use the Zipfile.seek() method but I know that needs a newer python version. Will see how I go with a newer python. Thanks
  • ands
    ands over 3 years
    Can you add line print([value], type(value)) before line 46 value = int(value) and tell me what does it print out?
  • West
    West over 3 years
    Type of string. The value is a text value from cell B1 of the first sheet. For example value in cell B1 is "Tax Amount", error picks the value as "Tax ". Did you successfully test the solution on your end?
  • ands
    ands over 3 years
    As far as I know value tag (<v>0</v>) in sheet1.xml should contain only integers. Yes, i tested it on this file. Can you create a copy of your file, delete everything from the file except cell B1 and then test python code and if it shows the same error then send me that file?
  • West
    West over 3 years
    Okay. I've also tested your file and it fails if I put some random value like 13.57 in cell B1 (or any other cell)
  • ands
    ands over 3 years
    I fixed the problem with value tag. (I forgot to check that only value tags inside cells containing strings are taken into account.) I also removed Zipfile.seek() method. Can you check if the updated code for extendedopenpyxl.py is working for you?