Is there a way to auto-adjust Excel column widths with pandas.ExcelWriter?

python excel pandas openpyxl

131,698

Solution 1

Inspired by user6178746's answer, I have the following:

# Given a dict of dataframes, for example:
# dfs = {'gadgets': df_gadgets, 'widgets': df_widgets}

writer = pd.ExcelWriter(filename, engine='xlsxwriter')
for sheetname, df in dfs.items():  # loop through `dict` of dataframes
    df.to_excel(writer, sheet_name=sheetname)  # send df to writer
    worksheet = writer.sheets[sheetname]  # pull worksheet object
    for idx, col in enumerate(df):  # loop through all columns
        series = df[col]
        max_len = max((
            series.astype(str).map(len).max(),  # len of largest item
            len(str(series.name))  # len of column name/header
            )) + 1  # adding a little extra space
        worksheet.set_column(idx, idx, max_len)  # set column width
writer.save()

Solution 2

I'm posting this because I just ran into the same issue and found that the official documentation for Xlsxwriter and pandas still have this functionality listed as unsupported. I hacked together a solution that solved the issue i was having. I basically just iterate through each column and use worksheet.set_column to set the column width == the max length of the contents of that column.

One important note, however. This solution does not fit the column headers, simply the column values. That should be an easy change though if you need to fit the headers instead. Hope this helps someone :)

import pandas as pd
import sqlalchemy as sa
import urllib


read_server = 'serverName'
read_database = 'databaseName'

read_params = urllib.quote_plus("DRIVER={SQL Server};SERVER="+read_server+";DATABASE="+read_database+";TRUSTED_CONNECTION=Yes")
read_engine = sa.create_engine("mssql+pyodbc:///?odbc_connect=%s" % read_params)

#Output some SQL Server data into a dataframe
my_sql_query = """ SELECT * FROM dbo.my_table """
my_dataframe = pd.read_sql_query(my_sql_query,con=read_engine)

#Set destination directory to save excel.
xlsFilepath = r'H:\my_project' + "\\" + 'my_file_name.xlsx'
writer = pd.ExcelWriter(xlsFilepath, engine='xlsxwriter')

#Write excel to file using pandas to_excel
my_dataframe.to_excel(writer, startrow = 1, sheet_name='Sheet1', index=False)

#Indicate workbook and worksheet for formatting
workbook = writer.book
worksheet = writer.sheets['Sheet1']

#Iterate through each column and set the width == the max length in that column. A padding length of 2 is also added.
for i, col in enumerate(my_dataframe.columns):
    # find length of column i
    column_len = my_dataframe[col].astype(str).str.len().max()
    # Setting the length if the column header is larger
    # than the max column value length
    column_len = max(column_len, len(col)) + 2
    # set the column length
    worksheet.set_column(i, i, column_len)
writer.save()

Solution 3

Dynamically adjust all the column lengths

writer = pd.ExcelWriter('/path/to/output/file.xlsx') 
df.to_excel(writer, sheet_name='sheetName', index=False, na_rep='NaN')

for column in df:
    column_length = max(df[column].astype(str).map(len).max(), len(column))
    col_idx = df.columns.get_loc(column)
    writer.sheets['sheetName'].set_column(col_idx, col_idx, column_length)

writer.save()

Manually adjust a column using Column Name

col_idx = df.columns.get_loc('columnName')
writer.sheets['sheetName'].set_column(col_idx, col_idx, 15)

Manually adjust a column using Column Index

writer.sheets['sheetName'].set_column(col_idx, col_idx, 15)

In case any of the above is failing with

AttributeError: 'Worksheet' object has no attribute 'set_column'

make sure to install xlsxwriter:

pip install xlsxwriter

For a more comprehensive explanation you can read the article How to Auto-Adjust the Width of Excel Columns with Pandas ExcelWriter on TDS.

Solution 4

There is a nice package that I started to use recently called StyleFrame.

it gets DataFrame and lets you to style it very easily...

by default the columns width is auto-adjusting.

for example:

from StyleFrame import StyleFrame
import pandas as pd

df = pd.DataFrame({'aaaaaaaaaaa': [1, 2, 3], 
                   'bbbbbbbbb': [1, 1, 1],
                   'ccccccccccc': [2, 3, 4]})
excel_writer = StyleFrame.ExcelWriter('example.xlsx')
sf = StyleFrame(df)
sf.to_excel(excel_writer=excel_writer, row_to_add_filters=0,
            columns_and_rows_to_freeze='B2')
excel_writer.save()

you can also change the columns width:

sf.set_column_width(columns=['aaaaaaaaaaa', 'bbbbbbbbb'],
                    width=35.3)

UPDATE 1

In version 1.4 best_fit argument was added to StyleFrame.to_excel. See the documentation.

UPDATE 2

Here's a sample of code that works for StyleFrame 3.x.x

from styleframe import StyleFrame
import pandas as pd

columns = ['aaaaaaaaaaa', 'bbbbbbbbb', 'ccccccccccc', ]
df = pd.DataFrame(data={
        'aaaaaaaaaaa': [1, 2, 3, ],
        'bbbbbbbbb': [1, 1, 1, ],
        'ccccccccccc': [2, 3, 4, ],
    }, columns=columns,
)
excel_writer = StyleFrame.ExcelWriter('example.xlsx')
sf = StyleFrame(df)
sf.to_excel(
    excel_writer=excel_writer, 
    best_fit=columns,
    columns_and_rows_to_freeze='B2', 
    row_to_add_filters=0,
)
excel_writer.save()

Solution 5

There is probably no automatic way to do it right now, but as you use openpyxl, the following line (adapted from another answer by user Bufke on how to do in manually) allows you to specify a sane value (in character widths):

writer.sheets['Summary'].column_dimensions['A'].width = 15

View more solutions

131,698

badideas

Updated on November 28, 2021

Comments

badideas over 2 years
I am being asked to generate some Excel reports. I am currently using pandas quite heavily for my data, so naturally I would like to use the pandas.ExcelWriter method to generate these reports. However the fixed column widths are a problem.

The code I have so far is simple enough. Say I have a dataframe called 'df':
```
writer = pd.ExcelWriter(excel_file_path, engine='openpyxl')
df.to_excel(writer, sheet_name="Summary")
```
I was looking over the pandas code, and I don't really see any options to set column widths. Is there a trick out there in the universe to make it such that the columns auto-adjust to the data? Or is there something I can do after the fact to the xlsx file to adjust the column widths?

(I am using the OpenPyXL library, and generating .xlsx files - if that makes any difference.)

Thank you.
- Jeff almost 11 years
  
  doesn't look possible at the moment, please open an issue for this enhancement on github (and maybe a PR?). doesn't look that hard to do.
- badideas almost 11 years
  
  thanks Jeff, i have submitted the issue. i'm not sure if i will have time to actually dive into the pandas codebase to solve it, but you never know :)
- Jeff almost 11 years
  
  yep....saw your issue.....comment on the issue if you need some help! (essentially need to pass an optional argument to to_excel, maybe col_style=dict which contains col header style elements (rather than the default header_style which seems to be hard coded now
- dmvianna over 10 years
  
  Link to pandas issue
Admin about 8 years

Good solution. I like how you used pandas instead of another package.
ᴊᴇsᴘᴇʀᴋ.ᴇᴛʜ over 7 years

FYI: In my case I needed to use "index=False" in the "df.to_excel(...)" call, or else the columns were off by 1
Heikki Pulkkinen almost 6 years

yep, I also had to add df.to_excel(writer, sheet_name=sheetname, index=False)
ac24 almost 6 years

If you can't use index=False (because you have a multiindex on rows), then you can get the index level depth with df.index.nlevels and then use this to add on to your set column call: worksheet.set_column(idx+nlevels, idx+nlevels, max_len). Otherwise the length is calculated for the first column of the frame, and then applied to the first column in the excel, which is probably the index.
Dascienz over 5 years

For anyone still looking for this answer, enumerate(df) should be enumerate(df.columns) since you're iterating over each column in df.
alichaudry over 5 years

@Dascienz the same way iterating over a dict actually iterates over the keys in the dict (you don't have to manually say dict.keys()), iterating over a pd.DataFrame iterates over the columns. You don't have to manually iterate over df.columns.
Dascienz over 5 years

@alichaudry Ah, I see! Thanks for clarification!
John Y over 5 years

The StyleFrame package may be easy to use, but I don't see how "by default the columns width is auto-adjusting". When I run the code sample you gave, all the columns are the same width, and all three headers are wrapped. Your sample data is also poorly chosen, because they are all almost the same width naturally. To really illustrate automatic adjustment, you should choose some really wide data and some narrow data. When I do this for myself, the column widths are still exactly the same as before. There was no adjustment whatsoever.
John Y over 5 years

Maybe at one point in StyleFrame's history, the column widths were automatically adjusted by default, but at least today, you have to specify the column or columns you want adjusted in the best_fit parameter. Also, when I tried this, I got very poor results.
Admin over 5 years

the width seems to be off 1 column. I tried enabling and disabling the index parameter but no dice.
Chaoste about 5 years

Pandas supports a nice notation for calculation string length and other stuff: series.astype(str).map(len).max() can be rewritten as: series.astype(str).str.len().max().
Serdia over 4 years

I think you need () inside max function: ` max(column_len(), len(col)) + 2`
ojdo over 4 years

The default ExcelWriter engine pandas is using has changed since 2013 to Xlsxwriter, which does not contain a column_dimensions attribute. If you want to keep using openpyxl, simply specify it when creating the writer using pd.ExcelWriter(excel_filename, engine='openpyxl')
ojdo over 4 years

@Sunil: check the other answers using Xlsxwriter as the engine to see how to specify the column width with today's default engine.
Nikhil VJ about 4 years

thanks! for those looking: How you add more styling to header for example: sf.apply_headers_style(Styler(bold=False)) it took me a long time to figure that out. And in the import statement, from StyleFrame import StyleFrame, Styler . here's all the options apart from bold: styleframe.readthedocs.io/en/2.0.5/…
Hagbard almost 4 years

Unfortunately, this answer is outdated and I only get import errors if I try to apply it as the API seems to have changed significantly.
user3019973 almost 4 years

I got following error when i replicated this code: AttributeError: 'str' object has no attribute 'to_excel'. It think it has something to do with the way "dataframe_list" is created. Mine is a list with 6 dataframe names
Umutambyi Gad almost 4 years

codes only doesn't answer the question you have to add some explanations or take time and read documentation about How do I write a good answer?
Brian almost 4 years

Hello! While this code may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply.
rafat.ch over 3 years

Yes, the "dataframe_list" should have dataframes and not dataframe names.
DeepSpace over 3 years

@Hagbard as of version 3 the import should be from styleframe import StyleFrame in order to comply with PEP8 name conventions
Guido over 3 years

Works only on Windows and MacOS, not on Linux though
Guido over 3 years

I see an issue dealing with multi_index dataframes - just fail once passing index=True in the method to_excel
jmd about 3 years

it seems this works well for string but a lot less well for date format, where the actual width can be much larger than what is calculated here
jmb almost 3 years

Good, simple solution here. Keep in mind if you're using an index, df.columns will have a different shape than what df.to_excel(writer,sheet_name=...) will output in the excel file. That can misalign the enumerate's i with what worksheet.set_column is expecting. I resolved this with df.reset_index().to_excel(...), but there's probably a better solution.
parsecer almost 3 years

What is df here? Could you please show code, including df initialization?
Giorgos Myrianthous almost 3 years

@parsecer You can refer to the article I've shared at the bottom of the post.
MattG over 2 years

Worked perfectly, including the auto widths, explicit widths by column name and exception resolved by installing xlswriter. Thanks :)
Philippe Hebert over 2 years

If you get AttributeError: 'Worksheet' object has no attribute 'set_column', you may be missing XlsxWriter and pandas is falling back on openpyxl. pip install XlsxWriter should solve it :)