Is there a way to auto-adjust Excel column widths with pandas.ExcelWriter?
Solution 1
Inspired by user6178746's answer, I have the following:
# Given a dict of dataframes, for example:
# dfs = {'gadgets': df_gadgets, 'widgets': df_widgets}
writer = pd.ExcelWriter(filename, engine='xlsxwriter')
for sheetname, df in dfs.items(): # loop through `dict` of dataframes
df.to_excel(writer, sheet_name=sheetname) # send df to writer
worksheet = writer.sheets[sheetname] # pull worksheet object
for idx, col in enumerate(df): # loop through all columns
series = df[col]
max_len = max((
series.astype(str).map(len).max(), # len of largest item
len(str(series.name)) # len of column name/header
)) + 1 # adding a little extra space
worksheet.set_column(idx, idx, max_len) # set column width
writer.save()
Solution 2
I'm posting this because I just ran into the same issue and found that the official documentation for Xlsxwriter and pandas still have this functionality listed as unsupported. I hacked together a solution that solved the issue i was having. I basically just iterate through each column and use worksheet.set_column to set the column width == the max length of the contents of that column.
One important note, however. This solution does not fit the column headers, simply the column values. That should be an easy change though if you need to fit the headers instead. Hope this helps someone :)
import pandas as pd
import sqlalchemy as sa
import urllib
read_server = 'serverName'
read_database = 'databaseName'
read_params = urllib.quote_plus("DRIVER={SQL Server};SERVER="+read_server+";DATABASE="+read_database+";TRUSTED_CONNECTION=Yes")
read_engine = sa.create_engine("mssql+pyodbc:///?odbc_connect=%s" % read_params)
#Output some SQL Server data into a dataframe
my_sql_query = """ SELECT * FROM dbo.my_table """
my_dataframe = pd.read_sql_query(my_sql_query,con=read_engine)
#Set destination directory to save excel.
xlsFilepath = r'H:\my_project' + "\\" + 'my_file_name.xlsx'
writer = pd.ExcelWriter(xlsFilepath, engine='xlsxwriter')
#Write excel to file using pandas to_excel
my_dataframe.to_excel(writer, startrow = 1, sheet_name='Sheet1', index=False)
#Indicate workbook and worksheet for formatting
workbook = writer.book
worksheet = writer.sheets['Sheet1']
#Iterate through each column and set the width == the max length in that column. A padding length of 2 is also added.
for i, col in enumerate(my_dataframe.columns):
# find length of column i
column_len = my_dataframe[col].astype(str).str.len().max()
# Setting the length if the column header is larger
# than the max column value length
column_len = max(column_len, len(col)) + 2
# set the column length
worksheet.set_column(i, i, column_len)
writer.save()
Solution 3
Dynamically adjust all the column lengths
writer = pd.ExcelWriter('/path/to/output/file.xlsx')
df.to_excel(writer, sheet_name='sheetName', index=False, na_rep='NaN')
for column in df:
column_length = max(df[column].astype(str).map(len).max(), len(column))
col_idx = df.columns.get_loc(column)
writer.sheets['sheetName'].set_column(col_idx, col_idx, column_length)
writer.save()
Manually adjust a column using Column Name
col_idx = df.columns.get_loc('columnName')
writer.sheets['sheetName'].set_column(col_idx, col_idx, 15)
Manually adjust a column using Column Index
writer.sheets['sheetName'].set_column(col_idx, col_idx, 15)
In case any of the above is failing with
AttributeError: 'Worksheet' object has no attribute 'set_column'
make sure to install xlsxwriter
:
pip install xlsxwriter
For a more comprehensive explanation you can read the article How to Auto-Adjust the Width of Excel Columns with Pandas ExcelWriter on TDS.
Solution 4
There is a nice package that I started to use recently called StyleFrame.
it gets DataFrame and lets you to style it very easily...
by default the columns width is auto-adjusting.
for example:
from StyleFrame import StyleFrame
import pandas as pd
df = pd.DataFrame({'aaaaaaaaaaa': [1, 2, 3],
'bbbbbbbbb': [1, 1, 1],
'ccccccccccc': [2, 3, 4]})
excel_writer = StyleFrame.ExcelWriter('example.xlsx')
sf = StyleFrame(df)
sf.to_excel(excel_writer=excel_writer, row_to_add_filters=0,
columns_and_rows_to_freeze='B2')
excel_writer.save()
you can also change the columns width:
sf.set_column_width(columns=['aaaaaaaaaaa', 'bbbbbbbbb'],
width=35.3)
UPDATE 1
In version 1.4 best_fit
argument was added to StyleFrame.to_excel
.
See the documentation.
UPDATE 2
Here's a sample of code that works for StyleFrame 3.x.x
from styleframe import StyleFrame
import pandas as pd
columns = ['aaaaaaaaaaa', 'bbbbbbbbb', 'ccccccccccc', ]
df = pd.DataFrame(data={
'aaaaaaaaaaa': [1, 2, 3, ],
'bbbbbbbbb': [1, 1, 1, ],
'ccccccccccc': [2, 3, 4, ],
}, columns=columns,
)
excel_writer = StyleFrame.ExcelWriter('example.xlsx')
sf = StyleFrame(df)
sf.to_excel(
excel_writer=excel_writer,
best_fit=columns,
columns_and_rows_to_freeze='B2',
row_to_add_filters=0,
)
excel_writer.save()
Solution 5
There is probably no automatic way to do it right now, but as you use openpyxl, the following line (adapted from another answer by user Bufke on how to do in manually) allows you to specify a sane value (in character widths):
writer.sheets['Summary'].column_dimensions['A'].width = 15
Related videos on Youtube
badideas
Updated on November 28, 2021Comments
-
badideas over 2 years
I am being asked to generate some Excel reports. I am currently using pandas quite heavily for my data, so naturally I would like to use the pandas.ExcelWriter method to generate these reports. However the fixed column widths are a problem.
The code I have so far is simple enough. Say I have a dataframe called 'df':
writer = pd.ExcelWriter(excel_file_path, engine='openpyxl') df.to_excel(writer, sheet_name="Summary")
I was looking over the pandas code, and I don't really see any options to set column widths. Is there a trick out there in the universe to make it such that the columns auto-adjust to the data? Or is there something I can do after the fact to the xlsx file to adjust the column widths?
(I am using the OpenPyXL library, and generating .xlsx files - if that makes any difference.)
Thank you.
-
Jeff almost 11 yearsdoesn't look possible at the moment, please open an issue for this enhancement on github (and maybe a PR?). doesn't look that hard to do.
-
badideas almost 11 yearsthanks Jeff, i have submitted the issue. i'm not sure if i will have time to actually dive into the pandas codebase to solve it, but you never know :)
-
Jeff almost 11 yearsyep....saw your issue.....comment on the issue if you need some help! (essentially need to pass an optional argument to
to_excel
, maybecol_style=dict
which contains col header style elements (rather than the defaultheader_style
which seems to be hard coded now -
dmvianna over 10 years
-
-
Admin about 8 yearsGood solution. I like how you used pandas instead of another package.
-
ᴊᴇsᴘᴇʀᴋ.ᴇᴛʜ over 7 yearsFYI: In my case I needed to use "index=False" in the "df.to_excel(...)" call, or else the columns were off by 1
-
Heikki Pulkkinen almost 6 yearsyep, I also had to add df.to_excel(writer, sheet_name=sheetname, index=False)
-
ac24 almost 6 yearsIf you can't use index=False (because you have a multiindex on rows), then you can get the index level depth with df.index.nlevels and then use this to add on to your set column call:
worksheet.set_column(idx+nlevels, idx+nlevels, max_len)
. Otherwise the length is calculated for the first column of the frame, and then applied to the first column in the excel, which is probably the index. -
Dascienz over 5 yearsFor anyone still looking for this answer,
enumerate(df)
should beenumerate(df.columns)
since you're iterating over each column indf
. -
alichaudry over 5 years@Dascienz the same way iterating over a
dict
actually iterates over the keys in thedict
(you don't have to manually saydict.keys()
), iterating over apd.DataFrame
iterates over the columns. You don't have to manually iterate overdf.columns
. -
Dascienz over 5 years@alichaudry Ah, I see! Thanks for clarification!
-
John Y over 5 yearsThe StyleFrame package may be easy to use, but I don't see how "by default the columns width is auto-adjusting". When I run the code sample you gave, all the columns are the same width, and all three headers are wrapped. Your sample data is also poorly chosen, because they are all almost the same width naturally. To really illustrate automatic adjustment, you should choose some really wide data and some narrow data. When I do this for myself, the column widths are still exactly the same as before. There was no adjustment whatsoever.
-
John Y over 5 yearsMaybe at one point in StyleFrame's history, the column widths were automatically adjusted by default, but at least today, you have to specify the column or columns you want adjusted in the
best_fit
parameter. Also, when I tried this, I got very poor results. -
Admin over 5 yearsthe width seems to be off 1 column. I tried enabling and disabling the
index
parameter but no dice. -
Chaoste about 5 yearsPandas supports a nice notation for calculation string length and other stuff:
series.astype(str).map(len).max()
can be rewritten as:series.astype(str).str.len().max()
. -
Serdia over 4 yearsI think you need
()
inside max function: ` max(column_len(), len(col)) + 2` -
ojdo over 4 yearsThe default ExcelWriter engine pandas is using has changed since 2013 to Xlsxwriter, which does not contain a
column_dimensions
attribute. If you want to keep using openpyxl, simply specify it when creating the writer usingpd.ExcelWriter(excel_filename, engine='openpyxl')
-
ojdo over 4 years@Sunil: check the other answers using
Xlsxwriter
as the engine to see how to specify the column width with today's default engine. -
Nikhil VJ about 4 yearsthanks! for those looking: How you add more styling to header for example:
sf.apply_headers_style(Styler(bold=False))
it took me a long time to figure that out. And in the import statement,from StyleFrame import StyleFrame, Styler
. here's all the options apart from bold: styleframe.readthedocs.io/en/2.0.5/… -
Hagbard almost 4 yearsUnfortunately, this answer is outdated and I only get import errors if I try to apply it as the API seems to have changed significantly.
-
user3019973 almost 4 yearsI got following error when i replicated this code: AttributeError: 'str' object has no attribute 'to_excel'. It think it has something to do with the way "dataframe_list" is created. Mine is a list with 6 dataframe names
-
Umutambyi Gad almost 4 yearscodes only doesn't answer the question you have to add some explanations or take time and read documentation about How do I write a good answer?
-
Brian almost 4 yearsHello! While this code may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply.
-
rafat.ch over 3 yearsYes, the "dataframe_list" should have dataframes and not dataframe names.
-
DeepSpace over 3 years@Hagbard as of version 3 the import should be
from styleframe import StyleFrame
in order to comply with PEP8 name conventions -
Guido over 3 yearsWorks only on Windows and MacOS, not on Linux though
-
Guido over 3 yearsI see an issue dealing with multi_index dataframes - just fail once passing
index=True
in the methodto_excel
-
jmd about 3 yearsit seems this works well for string but a lot less well for date format, where the actual width can be much larger than what is calculated here
-
jmb almost 3 yearsGood, simple solution here. Keep in mind if you're using an index,
df.columns
will have a different shape than whatdf.to_excel(writer,sheet_name=...)
will output in the excel file. That can misalign theenumerate
'si
with whatworksheet.set_column
is expecting. I resolved this withdf.reset_index().to_excel(...)
, but there's probably a better solution. -
parsecer almost 3 yearsWhat is
df
here? Could you please show code, includingdf
initialization? -
Giorgos Myrianthous almost 3 years@parsecer You can refer to the article I've shared at the bottom of the post.
-
MattG over 2 yearsWorked perfectly, including the auto widths, explicit widths by column name and exception resolved by installing xlswriter. Thanks :)
-
Philippe Hebert over 2 yearsIf you get
AttributeError: 'Worksheet' object has no attribute 'set_column'
, you may be missing XlsxWriter and pandas is falling back on openpyxl.pip install XlsxWriter
should solve it :)