Format certain floating dataframe columns into percentage in pandas

Solution 1

Replace the values using the round function, and format the string representation of the percentage numbers:

df['var2'] = pd.Series([round(val, 2) for val in df['var2']], index = df.index)
df['var3'] = pd.Series(["{0:.2f}%".format(val * 100) for val in df['var3']], index = df.index)

The round function rounds a floating point number to the number of decimal places given as its second argument.

String formatting allows you to represent the numbers as you wish. You can change the number of decimal places shown by changing the number before the f.

P.S. I was not sure whether your 'percentage' numbers had already been multiplied by 100. If they have, you will want to change the number of decimals displayed and remove the multiplication by 100.
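A minimal sketch of what the two assignments above do to the columns, using a small hypothetical frame (the two rows are made up for illustration): var2 stays numeric but is rounded, while var3 ends up holding strings rather than floats.

```python
import pandas as pd

# Hypothetical two-row sample, not the asker's full frame.
df = pd.DataFrame({'var2': [1.500092, 1.608445],
                   'var3': [-0.005709, 0.005732]})

df['var2'] = pd.Series([round(val, 2) for val in df['var2']], index=df.index)
df['var3'] = pd.Series(["{0:.2f}%".format(val * 100) for val in df['var3']], index=df.index)

var2_values = df['var2'].tolist()   # still floats, but rounded to 2 decimals
var3_values = df['var3'].tolist()   # now strings ending in '%'
var3_is_numeric = pd.api.types.is_numeric_dtype(df['var3'])

print(df)
print(var3_is_numeric)
```

Because var3 is no longer numeric after this step, any later arithmetic on it would need the original values.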

Solution 2

The accepted answer suggests modifying the raw data for presentation purposes, which is something you generally do not want. Imagine you need to run further analyses on these columns and need the precision you lost by rounding.
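To make the precision concern concrete, here is a small sketch with hypothetical values chosen so that rounding visibly changes a later calculation:

```python
import pandas as pd

# Hypothetical values, small enough that rounding to 2 decimals wipes them out.
raw = pd.Series([0.004, 0.004, 0.004])
rounded = raw.round(2)          # every value becomes 0.0

raw_total = raw.sum()           # close to 0.012
rounded_total = rounded.sum()   # 0.0, the small contributions are gone

print(raw_total, rounded_total)
```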

You can modify the formatting of individual columns in data frames, in your case:

output = df.to_string(formatters={
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format
})
print(output)

For your information '{:,.2%}'.format(0.214) yields 21.40%, so no need for multiplying by 100.
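A quick check of the format specifications, plus a sketch showing that the formatters only affect the rendered text. The two-row frame is a hypothetical sample, not the asker's data.

```python
import pandas as pd

pct = '{:,.2%}'.format(0.214)      # the % presentation type multiplies by 100
num = '{:,.2f}'.format(1234.5678)  # the comma adds a thousands separator

# Hypothetical sample column of fractions.
df = pd.DataFrame({'var3': [-0.005709, 0.005732]})
text = df.to_string(formatters={'var3': '{:,.2%}'.format})

# The underlying column is untouched and still holds floats.
still_float = pd.api.types.is_float_dtype(df['var3'])

print(pct, num)
print(text)
```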

You no longer get a nice HTML table, only a text representation. If you need to stay with HTML, use the to_html method instead.

from IPython.display import display, HTML
output = df.to_html(formatters={
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format
})
display(HTML(output))

Update

As of pandas 0.17.1, life got easier and we can get a beautiful HTML table right away:

df.style.format({
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format,
})

Solution 3

You could also set the default display format for floats:

pd.options.display.float_format = '{:.2%}'.format

Use '{:.2%}' rather than '{:.2f}%': the former converts 0.41 to 41.00% (correct), while the latter yields 0.41% (incorrect).
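A sketch comparing the two format strings on a hypothetical column. It uses pd.option_context so the setting only lasts for the with-block; that scoping is a choice made for this example, not part of the answer above.

```python
import pandas as pd

# Hypothetical column of fractions.
df = pd.DataFrame({'share': [0.41, 0.05]})

# '{:.2%}' multiplies by 100 before appending the percent sign.
with pd.option_context('display.float_format', '{:.2%}'.format):
    correct = df.to_string()

# '{:.2f}%' only appends a percent sign to the unscaled value.
with pd.option_context('display.float_format', '{:.2f}%'.format):
    wrong = df.to_string()

print(correct)
print(wrong)
```

Note that display.float_format applies to every float column, so in the question's frame var1 and var2 would be shown as percentages too.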

Solution 4

Often we want to compute with full precision but, for readability, show only a few decimal places when we display the dataframe.

In a Jupyter notebook, pandas can render the dataframe as formatted HTML through the DataFrame.style property.

To show only two decimal places for some columns, we can use this code snippet:

Given the dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'var1': [1.458315, 1.576704, 1.629253, 1.669331, 1.705139, 1.740447, 1.77598, 1.812037, 1.85313, 1.943985],
    'var2': [1.500092, 1.608445, 1.652577, 1.685456, 1.712096, 1.741961, 1.770801, 1.799327, 1.822982, 1.868401],
    'var3': [-0.005709, -0.005122, -0.004754, -0.003525, -0.003134, -0.001223, -0.001723, -0.002013, -0.001396, 0.005732],
})

print(df)
       var1      var2      var3
0  1.458315  1.500092 -0.005709
1  1.576704  1.608445 -0.005122
2  1.629253  1.652577 -0.004754
3  1.669331  1.685456 -0.003525
4  1.705139  1.712096 -0.003134
5  1.740447  1.741961 -0.001223
6  1.775980  1.770801 -0.001723
7  1.812037  1.799327 -0.002013
8  1.853130  1.822982 -0.001396
9  1.943985  1.868401  0.005732

Apply a style to get the required format:

df.style.format({'var1': "{:.2f}", 'var2': "{:.2f}", 'var3': "{:.2%}"})

Gives:

   var1  var2    var3
0  1.46  1.50  -0.57%
1  1.58  1.61  -0.51%
2  1.63  1.65  -0.48%
3  1.67  1.69  -0.35%
4  1.71  1.71  -0.31%
5  1.74  1.74  -0.12%
6  1.78  1.77  -0.17%
7  1.81  1.80  -0.20%
8  1.85  1.82  -0.14%
9  1.94  1.87   0.57%

Update

If the display command is not found, try the following:

from IPython.display import display

df_style = df.style.format({'var1': "{:.2f}",'var2': "{:.2f}",'var3': "{:.2%}"})

display(df_style)

Requirements

  • To use the display command, you need to have IPython installed on your machine.
  • The display command does not work in online Python interpreters that do not have IPython installed, such as https://repl.it/languages/python3
  • The display command works out of the box in jupyter-notebook, jupyter-lab, Google-colab, kaggle-kernels, IBM-watson, Mode-Analytics and many other platforms; you do not even have to import display from IPython.display

Solution 5

As suggested by @linqu, you should not change your data for presentation. Since pandas 0.17.1, (conditional) formatting has been easier. Quoting the documentation:

You can apply conditional formatting, the visual styling of a DataFrame depending on the data within, by using the DataFrame.style property. This is a property that returns a pandas.Styler object, which has useful methods for formatting and displaying DataFrames.

For your example, that would be (the usual table will show up in Jupyter):

df.style.format({
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format,
})
Author by

user3576212

Updated on July 08, 2022

Comments

  • user3576212
    user3576212 almost 2 years

    I am trying to write a paper in IPython notebook, but encountered some issues with the display format. Say I have the following dataframe df. Is there any way to format var1 and var2 as 2-digit decimals and var3 as percentages?

           var1        var2         var3    
    id                                              
    0    1.458315    1.500092   -0.005709   
    1    1.576704    1.608445   -0.005122    
    2    1.629253    1.652577   -0.004754    
    3    1.669331    1.685456   -0.003525   
    4    1.705139    1.712096   -0.003134   
    5    1.740447    1.741961   -0.001223   
    6    1.775980    1.770801   -0.001723    
    7    1.812037    1.799327   -0.002013    
    8    1.853130    1.822982   -0.001396    
    9    1.943985    1.868401    0.005732
    

    The numbers inside are not multiplied by 100, e.g. -0.0057=-0.57%.

  • user3576212
    user3576212 almost 10 years
    Thanks, will this change the actual values within each column?
  • Woody Pride
    Woody Pride almost 10 years
    Yes, if that is not desired, then just create new columns with those variables in. As far as I know, there is no way to specify how output appears beyond what the data actually are.
  • Ben Southgate
    Ben Southgate almost 10 years
    To round the values in a series you can also just use df['var2'].round(2)
  • Romain Jouin
    Romain Jouin almost 9 years
    You could also set the default format for float : pd.options.display.float_format = '{:.2f}%'.format
  • Frames Catherine White
    Frames Catherine White almost 9 years
    @romain That's a great suggestion (for some use cases). It should be its own answer (so I can upvote it), though it does need a tweak to multiply by 100.
  • Jim
    Jim over 8 years
    Good to know and relevant to OP's question about outputting in a Python notebook
  • Afflatus
    Afflatus over 7 years
    If you have n or a variable amount of columns in your dataframe and you want to apply the same formatting across all columns, but you may not know all the column headers in advance, you don't have to put the formatters in a dictionary, you can do a list and do it creatively like this: output = df.to_html(formatters=n * ['{:,.2%}'.format])
  • Hugo Ideler
    Hugo Ideler almost 6 years
    And if the percentages are still given in decimals (e.g. when using df.pct_change()): pd.options.display.float_format = '{:.2%}'.format
  • Wes Turner
    Wes Turner over 5 years
    A standard set of these in a dict with attr access would be great.
  • FuzzyDuck
    FuzzyDuck almost 5 years
    This is the most Pythonic answer.
  • philippjfr
    philippjfr over 4 years
    This is a way better answer than the accepted one. Changing the formatting is much preferable to actually changing the underlying values.
  • MarianD
    MarianD over 4 years
    The parts .format are not needed, you may omit them.
  • zwornik
    zwornik about 4 years
    @Poudel This is not working. I have used exactly the same code as yours and var3 is not formatted as a percentage
  • zwornik
    zwornik about 4 years
    This is not working. I have used exactly the same code as yours
  • zwornik
    zwornik about 4 years
    df.style.format({'var3': '{:,.2%}'}) - this is not working. Values remain unchanged i.e. without %
  • BhishanPoudel
    BhishanPoudel about 4 years
    @zwornik try display(df.style.format({'var1': "{:.2f}",'var2': "{:.2f}",'var3': "{:.2%}"}))
  • zwornik
    zwornik about 4 years
    @Poudel It worked now. There is one superfluous bracket at the end. It should be: df_style = df.style.format({'var1': "{:.2f}",'var2': "{:.2f}",'var3': "{:.2%}"}) Thanks!
  • Tim
    Tim almost 4 years
    as suggested by @linqu, you generally do not want to modify data for display.
  • theFrok
    theFrok almost 4 years
    @zwornik % needs to be outside the brackets in '{:.2f}%'
  • DISC-O
    DISC-O almost 4 years
    Is there a way to display a column as percentage without converting it to a string?
  • Woody Pride
    Woody Pride almost 4 years
    see below answer which is better
  • Fed
    Fed over 3 years
    Had an issue with the index being non unique, so just had to df.reset_index(inplace= True) and then apply the .style.format. worked perfectly, thank you.
  • JuSTMOnIcAjUSTmONiCAJusTMoNICa
    JuSTMOnIcAjUSTmONiCAJusTMoNICa about 3 years
    @theFrok Note there's no f in '{:,.2%}'. @zwornik Apparently df.style.format doesn't change the DataFrame, but IPython renders the result as HTML. You can use df.style.format(...).render() to get the HTML yourself. Or, use df.apply directly, which is what happens internally, per documentation.
  • Sahar
    Sahar about 3 years
    The series should be converted to data frame first: df[num_cols].to_frame().style.format('{:,.3f}%')
  • fransua
    fransua almost 2 years
    In fact, when I use this answer, I get the message: AttributeError: 'Styler' object has no attribute 'head'