Fastest way to insert these dashes in a Python string?

Solution 1

You could use .join() to clean it up a little bit:

d = c['date']
'-'.join([d[:4], d[4:6], d[6:]])
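
The same slicing idea can be generalized to arbitrary cut positions. The following is only a minimal sketch; the helper name insert_sep and its parameters are hypothetical and not part of the answer above:

```python
def insert_sep(s, cuts=(4, 6), sep='-'):
    """Split s at the given offsets and join the pieces with sep."""
    bounds = [0, *cuts, len(s)]
    # Pair each boundary with the next one to get consecutive slices.
    pieces = [s[a:b] for a, b in zip(bounds, bounds[1:])]
    return sep.join(pieces)


print(insert_sep('20110104'))  # 2011-01-04
```

For a single fixed format such as YYYYMMDD, the explicit three-slice version above is probably easier to read.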

Solution 2

Dates are first-class objects in Python, with a rich interface for manipulating them. The relevant module is datetime.

>>> import datetime
>>> datetime.datetime.strptime('20110503','%Y%m%d').date().isoformat()
'2011-05-03'

Don't reinvent the wheel!
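
One practical benefit of going through datetime is that malformed input is rejected rather than silently reformatted. A small sketch, where the function name to_iso_date is an assumption made for illustration:

```python
import datetime


def to_iso_date(raw):
    """Convert 'YYYYMMDD' to 'YYYY-MM-DD', or return None if raw is not a valid date."""
    try:
        return datetime.datetime.strptime(raw, '%Y%m%d').date().isoformat()
    except ValueError:
        # strptime raises ValueError for dates that do not exist, such as 30 February.
        return None


print(to_iso_date('20110503'))  # 2011-05-03
print(to_iso_date('20110230'))  # None
```

Plain slicing would turn '20110230' into '2011-02-30' without complaint, so the extra parsing cost may be worth it when the input is not trusted.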

Solution 3

You are better off using string formatting than string concatenation:

c['date'] = '{}-{}-{}'.format(c['date'][0:4], c['date'][4:6], c['date'][6:])

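String concatenation is generally slower because, as you said above, strings are immutable, so each + builds a new intermediate string. For three short pieces like these, the difference is small.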
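
On Python 3.6 and later, an f-string expresses the same formatting more compactly. This is a sketch of an alternative, not what the answer above used; the dictionary c here is a stand-in for the asker's data:

```python
c = {'date': '20110104'}  # stand-in for the asker's record

d = c['date']
# Slice out year, month and day, and interpolate them around the dashes.
c['date'] = f'{d[:4]}-{d[4:6]}-{d[6:]}'

print(c['date'])  # 2011-01-04
```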

Solution 4

s = '20110104'


def option_1():
    return '-'.join([s[:4], s[4:6], s[6:]])

def option_1a():
    return '-'.join((s[:4], s[4:6], s[6:]))

def option_2():
    return '{}-{}-{}'.format(s[:4], s[4:6], s[6:])

def option_3():
    return '%s-%s-%s' % (s[:4], s[4:6], s[6:])

def option_original():
    return s[:4] + "-" + s[4:6] + "-" + s[6:]

Running %timeit on each yields these results:

  • option_1: 35.9 ns per loop
  • option_1a: 35.8 ns per loop
  • option_2: 36 ns per loop
  • option_3: 35.8 ns per loop
  • option_original: 36 ns per loop

So... pick the most readable, because the performance differences are marginal.
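
%timeit is an IPython feature. Outside IPython, a comparable measurement can be sketched with the standard timeit module. The dictionary keys below are made-up labels for the options above, and the numbers you get will depend on your machine and Python version:

```python
import timeit

s = '20110104'

# One zero-argument callable per approach being compared.
candidates = {
    'join_list': lambda: '-'.join([s[:4], s[4:6], s[6:]]),
    'join_tuple': lambda: '-'.join((s[:4], s[4:6], s[6:])),
    'str_format': lambda: '{}-{}-{}'.format(s[:4], s[4:6], s[6:]),
    'percent': lambda: '%s-%s-%s' % (s[:4], s[4:6], s[6:]),
    'concat': lambda: s[:4] + '-' + s[4:6] + '-' + s[6:],
}

CALLS = 50_000

for name, func in candidates.items():
    # Take the best of 5 runs and report nanoseconds per call.
    best = min(timeit.repeat(func, number=CALLS, repeat=5))
    print(f'{name}: {best / CALLS * 1e9:.1f} ns per call')
```

Wrapping each option in a lambda adds a small, roughly constant call overhead to every measurement, so the figures are best read relative to one another.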

Solution 5

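Add hyphens to a series of date strings (c.date is assumed here to be an indexable, column-like object) using datetime:

import datetime

# Reformat each 'YYYYMMDD' entry in place as 'YYYY-MM-DD'.
for i in range(len(c.date)):
    c.date[i] = datetime.datetime.strptime(c.date[i], '%Y%m%d').date().isoformat()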
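
If the dates live in an ordinary Python list rather than a column-like object, a list comprehension does the same job without index bookkeeping. A minimal sketch, with the variable names and sample values assumed purely for illustration:

```python
import datetime

dates = ['20110104', '20110503']  # hypothetical sample data

# Parse each 'YYYYMMDD' string and re-emit it in ISO 'YYYY-MM-DD' form.
iso_dates = [
    datetime.datetime.strptime(d, '%Y%m%d').date().isoformat()
    for d in dates
]

print(iso_dates)  # ['2011-01-04', '2011-05-03']
```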
Author: LittleBobbyTables

Updated on July 23, 2020

Comments

  • LittleBobbyTables, almost 4 years ago

    So I know Python strings are immutable, but I have a string:

    c['date'] = "20110104"
    

    Which I would like to convert to

    c['date'] = "2011-01-04"
    

    My code:

    c['date'] = c['date'][0:4] + "-" + c['date'][4:6] + "-" + c['date'][6:]
    

    Seems a bit convoluted, no? Would it be best to save it as a separate variable and then do the same? Or would there basically be no difference?

  • mgilson, over 11 years ago
    If performance is what OP means by "faster", I've found that '-'.join((d[:4],d[4:6],d[6:])) is marginally faster (i.e. tuple instead of a list).
  • LittleBobbyTables, over 11 years ago
    This looks very clean and pythonic to me :)
  • Abhishek Upadhyaya, over 3 years ago
    Simple and elegant solution!
  • Richard, over 3 years ago
    Thus: c['date'] = pd.to_datetime(c['date'], format='%Y%m%d'). See: stackoverflow.com/questions/26763344/…