Average timedelta in list


Solution 1

Btw, if you have a list of timedeltas or datetimes, why do you even do any math yourself?

import datetime

datetimes = [ ... ]  # assumed sorted, oldest first

# subtracting datetimes gives timedeltas (later minus earlier, so the gaps are positive)
timedeltas = [datetimes[i] - datetimes[i-1] for i in range(1, len(datetimes))]

# giving datetime.timedelta(0) as the start value makes sum work on timedeltas
average_timedelta = sum(timedeltas, datetime.timedelta(0)) / len(timedeltas)
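As a quick check of this approach, here is a minimal sketch with made-up sample data. The three timestamps are hypothetical and chosen only for illustration, and the list is assumed to be sorted oldest first.

```python
import datetime

# Hypothetical sample data, sorted oldest to newest
datetimes = [
    datetime.datetime(2020, 1, 1, 12, 0, 0),
    datetime.datetime(2020, 1, 1, 12, 10, 0),
    datetime.datetime(2020, 1, 1, 12, 30, 0),
]

# Later minus earlier, so each gap is positive
timedeltas = [later - earlier for earlier, later in zip(datetimes, datetimes[1:])]

# sum() needs a timedelta start value; its default start of 0 is an int
average_timedelta = sum(timedeltas, datetime.timedelta(0)) / len(timedeltas)

print(average_timedelta)                  # 0:15:00
print(average_timedelta.total_seconds())  # 900.0
```

The gaps here are 10 and 20 minutes, so the average comes out as 15 minutes.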

Solution 2

Try this:

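from itertools import izip  # Python 2; on Python 3 use the built-in zip instead

def average(items):
    # (next - last) is the gap between each item and its predecessor, in whole seconds
    total = sum((next - last).seconds + (next - last).days * 86400
                for next, last in izip(items[1:], items))
    return total / (len(items) - 1)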
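On Python 3, where itertools.izip no longer exists, the same idea can be sketched with the built-in zip. This is an adaptation rather than the answer's original code: the name average_gap_seconds and the sample timestamps are made up for illustration, and total_seconds() is swapped in for the manual seconds/days arithmetic, so microseconds are counted as well.

```python
from datetime import datetime

def average_gap_seconds(items):
    """Mean gap in seconds between consecutive datetimes (hypothetical helper)."""
    if len(items) < 2:
        raise ValueError("need at least two datetimes")
    # zip pairs each item with its successor: (items[0], items[1]), ...
    total = sum((later - earlier).total_seconds()
                for earlier, later in zip(items, items[1:]))
    return total / (len(items) - 1)

# Hypothetical sample data: gaps of one day and two days
stamps = [datetime(2020, 1, 1), datetime(2020, 1, 2), datetime(2020, 1, 4)]
print(average_gap_seconds(stamps))  # 129600.0
```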

In my opinion, doing it like this is more readable. A comment for less mathematically inclined readers of your code might help explain how you are calculating each delta. For what it's worth, the single generator expression compiles to the fewest (and, I think, the fastest) opcode instructions of anything I looked at.

  # The way in your question compiles to....
  3           0 LOAD_CONST               1 (<code object <lambda> at 0xb7760ec0, file "scratch.py", line 3>)
              3 MAKE_FUNCTION            0
              6 STORE_DEREF              1 (delta)

  4           9 LOAD_GLOBAL              0 (sum)
             12 LOAD_CLOSURE             0 (items)
             15 LOAD_CLOSURE             1 (delta)
             18 BUILD_TUPLE              2
             21 LOAD_CONST               2 (<code object <genexpr> at 0xb77c0a40, file "scratch.py", line 4>)
             24 MAKE_CLOSURE             0
             27 LOAD_GLOBAL              1 (range)
             30 LOAD_CONST               3 (1)
             33 LOAD_GLOBAL              2 (len)
             36 LOAD_DEREF               0 (items)
             39 CALL_FUNCTION            1
             42 CALL_FUNCTION            2
             45 GET_ITER            
             46 CALL_FUNCTION            1
             49 CALL_FUNCTION            1
             52 STORE_FAST               1 (total)

  5          55 LOAD_FAST                1 (total)
             58 LOAD_GLOBAL              2 (len)
             61 LOAD_DEREF               0 (items)
             64 CALL_FUNCTION            1
             67 LOAD_CONST               3 (1)
             70 BINARY_SUBTRACT     
             71 BINARY_DIVIDE       
             72 STORE_FAST               2 (average)
             75 LOAD_CONST               0 (None)
             78 RETURN_VALUE        
None
  # Doing it with just one generator expression and itertools compiles to....

  4           0 LOAD_GLOBAL              0 (sum)
              3 LOAD_CONST               1 (<code object <genexpr> at 0xb777eec0, file "scratch.py", line 4>)
              6 MAKE_FUNCTION            0

  5           9 LOAD_GLOBAL              1 (izip)
             12 LOAD_FAST                0 (items)
             15 LOAD_CONST               2 (1)
             18 SLICE+1             
             19 LOAD_FAST                0 (items)
             22 CALL_FUNCTION            2
             25 GET_ITER            
             26 CALL_FUNCTION            1
             29 CALL_FUNCTION            1
             32 STORE_FAST               1 (total)

  6          35 LOAD_FAST                1 (total)
             38 LOAD_GLOBAL              2 (len)
             41 LOAD_FAST                0 (items)
             44 CALL_FUNCTION            1
             47 LOAD_CONST               2 (1)
             50 BINARY_SUBTRACT     
             51 BINARY_DIVIDE       
             52 RETURN_VALUE        
None

In particular, dropping the lambda lets us avoid making a closure, building a tuple, and loading two closures. Five functions get called either way. Of course, worrying about performance at this level is a bit ridiculous, but it is nice to know what's going on under the hood. The most important thing is readability, and I think doing it this way scores high on that as well.
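To reproduce a comparison like this yourself, the standard-library dis module can be used. The sketch below targets Python 3, so the opcode names and counts will differ from the Python 2 listings above and will vary between interpreter versions. The function names are made up for illustration, and the count covers only each function's own code object, not the nested lambda or generator expression.

```python
import dis
from datetime import datetime

def average_with_lambda(items):
    # Index-based variant, modelled on the code in the question
    delta = lambda last, nxt: (nxt - last).seconds + (nxt - last).days * 86400
    total = sum(delta(items[i - 1], items[i]) for i in range(1, len(items)))
    return total / (len(items) - 1)

def average_with_zip(items):
    # Single generator expression over consecutive pairs
    total = sum((nxt - last).seconds + (nxt - last).days * 86400
                for nxt, last in zip(items[1:], items))
    return total / (len(items) - 1)

def instruction_count(func):
    # dis.get_instructions yields one Instruction per bytecode operation
    return sum(1 for _ in dis.get_instructions(func))

for func in (average_with_lambda, average_with_zip):
    print(func.__name__, instruction_count(func))

# dis.dis(average_with_zip) prints the full listing for a single function
```

For actual timing rather than instruction counts, the timeit module is the more direct tool.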


Author: shinn

Updated on May 02, 2022

Comments

  • shinn
    shinn almost 2 years

    I want to calculate the average timedelta between dates in a list. Although the following works well, I'm wondering if there's a smarter way?

    delta = lambda last, next: (next - last).seconds + (next - last).days * 86400   
    total = sum(delta(items[i-1], items[i]) for i in range(1, len(items)))
    average = total / (len(items) - 1)
    
    • aaronasterling
      aaronasterling over 13 years
      adding one more 0 to the end of 8640 would be a good start ;)
  • shinn
    shinn over 13 years
    Yeah, that's much better. Thanks!
  • aaronasterling
    aaronasterling over 13 years
    +1 Because neither OP nor I knew that was a possibility. datetime crap is even more boring than strings ;)
  • aaronasterling
    aaronasterling over 13 years
    @shinn, if you accept THC4k's answer, then I can delete this one.
  • shinn
    shinn over 13 years
    You shouldn't delete it. I like the way with izip.
  • shinn
    shinn over 13 years
    I'll take your way to calculate the average and aaronasterling's to get the deltas. Thanks =)
  • abukaj
    abukaj about 7 years
    It is not very Pythonic to iterate over indices. I would go with: [a - b for a, b in zip(datetimes[:-1], datetimes[1:])]
  • sachleen
    sachleen over 6 years
    In this example it should be datetimes[i]-datetimes[i-1]