Check if string has date, any format


Solution 1

The parse function in dateutil.parser is capable of parsing many date string formats to a datetime object.

If you simply want to know whether a particular string could represent or contain a valid date, you could try the following simple function:

from dateutil.parser import parse

def is_date(string, fuzzy=False):
    """
    Return whether the string can be interpreted as a date.

    :param string: str, string to check for date
    :param fuzzy: bool, ignore unknown tokens in string if True
    """
    try:
        parse(string, fuzzy=fuzzy)
        return True

    except (ValueError, OverflowError):
        # some inputs (e.g. very long digit strings) can raise OverflowError
        return False

Then you have:

>>> is_date("1990-12-1")
True
>>> is_date("2005/3")
True
>>> is_date("Jan 19, 1990")
True
>>> is_date("today is 2019-03-27")
False
>>> is_date("today is 2019-03-27", fuzzy=True)
True
>>> is_date("Monday at 12:01am")
True
>>> is_date("xyz_not_a_date")
False
>>> is_date("yesterday")
False

Custom parsing

parse might recognise some strings as dates which you don't want to treat as dates. For example:

  • Parsing "12" or "1999" will return a datetime object representing the current date with the day or the year, respectively, replaced by the number in the string.

  • "23, 4" and "23 4" will be parsed as datetime.datetime(2023, 4, 16, 0, 0), with the missing day taken from the current date.

  • "Friday" will return the date of the nearest Friday in the future.
  • Similarly "August" corresponds to the current date with the month changed to August.

Also, parse is not locale-aware, so it does not recognise months or days of the week in languages other than English.

Both of these issues can be addressed to some extent by using a custom parserinfo class, which defines how month and day names are recognised:

from dateutil.parser import parserinfo

class CustomParserInfo(parserinfo):

    # three months in Spanish for illustration
    MONTHS = [("Enero", "Enero"), ("Feb", "Febrero"), ("Marzo", "Marzo")]

An instance of this class can then be used with parse:

>>> parse("Enero 1990")
# ValueError: Unknown string format
>>> parse("Enero 1990", parserinfo=CustomParserInfo())
datetime.datetime(1990, 1, 27, 0, 0)
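
The false positives listed above can also be filtered without a custom parserinfo. The following is only a sketch, not part of dateutil: is_complete_date is a hypothetical helper that relies on parse filling any field missing from the string from its default argument. Parsing twice with two defaults that differ in year, month and day, and accepting the string only when both results agree, should reject inputs such as "12", "1999" or "August":

```python
from datetime import datetime

from dateutil.parser import parse


def is_complete_date(string, fuzzy=False):
    """
    Return True only if the string itself supplies year, month and day.

    Hypothetical helper (not part of dateutil): fields missing from the
    string are copied from `default`, so two different defaults give two
    different results whenever something is missing.
    """
    first_default = datetime(2000, 1, 1)
    second_default = datetime(2001, 2, 2)
    try:
        first = parse(string, fuzzy=fuzzy, default=first_default)
        second = parse(string, fuzzy=fuzzy, default=second_default)
    except (ValueError, OverflowError):
        # not parseable at all
        return False
    # identical results mean no date field came from the defaults
    return first == second
```

Note that this is stricter than is_date: partial dates such as "2005/3" or "Jan 1990" are rejected as well, so it only suits cases where a full year, month and day are required.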

Solution 2

If you want to parse those particular formats, you can just match against a list of formats:

txt='''\
Jan 19, 1990
January 19, 1990
Jan 19,1990
01/19/1990
01/19/90
1990
Jan 1990
January1990'''

import datetime as dt

# candidate formats, tried in order until one matches
fmts = ('%Y','%b %d, %Y','%B %d, %Y','%B %d %Y','%m/%d/%Y','%m/%d/%y','%b %Y','%B%Y','%b %d,%Y')

parsed=[]
for e in txt.splitlines():
    for fmt in fmts:
        try:
            t = dt.datetime.strptime(e, fmt)
            parsed.append((e, fmt, t))
            break
        except ValueError:
            pass

# check that all the cases are handled
success={t[0] for t in parsed}
for e in txt.splitlines():
    if e not in success:
        print(e)

for t in parsed:
    print('"{:20}" => "{:20}" => {}'.format(*t))

Prints:

"Jan 19, 1990        " => "%b %d, %Y           " => 1990-01-19 00:00:00
"January 19, 1990    " => "%B %d, %Y           " => 1990-01-19 00:00:00
"Jan 19,1990         " => "%b %d,%Y            " => 1990-01-19 00:00:00
"01/19/1990          " => "%m/%d/%Y            " => 1990-01-19 00:00:00
"01/19/90            " => "%m/%d/%y            " => 1990-01-19 00:00:00
"1990                " => "%Y                  " => 1990-01-01 00:00:00
"Jan 1990            " => "%b %Y               " => 1990-01-01 00:00:00
"January1990         " => "%B%Y                " => 1990-01-01 00:00:00

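If only a yes/no answer is needed, the loop above can be wrapped in a small function. This is a minimal sketch; matches_known_format and KNOWN_FORMATS are names made up for illustration, and the format list is simply the one used in this answer:

```python
from datetime import datetime

# Formats copied from the list used in this answer; extend as needed.
KNOWN_FORMATS = ('%Y', '%b %d, %Y', '%B %d, %Y', '%B %d %Y', '%m/%d/%Y',
                 '%m/%d/%y', '%b %Y', '%B%Y', '%b %d,%Y')


def matches_known_format(text, formats=KNOWN_FORMATS):
    """Return True if text matches at least one of the given strptime formats."""
    for fmt in formats:
        try:
            datetime.strptime(text, fmt)
        except ValueError:
            # this format does not fit, try the next one
            continue
        return True
    return False
```

Because strptime must consume the whole string, anything not covered by one of the listed formats (for example "1990-12-01") is reported as not a date until a matching format is added.
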
Author by

zack_falcon


Updated on July 05, 2022

Comments

  • zack_falcon almost 2 years

    How do I check if a string can be parsed to a date?

    • Jan 19, 1990
    • January 19, 1990
    • Jan 19,1990
    • 01/19/1990
    • 01/19/90
    • 1990
    • Jan 1990
    • January1990

    These are all valid dates. If there's any concern regarding the lack of space in between stuff in item #3 and the last item above, that can be easily remedied via automatically inserting a space in between letters/characters and numbers, if so needed.

    But first, the basics:

    I tried putting it in an if statement:

    if datetime.strptime(item, '%Y') or datetime.strptime(item, '%b %d %y') or datetime.strptime(item, '%b %d %Y')  or datetime.strptime(item, '%B %d %y') or datetime.strptime(item, '%B %d %Y'):
    

    But that's in a try-except block, and keeps returning something like this:

    16343 time data 'JUNE1890' does not match format '%Y'
    

    Unless, it met the first condition in the if statement.

    To clarify, I don't actually need the value of the date - I just want to know if it is. Ideally, it would've been something like this:

    if item is date:
        print date
    else:
        print "Not a date"
    

    Is there any way to do this?

    • Jeff Mercado almost 10 years
      It would just be easier to normalize all the dates to a single format. You can't expect to handle every single corner case correctly if you accept essentially free-form dates. That plus there's the issue of dealing with ambiguities in the dates.
    • tobias_k almost 10 years
      Without knowing the format, how would you parse a date like 04/06/08? Could be June 4 2008, or April 6 2008, or maybe June 8 2004...
    • zack_falcon almost 10 years
      @tobias_k, my apologies for not clarifying further; I've added some details to the question. It matters not what the parsed date looks like - I just need to know if it's a date.
    • dawg almost 10 years
      Is 'tomorrow' a date? Is 'later today' a date? If this is a non-trivial project, you might want to consider NSScanner (on OS X) which will parse those as dates
    • jscs almost 10 years
      @dawg: NSScanner is really just a lexer, and a barebones one at that; it has no inherent parsing ability. Python has the somewhat similar re.Scanner. NSDateFormatter would be the thing to use if you wanted to bring Cocoa in.
  • zack_falcon almost 10 years
    I've installed that extension. How do I import it? Also, could I use that datetime.datetime(2023, 4, 16, 0, 0) as a boolean of sorts, or in an if statement? Such that if it was parsed, it's a date, and if it wasn't, then it's not?
  • Alex Riley almost 10 years
    Hi Zack, to import the parse function just use from dateutil.parser import parse. I'm not sure quite what you mean in your second question... datetime objects will evaluate to True so you could use them with if statements if you liked.
  • zack_falcon almost 10 years
    If it's not too much to ask, could you also help me with the follow up to this question? It is ultimately, what I was going to use the code for. stackoverflow.com/questions/25347224/…
  • Alex Riley almost 10 years
    @zack_falcon - I added an answer detailing how I'd approach your other problem. Hope it helps!
  • citynorman over 6 years
    Is there a way to get .parse() to return the format string in addition to the datetime object?
  • Juan C. Roldán about 5 years
    This is the first result in Google when looking for "check string has date python", so let's give some insights about the parse function: a) it has many false positives, e.g. parse("4") will return a date; b) it tests the full string as a datetime, e.g. parse("Today is 2019-03-26") will raise a ValueError unless you use the fuzzy param; and c) it only works with English locales, e.g. august is understood but agosto is not.
  • Alex Riley about 5 years
    @JuanCarlos: thanks for bringing this to my attention. I've edited the answer to try and address the points you've raised.
  • David over 4 years
    amazing, been using dateutil for years and didn't realize it had a fuzzy option. thx @AlexRiley!
  • Abhishek Saxena over 4 years
    For instance "Hey, I am Bob and I was born on November 20th, 1930." How to customize to return true for this string or similar strings?
  • Anatoly Alekseev over 3 years
    Great, but I recommend removing ValueError, 'cause your func fails on, for example, is_date('m6061717610')
  • ToMakPo over 2 years
    What would I need to add to include times?
  • Ela782 about 2 years
    Even with the CustomParserInfo, parse("4") is still accepted as a valid date. Is there any better way to do this in 2022, while also avoiding a more complicated solution like @dawg's answer?
  • Ela782 about 2 years
    Never mind, @dawg's solution is much easier than it looks. A simple datetime.datetime.strptime(row[0], '%d %B %y') will show you whether a given date exactly corresponds to the given pattern.