Fastest way to parse a YYYYMMdd date in Java

Solution 1

As the timings below show, the performance of the date parsing only becomes relevant when you look at millions of iterations. You should instead choose a solution that is easy to read and maintain.

Although you could use SimpleDateFormat, it is not thread-safe (a shared instance cannot safely be used by several threads at once), so it should be avoided. The best solution is to use the great Joda-Time classes:

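import java.util.Date;
import org.joda.time.format.DateTimeFormatter;
import org.joda.time.format.DateTimeFormatterBuilder;
...
private static final DateTimeFormatter DATE_FORMATTER = new DateTimeFormatterBuilder()
        .appendYear(4, 4).appendMonthOfYear(2).appendDayOfMonth(2).toFormatter();
...
Date date = DATE_FORMATTER.parseDateTime(dateOfBirth).toDate();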
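For readers on Java 8 or later, a similar readable and thread-safe style is available without an extra dependency through java.time. The following is a minimal sketch and not part of the original answer: it assumes the built-in `DateTimeFormatter.BASIC_ISO_DATE` formatter (which reads yyyyMMdd) fits your input, and the class and method names are purely illustrative.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper class, named here only for illustration.
class BasicIsoDateExample {
    // DateTimeFormatter is immutable and thread-safe, so a single shared instance is fine.
    private static final DateTimeFormatter DATE_FORMATTER = DateTimeFormatter.BASIC_ISO_DATE;

    static LocalDate parse(String yyyyMMdd) {
        // Throws DateTimeParseException for malformed input or impossible dates.
        return LocalDate.parse(yyyyMMdd, DATE_FORMATTER);
    }

    public static void main(String[] args) {
        LocalDate date = parse("20120405");
        System.out.println(date.getYear() + " " + date.getMonthValue() + " " + date.getDayOfMonth());
    }
}
```

Unlike the hand-written arithmetic, this rejects values such as month 13, at the cost of some speed.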

If we are talking about your math functions, the first thing to point out is that there were bugs in your math code, which I've fixed. That's the problem with doing it by hand. That said, the versions that process the string only once will be the fastest. A quick test run shows that:

year = Integer.parseInt(dateString.substring(0, 4));
month = Integer.parseInt(dateString.substring(4, 6));
day = Integer.parseInt(dateString.substring(6));

Takes ~800ms while:

int date = Integer.parseInt(dateString);
year = date / 10000;
month = (date % 10000) / 100; 
day = date % 100;
total += year + month + day;

Takes ~400ms.

However, again, you need to take into account that these timings are for 10 million iterations. This is a perfect example of premature optimization. I'd choose the version that is the most readable and the easiest to maintain. That's why the Joda-Time answer is the best.

Solution 2

SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
Date date = format.parse("20120405");
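Keep in mind that a single SimpleDateFormat instance must not be shared between threads. One common workaround is a per-thread instance. The following is only a sketch, assuming Java 8+ for `ThreadLocal.withInitial`; the class name is hypothetical and the non-lenient setting is my own choice, not part of the answer above.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical wrapper class, named here only for illustration.
class ThreadLocalDateParser {
    // Each thread lazily gets its own SimpleDateFormat, so no instance is ever shared.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> {
                SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
                format.setLenient(false); // assumption: reject values such as month 13
                return format;
            });

    static Date parse(String yyyyMMdd) throws ParseException {
        return FORMAT.get().parse(yyyyMMdd);
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(parse("20120405"));
    }
}
```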

Solution 3

I did a quick benchmark test where both methods were executed 1 million times each. The results clearly show that the modulo method is much faster, as Dilum Ranatunga predicted.

// Time is a stopwatch helper class that is not shown in this answer.
Time t = new Time();
t.startTiming();
for(int i=0;i<1000000;i++) {
    int year = Integer.parseInt(dateString.substring(0, 4));
    int month = Integer.parseInt(dateString.substring(4, 6));
    int day = Integer.parseInt(dateString.substring(6));
}
t.stopTiming();
System.out.println("First method: "+t.getElapsedTime());

Time t2 = new Time();
t2.startTiming();
for(int i=0;i<1000000;i++) {
    int date = Integer.parseInt(dateString);
    int y2 = date / 10000;
    int m2 = (date % 10000) / 100;
    int d2 = date % 100;
}
t2.stopTiming();
System.out.println("Second method: "+t2.getElapsedTime());

The results don't lie (in ms).

First method: 129
Second method: 53
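If you want to repeat the comparison yourself, here is a self-contained sketch that does not depend on the Time helper. It is an assumption-laden rewrite rather than the code that produced the numbers above: it uses `System.nanoTime()`, adds a warm-up pass, and accumulates the results in a field to make it harder for the JIT to discard the loop body. All names are hypothetical, and a harness such as JMH would give more trustworthy figures.

```java
import java.util.function.ToIntFunction;

// Illustrative benchmark harness; class and method names are hypothetical.
class ParseBenchmarkSketch {
    static long sink; // results are accumulated here so the parsed values are actually used

    // First method: three substrings, three parseInt calls.
    static int substringSum(String s) {
        int year = Integer.parseInt(s.substring(0, 4));
        int month = Integer.parseInt(s.substring(4, 6));
        int day = Integer.parseInt(s.substring(6));
        return year + month + day;
    }

    // Second method: one parseInt call, then integer division and modulo.
    static int moduloSum(String s) {
        int date = Integer.parseInt(s);
        int year = date / 10000;
        int month = (date % 10000) / 100;
        int day = date % 100;
        return year + month + day;
    }

    // Runs the parser the given number of times and returns the elapsed milliseconds.
    static long timeMillis(ToIntFunction<String> parser, String s, int iterations) {
        long total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            total += parser.applyAsInt(s);
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        sink += total;
        return elapsedMillis;
    }

    public static void main(String[] args) {
        String dateString = "20120405";
        // Warm-up so both methods have a chance to be JIT-compiled before timing.
        timeMillis(ParseBenchmarkSketch::substringSum, dateString, 100_000);
        timeMillis(ParseBenchmarkSketch::moduloSum, dateString, 100_000);
        System.out.println("First method: "
                + timeMillis(ParseBenchmarkSketch::substringSum, dateString, 1_000_000));
        System.out.println("Second method: "
                + timeMillis(ParseBenchmarkSketch::moduloSum, dateString, 1_000_000));
    }
}
```

The absolute numbers will differ from machine to machine, so only the ratio between the two methods is worth comparing.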

Solution 4

The second will certainly be faster, once you change mod to % and add missing semicolons and fix the divisor in the year calculation. That said, I'm finding it hard to picture the application where this is a bottleneck. Just how many times are you parsing YYYYMMdd dates into their components, without any need to validate them?
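To illustrate the validation point: if the components do need checking, one option is to keep the modulo arithmetic and hand the result to `java.time.LocalDate.of`, which rejects impossible dates. This is a sketch under my own assumptions (Java 8+, exactly eight ASCII digits expected, hypothetical class name), and the regular-expression check costs some of the speed the arithmetic gains.

```java
import java.time.LocalDate;

// Hypothetical helper class, named here only for illustration.
class ValidatedModuloParse {
    static LocalDate parse(String yyyyMMdd) {
        // Integer.parseInt alone would also accept a leading sign, so check the shape first.
        if (!yyyyMMdd.matches("[0-9]{8}")) {
            throw new IllegalArgumentException("Expected 8 digits: " + yyyyMMdd);
        }
        int date = Integer.parseInt(yyyyMMdd);
        int year = date / 10000;
        int month = (date % 10000) / 100;
        int day = date % 100;
        // Throws DateTimeException for values such as month 13 or 30 February.
        return LocalDate.of(year, month, day);
    }

    public static void main(String[] args) {
        System.out.println(parse("20120405"));
    }
}
```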

Solution 5

How about (but it would parse an invalid date without saying anything...):

public static void main(String[] args) throws Exception {
    char zero = '0';
    int yearZero = zero * 1111;
    int monthAndDayZero = zero * 11;
    String s = "20120405";
    int year = s.charAt(0) * 1000 + s.charAt(1) * 100 + s.charAt(2) * 10 + s.charAt(3) - yearZero;
    int month = s.charAt(4) * 10 + s.charAt(5) - monthAndDayZero;
    int day = s.charAt(6) * 10 + s.charAt(7) - monthAndDayZero;
}

Doing a quick and dirty benchmark with a 100,000-iteration warm-up and 10,000,000 timed iterations, I get:

  • 700ms for your first method
  • 350ms for your second method
  • 10ms with my method.
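If the silent acceptance of invalid input is a concern, the charAt approach can be given a cheap per-character check. The following is a rough sketch, not the code that was benchmarked above: the class and method names are hypothetical, it subtracts '0' per character instead of once per field, and it only verifies that the input is eight ASCII digits (it does not check that the month or day is in range).

```java
// Hypothetical helper class, named here only for illustration.
class CharAtDateParser {
    // Returns the digit value at the given index, or throws if the character is not 0-9.
    static int digit(String s, int index) {
        int d = s.charAt(index) - '0';
        if (d < 0 || d > 9) {
            throw new IllegalArgumentException("Not a digit at index " + index + ": " + s);
        }
        return d;
    }

    // Returns {year, month, day}; range checks on month and day are deliberately left out.
    static int[] parse(String s) {
        if (s.length() != 8) {
            throw new IllegalArgumentException("Expected 8 characters: " + s);
        }
        int year = digit(s, 0) * 1000 + digit(s, 1) * 100 + digit(s, 2) * 10 + digit(s, 3);
        int month = digit(s, 4) * 10 + digit(s, 5);
        int day = digit(s, 6) * 10 + digit(s, 7);
        return new int[] {year, month, day};
    }

    public static void main(String[] args) {
        int[] parts = parse("20120405");
        System.out.println(parts[0] + " " + parts[1] + " " + parts[2]);
    }
}
```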
Author: user3001

Updated on December 14, 2020

Comments

  • user3001
    user3001 over 3 years

    When parsing a YYYYMMdd date, e.g. 20120405 for 5th April 2012, what is the fastest method?

    int year = Integer.parseInt(dateString.substring(0, 4));
    int month = Integer.parseInt(dateString.substring(4, 6));
    int day = Integer.parseInt(dateString.substring(6));
    

    vs.

    int date = Integer.parseInt(dateString);
    year = date / 10000;
    month = (date % 10000) / 100; 
    day = date % 100;
    

    The month uses mod 10000 because date % 10000 yields MMdd, and dividing that result by 100 yields MM.
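The arithmetic can be traced step by step for the example date 20120405. This is only an illustrative sketch; the class and method names are hypothetical.

```java
// Hypothetical helper class, named here only to walk through the arithmetic.
class ModuloSplitExample {
    // Splits an int of the form yyyyMMdd into {year, month, day}.
    static int[] split(int date) {
        int year = date / 10000;      // 20120405 / 10000 = 2012 (integer division drops MMdd)
        int monthDay = date % 10000;  // 20120405 % 10000 = 405  (MMdd; an int keeps no leading zero)
        int month = monthDay / 100;   // 405 / 100 = 4
        int day = date % 100;         // 20120405 % 100 = 5
        return new int[] {year, month, day};
    }

    public static void main(String[] args) {
        int[] parts = split(20120405);
        System.out.println(parts[0] + " " + parts[1] + " " + parts[2]);
    }
}
```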

    In the first example we do 3 String operations and 3 "parse to int", in the second example we do many things via modulo.

    What is faster? Is there an even faster method?

  • Louis Wasserman
    Louis Wasserman about 12 years
    +1 for the right way to do this.
  • Java Drinker
    Java Drinker about 12 years
    This is the way to go; parsing a date string should not require performance optimization unless you've determined that you're doing this something like >10 million times in a loop for every request (in which case, you should wonder why).
  • assylias
    assylias about 12 years
    @nim not sure what you mean - year is 2012 after the calculation.
  • TestEngineer
    TestEngineer about 12 years
    This is a classic example of know your tools.
  • user3001
    user3001 about 12 years
    The Java date API is often too slow.
  • Stephen C
    Stephen C about 12 years
    +1 - for pointing out that the OP is probably wasting his time looking for the fastest solution.
  • Nim
    Nim about 12 years
    Ignore my comment, I didn't see the adjustment yearZero etc..
  • Stephen C
    Stephen C about 12 years
    +1 - for pointing out the flaws in the OP's approach. I just hope that the OP understands ...
  • TestEngineer
    TestEngineer about 12 years
    @user3001 Out of curiosity, when have you found it too slow? It's not the best designed API (understatement), but I've used it for years without performance issues.
  • Alderath
    Alderath about 12 years
    In almost all normal situations I would prefer the modulo solution posted by the OP, even if this is faster. Why? Because you grasp what is happening in a few seconds when seeing that code. Your code is a little bit more clever, but therefore also takes more time to understand, which is a disadvantage. And I doubt there are many situations where date conversion is the performance bottleneck.
  • assylias
    assylias about 12 years
    @Alderath Completely agree - I would never include what I posted in my code! But it does answer the question!
  • user3001
    user3001 about 12 years
    Take a look at the GregorianCalendar mess. I wrote my own Date class and it is over 300 times faster, if I remember the numbers correctly :) So I try to avoid the rest of the API too, because it might perform equally badly.
  • ytoledano
    ytoledano almost 9 years
    This is quite slow and it's noticeable in loops far smaller than 10M
  • Gray
    Gray over 7 years
    Also, SimpleDateFormat is not thread-safe, so this won't work if it is used by multiple threads unless you create a ThreadLocal<SimpleDateFormat>
  • user3001
    user3001 over 7 years
    I know this is old, but I corrected the question to avoid confusion.