Fastest way for line-by-line reading STDIN?
Looked inside BufferedReader#readLine
source. There're several problems I see:
- It uses StringBuffer instead of StringBuilder, which creates synchronization overhead.
- Also there seems to be data copy overhead - not entirely sure, better check it out.
- Dedicated monitor object in the BufferedReader and even more synchronization overhead.
You may take your chances with two things:
- Writing your own buffering, which could save some time on double copying of the data.
- Writing your own nextLine method that would use StringBuilder and go over source data with simple cycle.
Maarten Kesselaers
Updated on August 10, 2020Comments
-
Maarten Kesselaers over 3 years
I'm looking for the most time-efficient way to read STDIN line-by-line.
The first line is the number of conditions to test. All the following lines are conditions (strings) with a maximum of 100 000 characters.
I have already tried the following (plus result for 4 times 90 000 characters:
Scanner with a while-loop (7255 ms)
Scanner sc = new Scanner(System.in); int numberOfLines = Integer.parseInt(sc.nextLine()); long start = 0; int i = 1; while (i<=numberOfLines){ start = System.currentTimeMillis(); sc.nextLine(); Debug.println((System.currentTimeMillis()-start) + "ms for scanner while"); i++; }
- Results :
- 3228ms for scanner while
- 2264ms for scanner while
- 1309ms for scanner while
- 454ms for scanner while
- Results :
Scanner with a for-loop (7078 ms)
Scanner sc = new Scanner(System.in); int numberOfLines = Integer.parseInt(sc.nextLine()); long start = 0; for (int i = 1; i<= numberOfLines;i++){ start = System.currentTimeMillis(); sc.nextLine(); Debug.println((System.currentTimeMillis()-start) + "ms for scanner for"); //i++; }
- Results :
- 3168ms for scanner for
- 2207ms for scanner for
- 1236ms for scanner for
- 467ms for scanner for
- Results :
BufferedReader with a for-loop (7403 ms)
try { BufferedReader br = new BufferedReader(new InputStreamReader(System.in)); int numberOfLines = Integer.parseInt(br.readLine()); long start = 0; for (int i = 0; i< numberOfLines;i++){ start = System.currentTimeMillis(); br.readLine(); Debug.println((System.currentTimeMillis()-start) + "ms for bufferreader for"); //i++; } } catch (Exception e) { System.err.println("Error:" + e.getMessage());
}
- Results :
- 3273ms for bufferreader for
- 2330ms for bufferreader for
- 1293ms for bufferreader for
- 507ms for bufferreader for
- Results :
BufferedReader with a while-loop (7461 ms)
try { BufferedReader br = new BufferedReader(new InputStreamReader(System.in)); int numberOfLines = Integer.parseInt(br.readLine()); int i=0; long start = 0; while(i< numberOfLines){ start = System.currentTimeMillis(); br.readLine(); Debug.println((System.currentTimeMillis()-start) + "ms for bufferreader while"); i++; } } catch (Exception e) { System.err.println("Error:" + e.getMessage());
}
- Results :
- 3296ms for bufferreader while
- 2358ms for bufferreader while
- 1307ms for bufferreader while
- 500ms for bufferreader while
- Results :
While debugging the time taken, i noticed that the time-taken decreases after each read. Is it possible to restrict the bytes that are initialized (f.e. : If you have a maximum of 100.000 chars, limit the scanner/bufferedreader to only initialize 100 000 chars. After a read it will need to refill itself with the next 100 000 chars)
Any ideas on this matter are more than welcome.
EDIT : Added the code for each scenario along with the time-taken per line read. Also changed 100.000 to 100 000 to read more easily.