# Fastest method to generate big random string with lower Latin letters

12,975

## Solution 1

Here's Python 3 code that generates 1000000 "random" lowercase letters in `0.28` seconds (see also `0.11`-seconds solution at the end; @Ashwini Chaudhary's code from the question takes `0.55` seconds on my machine, @Markku K.'s code -- `0.53`):

``````#!/usr/bin/env python3
import os
import sys

def write_random_lowercase(n):
min_lc = ord(b'a')
len_lc = 26
ba = bytearray(os.urandom(n))
for i, b in enumerate(ba):
ba[i] = min_lc + b % len_lc # convert 0..255 to 97..122
sys.stdout.buffer.write(ba)

write_random_lowercase(1000000)
``````

`% len_lc` skews the distribution (see at the end on how to fix it) though It still satisfies the conditions (ascii, lowercase, frequencies of 1, 2, 3 letter sequences):

``````\$ python3 generate-random.py | python3 check-seq.py
``````

where `check-seq.py`:

``````#!/usr/bin/env python3
import sys
from collections import Counter
from string import ascii_lowercase

def main():
limits = [40000, 2000, 100]

s = sys.stdin.buffer.readline() # a single line
assert 1000000 <= len(s) <= 1000002 # check length +/- newline
s.decode('ascii','strict') # check ascii
assert set(s) == set(ascii_lowercase.encode('ascii')) # check lowercase

for n, lim in enumerate(limits, start=1):
freq = Counter(tuple(s[i:i+n]) for i in range(len(s)))
assert max(freq.values()) <= lim, freq

main()
``````

Note: on acm.timus.ru `generate-random.py` gives "Output limit exceeded".

To improve performance, you could use `bytes.translate()` method (`0.11` seconds):

``````#!/usr/bin/env python3
import os
import sys

# make translation table from 0..255 to 97..122
tbl = bytes.maketrans(bytearray(range(256)),
bytearray([ord(b'a') + b % 26 for b in range(256)]))
# generate random bytes and translate them to lowercase ascii
sys.stdout.buffer.write(os.urandom(1000000).translate(tbl))
``````

## How to fix `% len_lc` skew

`256` (number of bytes) is not evenly divisible by `26` (number of lower Latin letters) therefore the formula `min_lc + b % len_lc` makes some values appear less often than others e.g.:

``````#!/usr/bin/env python3
"""Find out skew: x = 97 + y % 26 where y is uniform from [0, 256) range."""
from collections import Counter, defaultdict

def find_skew(random_bytes):
char2freq = Counter(chr(ord(b'a') + b % 26) for b in random_bytes)
freq2char = defaultdict(set)
for char, freq in char2freq.items():
return {f: ''.join(sorted(c)) for f, c in freq2char.items()}

print(find_skew(range(256)))
# -> {9: 'wxyz', 10: 'abcdefghijklmnopqrstuv'}
``````

Here, the input `range(256)` is uniformly distributed (each byte occurs exactly once) but `'wxyz'` letters in the output are less often then the rest `9` vs. `10` occurrences. To fix it, unaligned bytes could be dropped:

``````print(find_skew(range(256 - (256 % 26))))
# -> {9: 'abcdefghijklmnopqrstuvwxyz'}
``````

Here, the input is uniformly distributed bytes in the range `[0, 234)` the output is uniformly distributed ascii lowercase letters.

`bytes.translate()` accepts the second argument to specify bytes to delete:

``````#!/usr/bin/env python3
import os
import sys

nbytes = 256
nletters = 26
naligned = nbytes - (nbytes % nletters)
tbl = bytes.maketrans(bytearray(range(naligned)),
bytearray([ord(b'a') + b % nletters
for b in range(naligned)]))
bytes2delete = bytearray(range(naligned, nbytes))
R = lambda n: os.urandom(n).translate(tbl, bytes2delete)

def write_random_ascii_lowercase_letters(write, n):
"""*write* *n* random ascii lowercase letters."""
while n > 0:
# R(n) expected to drop `(nbytes - nletters) / nbytes` bytes
# to compensate, increase the initial size
n -= write(memoryview(R(n * nbytes // naligned + 1))[:n])

write = sys.stdout.buffer.write
write_random_ascii_lowercase_letters(write, 1000000)
``````

If the random generator (`os.urandom` here) produces long sequences of the bytes that are outside of the aligned range (`>=234`) then the `while` loop may execute many times.

The time performance can be improved by another order of magnitude if `random.getrandbits(8*n).to_bytes(n, 'big')` is used instead of `os.urandom(n)`. The former uses Mersenne Twister as the core generator that may be faster than `os.urandom()` that uses sources provided by the operating system. The latter is more secure if you use the random string for secrets.

## Solution 2

Use `string.ascii_lowercase` instead of `chr` to generate lowercase charaters:

``````from sys import stdin
from random import choice
from string import ascii_lowercase

s = ''.join([choice(ascii_lowercase) for _ in range(1000000)])
stdout.write(s)
``````

Also writing to `stdout` directly appears to be faster, encoding yourself in python is not faster than having it all handled in the C code.

I also use a list comprehension; `str.join()` needs to scan through the input sequence twice, once to determine the length of the output, once to actually copy the input elements to output string. A list comprehension then beats out the slower generator-to-list code.

Just using `choice(ascii_lowercase)` over your method of generating each character from an integer is over twice as fast:

``````>>> timeit.timeit('f()', 'from __main__ import yours as f', number=3)
11.299837955011753
>>> timeit.timeit('f()', 'from __main__ import mine as f', number=3)
5.330044150992762
``````

You could try and avoid the `''.join()` overhead by writing individual characters directly to `stdout`:

``````from sys import stdout
from random import choice
from string import ascii_lowercase

for _ in range(1000000):
stdout.write(choice(ascii_lowercase))
``````

Next to try is to write raw bytes:

``````from sys import stdout
from random import choice
from string import ascii_lowercase
bal = [c.encode('ascii') for c in ascii_lowercase]
out = stdout.buffer

for _ in range(1000000):
out.write(choice(bal))
``````

but these are no improvements over `''.join()` in my tests.

Next we move to encoding the ASCII characters to bytes once, then using `bytes.join()`:

``````from sys import stdout
from random import choice
from string import ascii_lowercase

bal = [c.encode('ascii') for c in ascii_lowercase]
stdout.buffer.write(b''.join([choice(bal) for _ in range(1000000)]))
``````

`bal` is a list of lowercase ASCII characters encoded to bytes, from which we random pick 1 million items, join them to into a large byte string then write that in one go to the binary stdout buffer.

The bytes join is just as 'slow' as the string version:

``````>>> timeit.timeit('f()', 'from __main__ import bytes as f', number=3)
5.41390264898655
``````

but we encode 26 characters, not 1 million so the write stage is faster.

## Solution 3

My solution which just got accepted (python 2.7, Execution time: 0.984):

``````from random import choice
from string import ascii_lowercase

lis = list(ascii_lowercase)
print ''.join(choice(lis) for _ in xrange(1000000))
``````

Accessing elements of a list is faster is than for strings.

``````In [13]: from random import choice

In [14]: from string import ascii_lowercase

In [15]: lis = list(ascii_lowercase)

In [16]: %timeit ''.join(choice(lis) for _ in xrange(10**5))
1 loops, best of 3: 128 ms per loop

In [17]: %timeit ''.join(choice(ascii_lowercase) for _ in xrange(10**5))
1 loops, best of 3: 134 ms per loop
``````

And you don't need `stdout` or `stdin` here as most online judges us something like this to test your script:

``````\$python script.py <in.txt >out.txt
``````

So you can use `print` instead of `stdout` and `raw_input()` instead of `stdin`, though for huge inputs `stdin.readline` is faster than `raw_input()`.

Update 1:

Using @Markku's tip execution time was reduced to .64 in py2.7:

``````from random import random
from string import ascii_lowercase

lis = list(ascii_lowercase)
print "".join( [lis[int(random() * 26)] for _ in xrange(1000000)] )
``````

## Solution 4

I get a huge speed improvement by changing from randint(0,25) to int(random()*25) in your original solution. On my machine, the time went from about 2 seconds, to about 0.6 seconds. If you take a look at the random.py code, you will see that randint is full of checks that you don't want or need.

update: Oops, off by one. You need int(random()*26). Thanks Ashwini

## Solution 5

Try turning some part of it into C++ or another compiled language. That will almost guaranteed make it faster. Python, unfortunately, isn't too fast, especially when it comes to things like this. Try C++, C, or Pascal.

EDIT: Also see the Python Performance Tips

Share:
12,975

Author by

### ilalex

Updated on September 16, 2022

• ilalex 5 months

I'm trying to solve this problem from Timus Online Judge. To solve this problem you need generate a sequence of 1 000 000 lowercase Latin letters and write it to stdin in 1 second.

It is easy to solve this problem with C++ or Java. I have python solution here:

``````import os
from random import randint

s = ''.join(chr(97 + randint(0, 25)) for i in range(1000000))
os.write(1, bytes(s, 'utf8'))
``````

It takes 1.7s:

``````\$ time python3.3 1219.py > /dev/null

real    0m1.756s
user    0m1.744s
sys     0m0.008s
``````

And I got "Time limit exceeded" in result. So the question is "How to do it faster?"

UPD1: Using `randint(97, 122)` reduces time at 16ms. Now it is 1.740s

UPD2: Solution by @Martijn Pieters takes 0.979s, but it doesn't pass test either.

UPD3 Martijn Pieters suggested a very good solutions, but it's still slow:

``````from sys import stdin
from random import choice
from string import ascii_lowercase

s = ''.join([choice(ascii_lowercase) for _ in range(1000000)])
stdout.write(s)
``````

Takes 0.924s

``````from sys import stdout
from random import choice
from string import ascii_lowercase

for _ in range(1000000):
stdout.write(choice(ascii_lowercase))
``````

Takes 1.173s

``````from sys import stdout
from random import choice
from string import ascii_lowercase
bal = [c.encode('ascii') for c in ascii_lowercase]
out = stdout.buffer

for _ in range(1000000):
out.write(choice(bal))
``````

Takes 1.155s

``````from sys import stdout
from random import choice
from string import ascii_lowercase

bal = [c.encode('ascii') for c in ascii_lowercase]
stdout.buffer.write(b''.join([choice(bal) for _ in range(1000000)]))
``````

Takes 0.901s

UPD4

Some guy just solved problem on Timus. I hope he will share his solution :)

UPD5 Thanks to Ashwini Chaudhary for sharing his Python 2.x solution with us:

``````from random import choice
from string import ascii_lowercase
lis=list(ascii_lowercase)
print ''.join(choice(lis) for _ in xrange(1000000))
``````

It takes 0.527s on my computer and it passes tests on Timus. But problem with Python3.x still remains.

UPD6 Thanks to Markku K. this code:

``````import os
from random import random
from string import ascii_lowercase

bal = [c.encode('ascii') for c in ascii_lowercase]
os.write(1, b''.join([bal[int(random() * 26)] for _ in range(1000000)]))
``````

Takes 0.445s, but still didn't pass the test

• Ashwini Chaudhary
• Fred Foo
`randint(97, 122)` might be a small timesaver over `97 + randint(0, 25)`. Even addition isn't cheap in Python because it involves typechecks.
• mgilson
Use a list-comprehension instead of a generator expression. That can sometimes save a bit when using `join`. (and `join` turns your generator into a list or a tuple anyway).
• ilalex almost 10 years
I can do it in C++. I want to know: is there way to do it Python?
• kirbyfan64sos almost 10 years
@ilalex: See the Python Performance Tips.
• ilalex almost 10 years
How to use list comprehension if I need a string to write in stdout?
• Martijn Pieters almost 10 years
@ilalex: `stdout` you mean? It encodes the unicode string to bytes for you depending on the output encoding. In this case you only generate ASCII, so it's fine.
• ilalex almost 10 years
I'm trying to use list comprehension in that way: for a in [choice(ascii_lowercase) for _ in range(1000000)]: stdout.write(a) But it takes more time than your code above.
• Martijn Pieters almost 10 years
@ilalex: use `for _ in range(1000000): stdout.write(choice(ascii_lowercase))` instead.
• ilalex almost 10 years
Avoiding ''.join() is slower.
• Martijn Pieters almost 10 years
@ilalex: Latest version uses `bytes.join()` instead.
• ilalex almost 10 years
Thank you! The latest is fastest.
• ilalex almost 10 years
I still have not solved this problem in python3.3, but I learned a lot thanks to you.
• kirbyfan64sos almost 10 years
I can't see why this woudn't work in Python 3, other than the print statement.
• ilalex almost 10 years
I wrote them a letter about this issue and they have confirmed that they have bug with definition OLE in Python.
• cfi almost 10 years
The fastest solution does exactly that: Moving as much functionality from explicit, interpreted code into Python's builtins and the stdlib - using `bytearray`, it's feature to internally call arbitrary constructors (again passing a builtin), and `maketrans`. J.F.'s trick is to not have a single loop coded in python but in C in Python's internals.
• cfi almost 10 years
Not a criticism, just a general comment: The trick with speed in any interpreted language is to move as much control logic as possible into builtins instead of using interpreted code. Note how J.F.'s fastest solution does not have a single loop implemented in Python in the timing critical code: `os.urandom` does the mem allocation, and random number generation and `str.translate` iterates over the numbers, transcribing them into the wanted output format (latin lowercase chars). The end result is similar to what kirbyfan64sos proposed: Write your code in C. I'd say: Know your stdlib! :-)
• jfs almost 10 years
@cfi: It is true bitwise operations on many bytes in Python code are much slower (x100-200 times) on CPython compared to C. Note: the fast implementation in Pypy, Jython, IronPython might look different.
• Charalamm almost 2 years
the other answers return am object or a file with the data outputed and also show the time it took to run the code