# Check if any (all) character of a string is in a given range

## Solution 1

A `set` speeds up the check because its membership test is `O(1)` on average. This pays off most when you check several strings against the same range, since building the set costs one iteration of its own. You can then use `all` for the early-exit iteration pattern, which fits better than `any` here:

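``````python
import string

# Names follow the original answer; note that `ascii` shadows the built-in ascii().
ascii = set(string.ascii_uppercase)
ascii_all = set(string.ascii_uppercase + string.ascii_lowercase)

my_string1 = 'Австрия'  # sample string from the question

if all(x in ascii for x in my_string1):
    # every character of my_string1 is an uppercase ASCII letter
    print('all characters are in range')
``````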

Of course, any `all` construct can be transformed into an `any` via De Morgan's laws:

``````python
if not any(x not in ascii for x in my_string1):
    # no character of my_string1 falls outside the `ascii` set
    print('all characters are in range')
``````

## Update:

A good pure set-based approach that does not require a complete iteration, as pointed out by Artyer:

``````python
if ascii.issuperset(my_string1):
    # every character of my_string1 is contained in the `ascii` set
    print('all characters are in range')
``````
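The variants above can be wrapped in two small helpers, one for the "all" case and one for the "any" case from the title. This is only a minimal sketch: the names `LETTERS`, `all_in` and `any_in` are illustrative and do not come from the original answer, and the default alphabet (ASCII letters of both cases) is an assumption.

```python
import string

# Assumed default alphabet: ASCII letters of both cases (illustrative name).
LETTERS = frozenset(string.ascii_letters)

def all_in(text, allowed=LETTERS):
    """Return True if every character of text is in allowed."""
    # all() stops at the first character that is not allowed.
    return all(ch in allowed for ch in text)

def any_in(text, allowed=LETTERS):
    """Return True if at least one character of text is in allowed."""
    # any() stops at the first character that is allowed.
    return any(ch in allowed for ch in text)

print(all_in('tralala'))   # True
print(all_in('AustriЯ'))   # False
print(any_in('AustriЯ'))   # True
print(any_in('Австрия'))   # False
```

Note that `all_in('')` returns `True` and `any_in('')` returns `False`, which is the usual behaviour of `all` and `any` on an empty iterable.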

## Solution 2

Another way, along the lines of what @schwobaseggl suggests, but using only set methods:

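``````python
import string

ascii = string.ascii_uppercase + string.ascii_lowercase
my_string = 'AustriЯ'  # sample string from the question
if set(my_string).issubset(ascii):
    # my_string contains only ASCII letters
    print('all characters are ASCII letters')
``````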

## Solution 3

`re` appears to be quite fast:

``````python
import re

# sample strings from the question
myString1 = 'Австрия'
myString2 = 'AustriЯ'

# to check whether any outside ranges (-> match object) / all in ranges (-> None)
nonletter = re.compile('[^a-zA-Z]').search

# to check whether any in ranges (-> match object) / all outside ranges (-> None)
letter = re.compile('[a-zA-Z]').search

bool(nonletter(myString1))
# True

bool(nonletter(myString2))
# True

bool(nonletter(myString2[:-1]))
# False
``````
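For the "all characters in range" case, the same test can also be phrased positively with `fullmatch`, which only succeeds when the whole string matches the pattern. This is a minimal sketch that is not part of the original answer and is not covered by the benchmark below; the names `only_letters` and `all_letters` are illustrative.

```python
import re

# Zero or more ASCII letters and nothing else (so '' also counts as a match).
only_letters = re.compile(r'[a-zA-Z]*').fullmatch

def all_letters(text):
    """Return True if text consists solely of ASCII letters."""
    return only_letters(text) is not None

print(all_letters('Austri'))    # True
print(all_letters('AustriЯ'))   # False
```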

Benchmarks for the OP's two examples and a positive one (`set` is @schwobaseggl's approach, `setset` is @DanielSanchez's):

``````
Австрия
re               0.48832818 ± 0.09022105 µs
set              0.58745548 ± 0.01759877 µs
setset           0.81759223 ± 0.03595184 µs
AustriЯ
re               0.51960442 ± 0.01881561 µs
set              1.03043942 ± 0.02453405 µs
setset           0.54060076 ± 0.01505265 µs
tralala
re               0.27832978 ± 0.01462306 µs
set              0.88285526 ± 0.03792728 µs
setset           0.43238688 ± 0.01847240 µs
``````

Benchmark code:

``````python
import types
from timeit import timeit
import re
import string
import numpy as np

def mnsd(trials):
    # format mean ± standard deviation in microseconds
    return '{:1.8f} \u00b1 {:10.8f} \u00b5s'.format(np.mean(trials), np.std(trials))

nonletter = re.compile('[^a-zA-Z]').search
letterset = set(string.ascii_letters)

def f_re(stri):
    return not nonletter(stri)

def f_set(stri):
    return all(x in letterset for x in stri)

def f_setset(stri):
    return set(stri).issubset(letterset)

for stri in ('Австрия', 'AustriЯ', 'tralala'):
    ref = f_re(stri)  # reference result that the other functions must reproduce
    print(stri)
    # run every module-level function whose name starts with 'f_'
    for name, func in list(globals().items()):
        if not name.startswith('f_') or not isinstance(func, types.FunctionType):
            continue
        try:
            assert ref == func(stri)
            # timeit returns seconds per 1000 calls; * 1000 gives µs per call
            print("{:16s}".format(name[2:]), mnsd([timeit(
                'f(stri)', globals={'f':func, 'stri':stri}, number=1000) * 1000 for i in range(1000)]))

        except Exception:
            print("{:16s} apparently failed".format(name[2:]))
``````

## Solution 4

There's no way to avoid iterating. However, you can make each check cheaper by testing `not 65 <= ord(s) <= 90` (the same bounds as `range(65, 91)`, whose end is exclusive) rather than testing membership in a range.
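A minimal sketch of that idea, using the same bounds as the question's `range(65, 91)`, i.e. uppercase `A` to `Z`. The function name `all_uppercase_ascii` is illustrative. `all` still iterates, but it stops at the first character outside the range:

```python
def all_uppercase_ascii(text):
    """Return True if every character is in 'A'..'Z' (code points 65..90)."""
    # The chained comparison avoids creating a range object for each character.
    return all(65 <= ord(ch) <= 90 for ch in text)

print(all_uppercase_ascii('AUSTRIA'))   # True
print(all_uppercase_ascii('AustriЯ'))   # False
```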

Author by

### Mikhail_Sam

Updated on June 15, 2022

• Mikhail_Sam over 1 year

I have a string containing Unicode symbols (Cyrillic):

``````python
myString1 = 'Австрия'
myString2 = 'AustriЯ'
``````

I want to check whether all the characters in the string are English (ASCII). Currently I'm using a loop:

``````python
for s in myString1:
    if ord(s) not in range(65, 91):
        break
``````

So I break out of the loop as soon as I find the first non-English character. But as the given example shows, a string can contain many English characters followed by Unicode at the end, in which case I end up checking the whole string. Furthermore, if the whole string is English, I still check every character.

Is there any more efficient way to do this? I'm thinking about something like:

``````
if any(myString[:]) is not in range(65,91)
``````