How to match accented characters with a regex?
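In Ruby 1.9, \w matches only ASCII word characters, so accented letters are skipped. Instead of \w, use the POSIX bracket expression [:alpha:], which also matches accented letters: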
"blåbær dèjá vu".scan /[[:alpha:]]+/ # => ["blåbær", "dèjá", "vu"]
"blåbær dèjá vu".scan /\w+/ # => ["bl", "b", "r", "d", "j", "vu"]
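One difference worth keeping in mind: \w also covers digits and the underscore, while [:alpha:] covers letters only. The sketch below is a minimal illustration of that difference; the sample string and variable names are made up for the example, and [:alnum:] is shown only as one possible way to keep digits.

```ruby
# encoding: utf-8
# Sample input is hypothetical, chosen to contain letters, digits and "_".
text = "blåbær 42 snake_case"

# Letters only (accented letters included); digits and "_" split the runs.
letters = text.scan(/[[:alpha:]]+/)
# => ["blåbær", "snake", "case"]

# Letters, digits and underscore: roughly what \w covers, but not ASCII-only.
word_like = text.scan(/[[:alnum:]_]+/)
# => ["blåbær", "42", "snake_case"]
```

If names in your application may contain digits, swapping \w for [:alpha:] alone would start rejecting them.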
In your particular case, change the regex to this:
NAME_REGEX = /^[[:alpha:]\s'"\-_&@!?()\[\]-]*$/u
This matches much more than just accented characters, though, which is a good thing. Make sure you read this blog entry about common misconceptions regarding names in software applications.
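As a quick sanity check, the regex can be tried against the names from the question. This is only a sketch: the `matches` lambda, `STRICT_NAME_REGEX`, and the sample inputs other than the three names are hypothetical and not part of the original answer. The stricter variant is included because in Ruby ^ and $ anchor each line rather than the whole string, so \A and \z may be the safer choice for validation.

```ruby
# encoding: utf-8

# The regex from the answer, unchanged.
NAME_REGEX = /^[[:alpha:]\s'"\-_&@!?()\[\]-]*$/u

# Hypothetical stricter variant: \A and \z anchor the whole string,
# whereas ^ and $ anchor each line.
STRICT_NAME_REGEX = /\A[[:alpha:]\s'"\-_&@!?()\[\]-]*\z/u

# Hypothetical helper: true when the regex matches the given name.
matches = lambda { |regex, name| !(regex =~ name).nil? }

# The names from the question now pass.
accented = ["Oilalà", "Pì", "Rùby"].map { |n| matches.call(NAME_REGEX, n) }
# => [true, true, true]

# Digits are no longer accepted, because [:alpha:] is letters only.
digits = matches.call(NAME_REGEX, "R2-D2")
# => false

# A second line with characters outside the class slips past ^...$
# because the first line alone satisfies the pattern.
multiline = "Rùby\n<script>"
loose  = matches.call(NAME_REGEX, multiline)         # => true
strict = matches.call(STRICT_NAME_REGEX, multiline)  # => false
```

Whether to adopt the \A...\z form depends on whether multi-line names should be rejected; for a single-line name field that is usually the intent.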
Author: user502052
Updated on July 10, 2022
I am running Ruby on Rails 3.0.10 and Ruby 1.9.2. I am using the following Regex in order to match names:
NAME_REGEX = /^[\w\s'"\-_&@!?()\[\]-]*$/u
validates :name, :presence => true, :format => { :with => NAME_REGEX, :message => "format is invalid" }
However, if I try to save names like the following:
Oilalà Pì Rùby ... # in short, those with accented characters
I get the validation error "Name format is invalid".
How can I change the above regex so that it also matches accented characters such as à, è, é, ì, ò, ù, ...?