Java - Regex for Full Name

49,296

Solution 1

This problem of identifying names is very culture-centric, and it really has no hope of working reliably. I would really recommend against this, as there is NO canonical form for what makes up a person's name in any country, anywhere on Earth that I know of. I could legally change my name to #&*∫Ω∆ Smith, and that's not going to fit into anyone's algorithm. It's not that this specific example is something that a lot of people do, but many people don't think outside of the ASCII table when considering data input, and that's going to lead to problems.

You can argue about how likely this is, but in a global world it's increasingly unlikely that ALL of your users will have English-transliterated spellings for their names. It's also very possible that you'll have users from cultures that don't have a concept of first/last name. Even if your application only runs in a given country, don't assume that none of your users come from other places (people move from country to country all the time, and some of them might want to use your software).

Protect your app against SQL injection for fields such as this (if you're storing these in a DB), and leave it at that.

Solution 2

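This method validates the name and returns false if it is empty or contains digits or special characters:

public static boolean isFullname(String str) {
    // One or more ASCII letters or whitespace characters. matches() tests the
    // whole string, so the empty string, digits and punctuation are rejected.
    String expression = "^[a-zA-Z\\s]+";
    return str.matches(expression);
}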
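
To see what this check accepts and rejects, here is a small sketch that runs the same pattern over a few inputs. The class name and the sample strings are only illustrative; the accented "e" is written as a Unicode escape to keep the source ASCII-only.

```java
class FullNameDemo {
    // Same pattern as above: one or more ASCII letters or whitespace characters.
    static boolean isFullname(String str) {
        return str.matches("^[a-zA-Z\\s]+");
    }

    public static void main(String[] args) {
        System.out.println(isFullname("John Smith"));         // true
        System.out.println(isFullname(""));                   // false: + needs at least one character
        System.out.println(isFullname("John Smith 3rd"));     // false: contains a digit
        System.out.println(isFullname("Ren\u00e9e Fleming")); // false: accented letter is not in a-zA-Z
        System.out.println(isFullname("John\tSmith"));        // true: \s also matches a tab
    }
}
```

Note that the last two results may not be what you want: a legitimate accented name is rejected, while a tab between the words is accepted.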

Solution 3

What Was Asked For

Instead of "^[a-zA-Z][ ]*$" (a single letter followed by any number of spaces) you want "^[a-zA-Z ]*$" (any mix of letters and spaces). Some answers reference \s, but you don't want that because it also matches other whitespace such as tabs.
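
A quick way to compare the patterns is to run them side by side. This is only a sketch; the class and constant names are made up for the illustration.

```java
class SpaceOnlyDemo {
    static final String ORIGINAL = "^[a-zA-Z][ ]*$";   // exactly one letter, then only spaces
    static final String FIXED    = "^[a-zA-Z ]*$";     // any mix of letters and plain spaces
    static final String WITH_S   = "^[a-zA-Z\\s]*$";   // letters and any whitespace

    public static void main(String[] args) {
        System.out.println("John Smith".matches(ORIGINAL));  // false
        System.out.println("J   ".matches(ORIGINAL));        // true
        System.out.println("John Smith".matches(FIXED));     // true
        System.out.println("John\tSmith".matches(FIXED));    // false: tab is not a space
        System.out.println("John\tSmith".matches(WITH_S));   // true: \s accepts the tab
        System.out.println("".matches(FIXED));               // true: * also accepts the empty string
    }
}
```

If an empty name should be rejected, replacing * with + is enough. Since String.matches() always tests the whole input, the ^ and $ anchors are optional here.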

Additional Common Examples

On a side note, some first and last names contain hyphens, like Mary-Ann for a first name or a hyphenated last name like Jones-Garcia. Some last names have periods, like St. Marc. Lastly, some last names contain an apostrophe, like O'Donnel.

Side Note

Legally, you could change your name to Little Bobby Drop Tables... or include other random characters... but I'm not sure how many systems really accommodate stuff like that.

If you want the general case (worldwide), then don't limit the fields by any character type, as you can have names in Greek letters, Cyrillic letters, Chinese characters, etc. There are also non-English accented characters and other things like the German umlaut.
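
To illustrate how quickly an ASCII-only pattern breaks on such names, here is a sketch that compares it with Java's Unicode letter and mark classes (\p{L} and \p{M}). The second pattern is my own assumption of what a looser check might look like, not something recommended above, and it still rejects legitimate names (for example, ones containing digits or numerals). The sample names are written as Unicode escapes so the source stays ASCII-only.

```java
class UnicodeNameDemo {
    static final String ASCII_ONLY = "[a-zA-Z ]+";
    // p{L} is any Unicode letter, p{M} any combining mark (accents stored separately).
    static final String UNICODE_LETTERS = "[\\p{L}\\p{M} .'-]+";

    static boolean asciiOnly(String s) { return s.matches(ASCII_ONLY); }
    static boolean unicodeLetters(String s) { return s.matches(UNICODE_LETTERS); }

    public static void main(String[] args) {
        String[] names = {
            "Ren\u00e9e Fleming",                               // accented e
            "J\u00fcrgen M\u00fcller",                          // German umlauts
            "\u0394\u03b7\u03bc\u03ae\u03c4\u03c1\u03b7\u03c2", // a Greek first name
            "\u738b\u5c0f\u660e"                                // a Chinese name
        };
        for (String n : names) {
            // Each of these prints ascii=false unicode=true.
            System.out.println(n + ": ascii=" + asciiOnly(n) + " unicode=" + unicodeLetters(n));
        }
    }
}
```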

If you are worried about SQL injection, use parameterized queries instead of dynamic queries.
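
As a minimal sketch of what a parameterized query looks like with plain JDBC: the table name users, the column full_name, and the class and method names are all hypothetical, and the Connection is assumed to come from wherever your application already gets it.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class NameStore {
    // Hypothetical table and column names, used only for illustration.
    static final String INSERT_SQL = "INSERT INTO users (full_name) VALUES (?)";

    // The name is bound as a parameter, so it is never spliced into the SQL text,
    // whatever characters it contains.
    static int saveFullName(Connection conn, String fullName) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(INSERT_SQL)) {
            stmt.setString(1, fullName);
            return stmt.executeUpdate();
        }
    }
}
```

With this in place the name field can accept any characters without the input being interpreted as SQL.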

Suggested Solution

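If you are only worried about English letters and want a regular expression that handles the sample cases above, you could use "^[a-zA-Z .'-]*$". Inside a character class the period and apostrophe need no escaping, and the hyphen is literal when it comes last. (Escaping them as \- and \. also works in a regex, but in a Java string literal each backslash would then have to be doubled.)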
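
Written as a Java string literal, the backslashes in a pattern like this either have to be doubled or can be dropped altogether, since the period and apostrophe are literal inside a character class and the hyphen is literal when it is the last character. A sketch using the unescaped form (the class and method names are just for illustration):

```java
class EnglishNameCheck {
    // Letters, space, period, apostrophe and hyphen; the hyphen is last so it is literal.
    static final String PATTERN = "^[a-zA-Z .'-]*$";

    static boolean isEnglishName(String s) {
        return s.matches(PATTERN);
    }

    public static void main(String[] args) {
        System.out.println(isEnglishName("Mary-Ann Jones-Garcia")); // true
        System.out.println(isEnglishName("St. Marc"));              // true
        System.out.println(isEnglishName("O'Donnel"));              // true
        System.out.println(isEnglishName("John Smith 3rd"));        // false
        System.out.println(isEnglishName(""));                      // true, because of the *
    }
}
```

Use + instead of * if the empty string should not count as a name.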

Solution 4

[A-Z] match a single capital letter (first letter of name)

[a-z]* match any number of small letters (the other letters of the name)

(\s) match 1 whitespace character (the space between names)

+ one or more of the previous expression (to match more than one name)

all together:

- matches first names / last name -
^([A-Z][a-z]*(\s))+[A-Z][a-z]*$
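
Here is a sketch of that pattern in Java, with a few inputs that show where it stops working. The class and method names are only illustrative.

```java
class CapitalizedNameDemo {
    // One or more "Capitalized word + whitespace" groups, then a final capitalized word.
    static final String PATTERN = "^([A-Z][a-z]*(\\s))+[A-Z][a-z]*$";

    static boolean isName(String s) {
        return s.matches(PATTERN);
    }

    public static void main(String[] args) {
        System.out.println(isName("John Smith"));       // true
        System.out.println(isName("John Paul Jones"));  // true: the group can repeat
        System.out.println(isName("Cher"));             // false: at least two words are required
        System.out.println(isName("Joe DiMaggio"));     // false: capital in the middle of a word
        System.out.println(isName("john smith"));       // false: words must start with a capital
    }
}
```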

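Or, to match names like DiMaggio, St. Croix, O'Reilly and Le-Pew, allow mixed case plus apostrophes, periods and hyphens. You can add similar characters, like the 'ᶜ' in MᶜKinley, as you remember them or come across people with those less common characters in their names. Note the explicit A-Za-z (A-z would also let in [, \, ], ^, _ and `) and the hyphen placed last so that it is not read as a range:

^([A-Za-z'.ᶜ-]*(\s))+[A-Za-z'.ᶜ-]*$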
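
Two character-class details are easy to trip over in patterns like this one: the range A-z covers the ASCII punctuation that sits between Z and a, and a hyphen between two characters forms a range rather than matching a literal hyphen. The sketch below shows both effects and a stricter variant; the class and constant names are made up, and the ᶜ is written as a Unicode escape.

```java
class CharClassPitfalls {
    // Loose class: A-z is a range, and so is the span from '.' up to the modifier letter c.
    static final String LOOSE_CLASS = "[A-z'.-\u1D9C]";
    // Strict class: explicit letter ranges, hyphen last so it is a literal hyphen.
    static final String STRICT_CLASS = "[A-Za-z'.\u1D9C-]";
    static final String STRICT_NAME =
        "^(" + STRICT_CLASS + "*(\\s))+" + STRICT_CLASS + "*$";

    public static void main(String[] args) {
        System.out.println("_".matches(LOOSE_CLASS));   // true: underscore lies between Z and a
        System.out.println("7".matches(LOOSE_CLASS));   // true: digits fall inside the '.' range
        System.out.println("_".matches(STRICT_CLASS));  // false
        System.out.println("7".matches(STRICT_CLASS));  // false
        System.out.println("-".matches(STRICT_CLASS));  // true: literal hyphen

        System.out.println("Joe DiMaggio".matches(STRICT_NAME)); // true
        System.out.println("Pepe Le-Pew".matches(STRICT_NAME));  // true
        System.out.println("Motel 6".matches(STRICT_NAME));      // false
        System.out.println("\n\f\f\t".matches(STRICT_NAME));     // true: whitespace alone still passes
    }
}
```

The last line shows that tightening the character class does not fix everything: because each word uses *, a string made only of whitespace is still accepted.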

Author by

Nicholas Lie

Updated on March 09, 2020

Comments

  • Nicholas Lie, about 4 years ago

    How can I validate regex for full name? I only want alphabets (no numericals) and only spaces for the regex. This is what I have done so far. Would you please help me fix the regex? Thank you very much

    public static boolean isFullname(String str) {
        boolean isValid = false;
        String expression = "^[a-zA-Z][ ]*$"; //I know this one is wrong for sure >,<
        CharSequence inputStr = str;
        Pattern pattern = Pattern.compile(expression, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(inputStr);
        if (matcher.matches()) {
            isValid = true;
        }
        return isValid;
    }
    
  • paxdiablo, over 12 years ago
    Joe DiMaggio may have something to say about that :-)
  • James Webster, over 12 years ago
    @NullUser, that supports middle names.
  • James Webster, over 12 years ago
    @pax, Mr DiMaggio may well have a problem. I'll revise
  • NullUserException, over 12 years ago
    Why are you using ((\s){1}) instead of just \s?
  • James Webster, over 12 years ago
    Yeah, wasn't sure on that bit. I even mentioned that in my answer.
  • tchrist, over 12 years ago
    Fails on “Pres. William MᶜKinley”
  • tchrist, over 12 years ago
    False positive on the empty string. False negative on “Dominque Strauss‐Kahn”.
  • Aillyn, over 12 years ago
    @tchrist Why do you assume everyone has a name?
  • NullUserException, over 12 years ago
    There's no need to use ^ and $ with .matches()
  • tchrist, over 12 years ago
    Fails on “John Paul Jones”. Fails on “Renée Fleming”. Fails on “Dominque Strauss‐Kahn”. Fails on “King Henry Ⅷ”. Fails on “Tim O’Reilly”. Fails on “Secretary Federico Peña”. Fails on “Motel 6”. Fails on “Cher”. Fails on “Antonio Cipriano José María y Francisco de Santa Ana Machado y Ruiz”. Fails fails fails fails fails fails fails fails fails.
  • tchrist, over 12 years ago
    False positive on “___^^^_^\\\\[[[]][][____”.
  • tchrist, over 12 years ago
    False positive on “\n\f\f\t”.
  • Nicholas Lie, over 12 years ago
    ^ it's really bothering me actually >,< but sadly it's true that “___^^^_^\\\[[[]][][____” can pass
  • Nicholas Lie, over 12 years ago
    (came to the conclusion that full names cannot be validated at all - just thought Chinese,Russian,Japanese,Arabic and other unicodes >,<) Thanks everyone for the enlightenment :)
  • James Oravec, over 8 years ago
    ... I could legally change my name to #&*∫Ω∆ Smith, and that's not going to fit into anyone's algorithm... how about ^.*$ :)