Regex to find email addresses in a String


Solution 1

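The correct code is:

import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Note: {2,4} only accepts top-level domains of 2 to 4 letters.
Pattern p = Pattern.compile("\\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}\\b",
    Pattern.CASE_INSENSITIVE);
Matcher matcher = p.matcher(input);
Set<String> emails = new HashSet<>();  // a Set drops duplicate addresses
while (matcher.find()) {
  emails.add(matcher.group());
}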

This gives you the set of distinct email addresses found in your long text / HTML input.
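
The {2,4} quantifier in the pattern above rejects top-level domains longer than four letters. Here is a minimal sketch of the same loop with that bound relaxed to {2,}. The class name EmailExtractor and the method extractEmails are made up for illustration, and the pattern is still a heuristic rather than a full address parser:

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailExtractor {
    // Same pattern as above, except the TLD may be any length >= 2
    // (assumption: missing long TLDs is worse than the odd over-long match).
    private static final Pattern EMAIL = Pattern.compile(
            "\\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}\\b",
            Pattern.CASE_INSENSITIVE);

    // Returns each distinct address once, in order of first appearance.
    public static Set<String> extractEmails(String input) {
        Set<String> emails = new LinkedHashSet<>();
        Matcher matcher = EMAIL.matcher(input);
        while (matcher.find()) {
            emails.add(matcher.group());
        }
        return emails;
    }

    public static void main(String[] args) {
        String line = "Mail info@example.museum or sales@example.co.uk, "
                + "then info@example.museum again.";
        System.out.println(extractEmails(line));
    }
}
```

A LinkedHashSet is used here only so the addresses come back in the order they appear; a plain HashSet works just as well if order does not matter.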

Solution 2

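You need something like this regex:

".*?(\\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}\\b).*"

When it matches, group 1 holds the first email address in the input. The leading .*? is reluctant on purpose: a greedy .* consumes as much as it can, so the group could start at a later word boundary and capture only the tail of an address (for example doe@example.com out of john.doe@example.com).

String regex = ".*?(\\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}\\b).*";
Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("your text here");
// matches() must cover the whole input; '.' does not cross line breaks by default
if (m.matches()) {
    String email = m.group(1);
    // do something with your email
}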
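
If only the first address on a line is needed, the pattern does not have to be wrapped in .* at all: a single call to Matcher.find() stops at the first match. A minimal sketch of that approach, with hypothetical class and method names (FirstEmail, firstEmail):

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstEmail {
    private static final Pattern EMAIL = Pattern.compile(
            "\\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}\\b",
            Pattern.CASE_INSENSITIVE);

    // Returns the first address found in the line, or empty if there is none.
    public static Optional<String> firstEmail(String line) {
        Matcher m = EMAIL.matcher(line);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstEmail("Write to john.doe@example.com or jane@example.org"));
        System.out.println(firstEmail("no address on this line"));
    }
}
```

Because find() scans from the start of the input, it also works on multi-line text without needing Pattern.DOTALL.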

Solution 3

This is a simple way to extract all emails from an input String using Android's Patterns.EMAIL_ADDRESS (android.util.Patterns):

    public static List<String> getEmails(@NonNull String input) {
        List<String> emails = new ArrayList<>();
        Matcher matcher = Patterns.EMAIL_ADDRESS.matcher(input);
        while (matcher.find()) {
            int matchStart = matcher.start(0);
            int matchEnd = matcher.end(0);
            emails.add(input.substring(matchStart, matchEnd));
        }
        return emails;
    }
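
The sample lines quoted in the comments carry the address inside a percent-encoded mailto: link (for example %6a%6f%62%73@%72%65%6c%61%79.%65%64%75), which none of the patterns above will match as written. One possible workaround is to decode the %hh escapes before matching. This is a rough sketch with hypothetical names (EncodedEmailExtractor, decodePercent). It assumes the escapes are plain ASCII and uses the plain-Java pattern from Solution 1 rather than Android's Patterns:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EncodedEmailExtractor {
    private static final Pattern PERCENT = Pattern.compile("%([0-9A-Fa-f]{2})");
    private static final Pattern EMAIL = Pattern.compile(
            "\\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}\\b",
            Pattern.CASE_INSENSITIVE);

    // Replaces every %hh escape with the single character it encodes
    // (assumption: escapes are ASCII, as in the sample lines).
    static String decodePercent(String line) {
        Matcher m = PERCENT.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Decodes the line first, then collects every match in order.
    public static List<String> getEmails(String line) {
        List<String> emails = new ArrayList<>();
        Matcher m = EMAIL.matcher(decodePercent(line));
        while (m.find()) {
            emails.add(m.group());
        }
        return emails;
    }

    public static void main(String[] args) {
        String line = "<a href=\"&#109;&#97;&#105;&#108;&#116;&#111;&#58;"
                + "%6a%6f%62%73@%72%65%6c%61%79.%65%64%75\">contact</a>";
        System.out.println(getEmails(line));
    }
}
```

The sketch leaves the HTML entities (&#109; and so on) untouched. That is enough here because the address itself is only percent-encoded. A line that is not HTML but happens to contain text like "%20" would be altered by the decoding step, so treat this as a starting point only.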
Author: Neeraj

Updated on March 28, 2020

Comments

  • Neeraj
    Neeraj about 4 years

    My intention is to get email addresses from a web page. I have the page source and I am reading it line by line. I want to extract the email address from the line I am currently reading, which may or may not contain one. I have seen a lot of regexp examples, but most of them are for validating an email address. I want to extract addresses from the page source, not validate them. It should work the way http://emailx.discoveryvip.com/ does.

    Some example input lines are:

    1)<p>Send details to <a href="&#109;&#97;&#105;&#108;&#116;&#111;&#58;%72%65%62%65%6b%61%68@%68%61%63%6b%73%75%72%66%65%72.%63%6f%6d">[email protected]</a></p>
    
    2)<p>Interested should send details directly to <a href="http://www.abcdef.com/abcdef/">www.abcdef.com/abcdef/</a>. Should you have any questions, please email <a href="&#109;&#97;&#105;&#108;&#116;&#111;&#58;%6a%6f%62%73@%72%65%6c%61%79.%65%64%75">[email protected]</a>.
    
    3)Note :- Send your queries at  [email protected]  for more details call Mr. neeraj 012345678901.
    

    I want to extract the email address from each of examples 1, 2 and 3. I am using Java and I am not good with regexps. Please help.

  • Stunner
    Stunner over 10 years
    How do I get only the first matched text?
  • Juha Palomäki
    Juha Palomäki about 7 years
    This does not take into account domain names that have more than two parts; for example, in the UK you have addresses like [email protected]. Also, nowadays there are a bunch of new TLDs that are longer than 4 characters.
  • LarsH
    LarsH almost 6 years
    The regexp won't allow lowercase letters, unless you compile it using CASE_INSENSITIVE. As it is, it will not match most email addresses.