Getting the YouTube ID from a link

java regex parsing url youtube


Solution 1

I tried the other answers, but they failed in my case, so I adjusted the regex to fit my URLs:

String pattern = "(?<=watch\\?v=|/videos/|embed\\/)[^#\\&\\?]*";

Pattern compiledPattern = Pattern.compile(pattern);
Matcher matcher = compiledPattern.matcher(url);

if (matcher.find()) {
    return matcher.group();
}

This one works for the following URLs (you could also add a sanity check that the extracted ID is 11 characters long):

http://www.youtube.com/embed/Woq5iX9XQhA?html5=1

http://www.youtube.com/watch?v=384IUU43bfQ

http://gdata.youtube.com/feeds/api/videos/xTmi7zzUa-M&whatever

It returns, respectively:

Woq5iX9XQhA

384IUU43bfQ

xTmi7zzUa-M
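
The length check mentioned above could look roughly like the sketch below. The class name `YoutubeIdExtractor`, the method name `extractId`, and the choice to return `null` on failure are assumptions made for illustration; the 11-character length is the one suggested in this answer, not something the regex itself guarantees.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is illustrative only.
final class YoutubeIdExtractor {

    // Same lookbehind idea as the pattern above, compiled once and reused.
    private static final Pattern ID_PATTERN =
            Pattern.compile("(?<=watch\\?v=|/videos/|embed/)[^#&?]*");

    // Assumption taken from the answer: a video ID is 11 characters long.
    private static final int ID_LENGTH = 11;

    /** Returns the video ID, or null if no ID of the expected length is found. */
    static String extractId(String url) {
        if (url == null) {
            return null;
        }
        Matcher matcher = ID_PATTERN.matcher(url);
        if (matcher.find()) {
            String id = matcher.group();
            if (id.length() == ID_LENGTH) {
                return id;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(extractId("http://www.youtube.com/embed/Woq5iX9XQhA?html5=1"));
        System.out.println(extractId("http://www.youtube.com/watch?v=384IUU43bfQ"));
        System.out.println(extractId("http://gdata.youtube.com/feeds/api/videos/xTmi7zzUa-M&whatever"));
    }
}
```

Returning `null` rather than an empty string is just one option; it makes "no ID found" hard to confuse with a real value.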

Solution 2

public static String getYoutubeVideoId(String youtubeUrl) {
    String videoId = "";
    if (youtubeUrl != null && youtubeUrl.trim().length() > 0 && youtubeUrl.startsWith("http")) {
        // Ported from the JavaScript regex:
        // /^.*((youtu.be\/)|(v\/)|(\/u\/\w\/)|(embed\/)|(watch\?))\??v?=?([^#\&\?]*).*/
        String expression = "^.*((youtu.be\\/)|(v\\/)|(\\/u\\/\\w\\/)|(embed\\/)|(watch\\?))\\??v?=?([^#\\&\\?]*).*";
        Pattern pattern = Pattern.compile(expression, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(youtubeUrl);
        if (matcher.matches()) {
            // Group 7 is the ID; the outer alternation and its five branches take groups 1-6.
            String candidate = matcher.group(7);
            // Accept only IDs of the expected 11-character length.
            if (candidate != null && candidate.length() == 11) {
                videoId = candidate;
            }
        }
    }
    return videoId;
}

Solution 3

This regex would do the trick:

(?<=videos\/|v=)([\w-]+)

This means that we first look for videos/ or v=, then capture all the following characters that are word characters (letters, digits, and underscores) or hyphens.

Example in java:

public static void main(String[] args) {

    String link = "http://gdata.youtube.com/feeds/api/videos/xTmi7zzUa-M&whatever";
    String pattern = "(?<=videos\\/|v=)([\\w-]+)";

    Pattern compiledPattern = Pattern.compile(pattern);
    Matcher matcher = compiledPattern.matcher(link);

    if(matcher.find()){
        System.out.println(matcher.group());
    }
}

Output:

xTmi7zzUa-M
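
To illustrate why the lookbehind matters when calling `matcher.group()` without an index, here is a small sketch comparing it with a non-capturing group on the same link. The class and method names (`LookbehindVsGroup`, `wholeMatch`, `firstGroup`) are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing the two regex styles.
final class LookbehindVsGroup {

    /** Returns the whole match (matcher.group()), or null if nothing matches. */
    static String wholeMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    /** Returns the first capturing group (matcher.group(1)), or null if nothing matches. */
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String link = "http://gdata.youtube.com/feeds/api/videos/xTmi7zzUa-M&whatever";
        String lookbehind = "(?<=videos/|v=)([\\w-]+)";
        String nonCapturing = "(?:videos/|v=)([\\w-]+)";

        // The lookbehind is not part of the match, so group() is the ID alone.
        System.out.println(wholeMatch(lookbehind, link));   // xTmi7zzUa-M

        // The non-capturing group is consumed, so group() includes the prefix...
        System.out.println(wholeMatch(nonCapturing, link)); // videos/xTmi7zzUa-M

        // ...and the ID has to be read from group(1) instead.
        System.out.println(firstGroup(nonCapturing, link)); // xTmi7zzUa-M
    }
}
```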

Solution 4

Got a better solution from this link.

Use the following method to get the videoId from the link.

YouTubeHelper.java

import com.google.inject.Singleton; 

import java.util.regex.Matcher;
import java.util.regex.Pattern;

@Singleton 
public class YouTubeHelper { 

    final String youTubeUrlRegEx = "^(https?)?(://)?(www\\.)?(m\\.)?((youtube\\.com)|(youtu\\.be))/";
    final String[] videoIdRegex = { "\\?vi?=([^&]*)", "watch\\?.*v=([^&]*)", "(?:embed|vi?)/([^/?]*)", "^([A-Za-z0-9_\\-]*)" };

    public String extractVideoIdFromUrl(String url) {
        String youTubeLinkWithoutProtocolAndDomain = youTubeLinkWithoutProtocolAndDomain(url);

        for(String regex : videoIdRegex) {
            Pattern compiledPattern = Pattern.compile(regex);
            Matcher matcher = compiledPattern.matcher(youTubeLinkWithoutProtocolAndDomain);

            if(matcher.find()){
                return matcher.group(1);
            } 
        } 

        return null; 
    } 

    private String youTubeLinkWithoutProtocolAndDomain(String url) {
        Pattern compiledPattern = Pattern.compile(youTubeUrlRegEx);
        Matcher matcher = compiledPattern.matcher(url);

        if(matcher.find()){
            return url.replace(matcher.group(), "");
        } 
        return url;
    } 
}

Hope this helps.

Solution 5

That pattern worked for me:

"http(?:s?)://(?:www\.)?youtu(?:be\.com/watch\?v=|\.be/)([\w\-]+)(&(amp;)?[\w\?=]*)?"

source: Regular expression for youtube links
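
Since the question is about Java, note that the backslashes in this pattern have to be doubled inside a Java string literal. A minimal sketch of how it might be used follows; the class name `ShortLinkPattern` and the method name `extractId` are assumptions for illustration, and the ID is read from the first capturing group.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern from this answer.
final class ShortLinkPattern {

    // The same pattern as above, with every backslash doubled for Java.
    private static final Pattern YOUTUBE_LINK = Pattern.compile(
            "http(?:s?)://(?:www\\.)?youtu(?:be\\.com/watch\\?v=|\\.be/)([\\w\\-]+)(&(amp;)?[\\w\\?=]*)?");

    /** Returns the ID from a watch?v= or youtu.be link, or null if the link does not match. */
    static String extractId(String url) {
        Matcher matcher = YOUTUBE_LINK.matcher(url);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractId("https://www.youtube.com/watch?v=384IUU43bfQ&feature=share"));
        System.out.println(extractId("http://youtu.be/Woq5iX9XQhA"));
        // This pattern does not cover /embed/ links, so this call returns null.
        System.out.println(extractId("http://www.youtube.com/embed/Woq5iX9XQhA"));
    }
}
```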



Peril

Updated on May 30, 2020

Comments

Peril almost 4 years
I have this code to get the YouTube ID from links like www.youtube.com/watch?v=xxxxxxx:
```
  URL youtubeURL = new URL(link);
  youtubeURL.getQuery();
```
Basically, this easily gets me the ID in the form v=xxxxxxxx.

But I noticed that YouTube links sometimes look like this:
```
http://gdata.youtube.com/feeds/api/videos/xxxxxx
```
I am getting the links from a feed, so do I need to build a regex for that, or is there a parser that can do it for me?
- ridgerunner over 12 years
  
  You may want to look at my answer to a very similar question. It extracts the video-id from a variety of YouTube URL formats.
- Peril over 12 years
  
  @ridgerunner thanks but it misses the gdata links
- ridgerunner over 12 years
  
  Thanks for pointing that out. I've updated my YouTube ID matching expression so that it now correctly matches your gdata subdomain example.
Peril over 12 years

This will not get me only the ID.
Marcus over 12 years

Using http://gdata.youtube.com/feeds/api/videos/xxxxxx as input, I get xTmi7zzUa-M as output. Did I misread your question, and you were asking for a regex that would let you parse both v=xxxxxxxx and the other one?
Marcus over 12 years

Well actually. That regex works in both cases. What result do you get?
Peril over 12 years

But the regex you wrote gets me the ID plus other stuff; check here: gskinner.com/RegExr
Derek Springer over 12 years

It's the '$' at the end that makes it capture everything beyond .../videos/. It wouldn't work in the first example if there were any other parameters on the link (i.e. v=xxxxxxx&other_param=yyyyy...)
Marcus over 12 years

Salah, have a look at my updated answer now. @Derek you're right about that, though since that wasn't specified in the question, I didn't take it into account.
Code Jockey over 12 years

A lookbehind would be better in certain cases than a non-capturing group. If used instead, the entire match (that is matcher.group()) would be the id the asker is looking for, rather than the first capturing group (matcher.group(1))
Marcus over 12 years

@Jockey you're right about that. I replaced my non capturing group with a positive lookbehind. Ty
Pkmmte almost 11 years

Thank you soooooooooooooooo much!!
kritzikratzi about 10 years

it's not pretty, but here you go: (?<=watch\?v=|/videos/|embed\/|youtu.be\/|\/v\/|watch\?v%3D|%2Fvideos%2F|embed%2F|youtu.be%2F|%2Fv%2F)[^#\&\?\n]*
Code Jockey over 9 years

@krishan indeed not (that wasn't even a valid URL when this question was asked, to my knowledge...) but if you're interested, this should work for the youtu.be as well as /embed/ links: (?<=watch\?v=|/videos/|/embed/|youtu.be/)[^&#?]* (I'll update my answer) - please let me know if it doesn't, as I made that up just now by hand...
Er KK Chopra over 9 years

I found a solution for this in my question: stackoverflow.com/questions/25718304/…
JPM over 9 years

@Peril This should be the answer; please mark it so.