How can I extract video ID from YouTube's link in Python?
20,212
Solution 1
Python has a library for parsing URLs.
import urlparse
url_data = urlparse.urlparse("http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1")
query = urlparse.parse_qs(url_data.query)
video = query["v"][0]
Solution 2
I've created youtube id parser without regexp:
import urlparse
def video_id(value):
"""
Examples:
- http://youtu.be/SA2iWivDJiE
- http://www.youtube.com/watch?v=_oPAwA_Udwc&feature=feedu
- http://www.youtube.com/embed/SA2iWivDJiE
- http://www.youtube.com/v/SA2iWivDJiE?version=3&hl=en_US
"""
query = urlparse.urlparse(value)
if query.hostname == 'youtu.be':
return query.path[1:]
if query.hostname in ('www.youtube.com', 'youtube.com'):
if query.path == '/watch':
p = urlparse.parse_qs(query.query)
return p['v'][0]
if query.path[:7] == '/embed/':
return query.path.split('/')[2]
if query.path[:3] == '/v/':
return query.path.split('/')[2]
# fail?
return None
Solution 3
This is the Python3 version of Mikhail Kashkin's solution with added scenarios.
from urllib.parse import urlparse, parse_qs
from contextlib import suppress
# noinspection PyTypeChecker
def get_yt_id(url, ignore_playlist=False):
# Examples:
# - http://youtu.be/SA2iWivDJiE
# - http://www.youtube.com/watch?v=_oPAwA_Udwc&feature=feedu
# - http://www.youtube.com/embed/SA2iWivDJiE
# - http://www.youtube.com/v/SA2iWivDJiE?version=3&hl=en_US
query = urlparse(url)
if query.hostname == 'youtu.be': return query.path[1:]
if query.hostname in {'www.youtube.com', 'youtube.com', 'music.youtube.com'}:
if not ignore_playlist:
# use case: get playlist id not current video in playlist
with suppress(KeyError):
return parse_qs(query.query)['list'][0]
if query.path == '/watch': return parse_qs(query.query)['v'][0]
if query.path[:7] == '/watch/': return query.path.split('/')[1]
if query.path[:7] == '/embed/': return query.path.split('/')[2]
if query.path[:3] == '/v/': return query.path.split('/')[2]
# returns None for invalid YouTube url
Solution 4
Here is RegExp it cover these cases
((?<=(v|V)/)|(?<=be/)|(?<=(\?|\&)v=)|(?<=embed/))([\w-]+)
Solution 5
match = re.search(r"youtube\.com/.*v=([^&]*)", "http://www.youtube.com/watch?v=z_AbfPXTKms&test=123")
if match:
result = match.group(1)
else:
result = ""
Untested.
Author by
decarbo
Updated on January 05, 2022Comments
-
decarbo over 2 years
I know this can be easily done using PHP's
parse_url
andparse_str
functions:$subject = "http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1"; $url = parse_url($subject); parse_str($url['query'], $query); var_dump($query);
But how to achieve this using Python? I can do
urlparse
but what next?