python: get all youtube video urls of a channel

Solution 1

Increase max-results from 1 to however many you want, but beware they don't advise grabbing too many in one call and will limit you at 50 (https://developers.google.com/youtube/2.0/developers_guide_protocol_api_query_parameters).

Instead you could consider grabbing the data down in batches of 25, say, by changing the start-index until none came back.
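That batching idea can be sketched API-agnostically; `fetch_page` below is a hypothetical stand-in for whatever call returns one page of results:

```python
def fetch_all(fetch_page, page_size=25):
    """Collect items in batches until a short page signals the end.

    fetch_page(start_index, max_results) is a placeholder for the real API
    call; it should return a list of at most max_results items.
    """
    items = []
    start = 1  # GData's start-index was 1-based
    while True:
        batch = fetch_page(start, page_size)
        items.extend(batch)
        if len(batch) < page_size:  # short (or empty) page: nothing left
            break
        start += page_size
    return items
```

Note that a channel whose video count is an exact multiple of `page_size` simply produces one final empty page, which also ends the loop.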

EDIT: Here's the code for how I would do it

import urllib, json
author = 'Youtube_Username'
foundAll = False
ind = 1
videos = []
while not foundAll:
    inp = urllib.urlopen(r'http://gdata.youtube.com/feeds/api/videos?start-index={0}&max-results=50&alt=json&orderby=published&author={1}'.format( ind, author ) )
    try:
        resp = json.load(inp)
        inp.close()
        returnedVideos = resp['feed']['entry']
        for video in returnedVideos:
            videos.append( video ) 
        ind += 50
        print len( videos )
        if ( len( returnedVideos ) < 50 ):
            foundAll = True
    except:
        #catch the case where the number of videos in the channel is a multiple of 50
        print "error"
        foundAll = True
for video in videos:
    print video['title'] # video title
    print video['link'][0]['href'] #url

Solution 2

After the youtube API change, max k.'s answer does not work. As a replacement, the function below provides a list of the youtube videos in a given channel. Please note that you need an API Key for it to work.

import urllib
import json
def get_all_video_in_channel(channel_id):
    api_key = 'YOUR_API_KEY'  # replace with your own API key
    base_video_url = 'https://www.youtube.com/watch?v='
    base_search_url = 'https://www.googleapis.com/youtube/v3/search?'
    first_url = base_search_url+'key={}&channelId={}&part=snippet,id&order=date&maxResults=25'.format(api_key, channel_id)
    video_links = []
    url = first_url
    while True:
        inp = urllib.urlopen(url)
        resp = json.load(inp)
        for i in resp['items']:
            if i['id']['kind'] == "youtube#video":
                video_links.append(base_video_url + i['id']['videoId'])
        try:
            next_page_token = resp['nextPageToken']
            url = first_url + '&pageToken={}'.format(next_page_token)
        except KeyError:  # no nextPageToken means this was the last page
            break
    return video_links
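The loop above uses Python 2's `urllib.urlopen`. A rough Python 3 equivalent (a sketch, not tested against the live API; `build_search_url` and `get_all_video_in_channel_py3` are names introduced here, and you still need your own API key):

```python
import json
import urllib.parse
import urllib.request


def build_search_url(api_key, channel_id, page_token=None):
    # Build the search.list URL; split out so it can be checked offline.
    params = {
        'key': api_key,
        'channelId': channel_id,
        'part': 'snippet,id',
        'order': 'date',
        'maxResults': 50,  # API maximum per page
    }
    if page_token:
        params['pageToken'] = page_token
    return ('https://www.googleapis.com/youtube/v3/search?'
            + urllib.parse.urlencode(params))


def get_all_video_in_channel_py3(api_key, channel_id):
    video_links, page_token = [], None
    while True:
        url = build_search_url(api_key, channel_id, page_token)
        with urllib.request.urlopen(url) as inp:
            resp = json.load(inp)
        for item in resp['items']:
            if item['id']['kind'] == 'youtube#video':
                video_links.append('https://www.youtube.com/watch?v='
                                   + item['id']['videoId'])
        page_token = resp.get('nextPageToken')  # absent on the last page
        if not page_token:
            break
    return video_links
```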

Solution 3

Short answer:

Here's a library that can help with that.

pip install scrapetube

import scrapetube
videos = scrapetube.get_channel("UC9-y-6csu5WGm29I7JiwpnA")
for video in videos:
    print(video['videoId'])

Long answer:

The module mentioned above was created by me due to a lack of any other solutions. Here's what I tried:

  1. Selenium. It worked, but had three big drawbacks: it requires a web browser and driver to be installed, it has big CPU and memory requirements, and it can't handle big channels.
  2. Using youtube-dl, like this:

     import youtube_dl
     youtube_dl_options = {
         'skip_download': True,
         'ignoreerrors': True
     }
     with youtube_dl.YoutubeDL(youtube_dl_options) as ydl:
         videos = ydl.extract_info(f'https://www.youtube.com/channel/{channel_id}/videos')

This also works for small channels, but for bigger ones I would get blocked by YouTube for making so many requests in such a short time (because youtube-dl fetches more info for every video in the channel).
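As an aside (not something the original answer tried), youtube-dl also has an `extract_flat` option that lists playlist entries without fetching full metadata per video, which may cut down the number of requests. A hypothetical sketch:

```python
# Flat extraction: list channel entries without per-video metadata requests.
FLAT_OPTIONS = {
    'skip_download': True,
    'ignoreerrors': True,
    'extract_flat': 'in_playlist',  # don't resolve each entry individually
}


def list_channel_entries(channel_id):
    """Return entry dicts for a channel's /videos page (sketch)."""
    import youtube_dl  # imported lazily; assumes youtube-dl is installed
    with youtube_dl.YoutubeDL(FLAT_OPTIONS) as ydl:
        info = ydl.extract_info(
            'https://www.youtube.com/channel/{}/videos'.format(channel_id),
            download=False)
    return info.get('entries', [])
```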

So I made the library scrapetube, which uses the web API to get all the videos.

Solution 4

Based on the code found here and in some other places, I've written a small script that does this. My script uses v3 of YouTube's API and does not run into the 500-result limit that Google has set for searches.

The code is available over at GitHub: https://github.com/dsebastien/youtubeChannelVideosFinder

Solution 5

An independent way of doing things: no API, no rate limit.

import requests
username = "marquesbrownlee"
url = "https://www.youtube.com/user/" + username + "/videos"
page = requests.get(url).content
data = str(page).split(' ')
item = 'href="/watch?'
vids = [line.replace('href="', 'youtube.com') for line in data if item in line]  # list of all videos, each listed twice
print(vids[0])  # the latest video

Note that the code above scrapes only a limited number of video URLs (at most around 60, since only the initially rendered page is fetched), and each video appears twice in the list; it does not return every video URL in the channel.
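For what it's worth, the same page content can be parsed a bit more robustly with a regex, de-duplicating IDs while preserving order (a sketch; `extract_video_links` is a name introduced here, and YouTube's HTML can change at any time):

```python
import re


def extract_video_links(html):
    # Pull 11-character video IDs out of /watch?v=... links and de-duplicate.
    ids = re.findall(r'/watch\?v=([\w-]{11})', html)
    seen, links = set(), []
    for vid in ids:
        if vid not in seen:
            seen.add(vid)
            links.append('https://www.youtube.com/watch?v=' + vid)
    return links
```

Feed it the page content from `requests.get(url).text`; it still only sees the videos present in the initial HTML, so the roughly-60-video cap remains.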

Author: Johnny

Updated on March 21, 2022

Comments

  • Johnny
    Johnny 9 months

    I want to get all video URLs of a specific channel. I think JSON with Python or Java would be a good choice. I can get the newest video with the following code, but how can I get ALL video links (>500)?

    import urllib, json
    author = 'Youtube_Username'
    inp = urllib.urlopen(r'http://gdata.youtube.com/feeds/api/videos?max-results=1&alt=json&orderby=published&author=' + author)
    resp = json.load(inp)
    inp.close()
    first = resp['feed']['entry'][0]
    print first['title'] # video title
    print first['link'][0]['href'] #url
    
  • Francesco Frassinelli
    Francesco Frassinelli almost 10 years
    Good answer, but it would be better to use something like "except SpecificError" rather than a generic exception: if there are other problems with the JSON load or with the response parsing, this kind of code will hide them.
  • max k.
    max k. almost 10 years
    Good point, if the poster decides to use it then definitely a good idea to do some research and find the specific error
  • Cătălin George Feștilă
    Cătălin George Feștilă almost 9 years
    If you remove print len( videos ) then you will get an error, so I think that needs to be fixed.
  • Muhamed Huseinbašić
    Muhamed Huseinbašić about 8 years
    @CatalinFestila That's not true in my case. I can remove each print (including len(videos)) and it will work. Check other things and try again.
  • Dap
    Dap over 7 years
    I believe that this feature is now deprecated according to this response youtube.com/devicesupport
  • Jabba
    Jabba over 7 years
    Thanks for this. Combined with pafy you can fetch all videos on a channel.
  • Arjun Bhandari
    Arjun Bhandari about 7 years
    This did not work for the PyCon 2015 channel or even the example mentioned on the git; it just says channel not found. Am I doing something wrong?
  • Kerem
    Kerem over 4 years
    this is a simple and accurate answer as I cannot find it in the Python API reference.
  • Ebram Shehata
    Ebram Shehata about 4 years
    No longer available.
  • volvox
    volvox over 3 years
    I got quite a lot of errors from using this. Admittedly my channel name appears to have a space in it, which caused trouble on the CLI, and the tool doesn't take the ID instead; it searched back through 5 years, found no videos, and I've got 410 on the channel.
  • dSebastien
    dSebastien about 3 years
    FYI I don't have time to maintain that project, but if anyone is interested, don't hesitate to go and fix it, I'll happily merge any improvements ;-)
  • Gautam Shahi
    Gautam Shahi over 2 years
    @Stian It gives an error: HTTPError: HTTP Error 403: Forbidden
  • smcs
    smcs about 2 years
    For Python 3: import urllib.request, change inp = urllib.urlopen(url) to inp = urllib.request.urlopen(url,timeout=1)
  • Admin
    Admin about 2 years
    @smcs it's not working. urllib.error.HTTPError: HTTP Error 403: Forbidden
  • smcs
    smcs about 2 years
    @rtt0012 What URL are you trying?
  • Admin
    Admin about 2 years
    @smcs I copied your code and added my API key; the rest I didn't change. I wanted to look up this channel: youtube.com/c/3blue1brown/videos. I ran the code by executing get_all_video_in_channel(UCYO_jab_esuFRV4b17AJtAw). The channel ID I found via commentpicker.com/youtube-channel-id.php. The error message reads: urllib.error.HTTPError: HTTP Error 403: Forbidden
  • smcs
    smcs about 2 years
    @rtt0012 It works for me with that site. Are you passing a string to the method, i.e. get_all_video_in_channel("UCYO_jab_esuFRV4b17AJtAw")?
  • Admin
    Admin about 2 years
    @smcs I typed in my code correctly. When copying the text I had forgotten the quotation marks. My API key has no restrictions. I still get the same error message. I paste the error message as follows...
  • Admin
    Admin about 2 years
    @smcs
      File "C:\Py38\lib\urllib\request.py", line 222, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Py38\lib\urllib\request.py", line 531, in open
        response = meth(req, response)
      File "C:\Py38\lib\urllib\request.py", line 640, in http_response
        response = self.parent.error(
      File "C:\Py38\lib\urllib\request.py", line 569, in error
        return self._call_chain(*args)
      File "C:\Py38\lib\urllib\request.py", line 502, in _call_chain
        result = func(*args)
      File "C:\Py38\lib\urllib\request.py", line 649, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
  • Admin
    Admin about 2 years
    @smcs The last line of error message reads: urllib.error.HTTPError: HTTP Error 403: Forbidden
  • smcs
    smcs about 2 years
    @rtt0012 You should open a question on codereview.stackexchange.com
  • Aekanshu
    Aekanshu over 1 year
    Very good solution. Also, if someone wants to get the video URL instead of the ID, you can use print("https://www.youtube.com/watch?v=" + str(video['videoId'])) in place of print(video['videoId']).