Python urllib2 Progress Hook
Solution 1
Here's a fully working example that builds on Anurag's approach of chunking in a response. My version allows you to set the the chunk size, and attach an arbitrary reporting function:
import urllib2, sys
def chunk_report(bytes_so_far, chunk_size, total_size):
percent = float(bytes_so_far) / total_size
percent = round(percent*100, 2)
sys.stdout.write("Downloaded %d of %d bytes (%0.2f%%)\r" %
(bytes_so_far, total_size, percent))
if bytes_so_far >= total_size:
sys.stdout.write('\n')
def chunk_read(response, chunk_size=8192, report_hook=None):
total_size = response.info().getheader('Content-Length').strip()
total_size = int(total_size)
bytes_so_far = 0
while 1:
chunk = response.read(chunk_size)
bytes_so_far += len(chunk)
if not chunk:
break
if report_hook:
report_hook(bytes_so_far, chunk_size, total_size)
return bytes_so_far
if __name__ == '__main__':
response = urllib2.urlopen('http://www.ebay.com');
chunk_read(response, report_hook=chunk_report)
Solution 2
Why not just read data in chunks and do whatever you want to do in between, e.g. run in a thread, hook into a UI, etc etc
import urllib2
urlfile = urllib2.urlopen("http://www.google.com")
data_list = []
chunk = 4096
while 1:
data = urlfile.read(chunk)
if not data:
print "done."
break
data_list.append(data)
print "Read %s bytes"%len(data)
output:
Read 4096 bytes
Read 3113 bytes
done.
Solution 3
urlgrabber has built-in support for progress notification.
speedplane
Docket Alarm indexes hundreds of millions of lawsuits, analyzes what happens in them, and predicts what will happen in future cases. We package this intelligence up into a SaaS service and sell it to lawyers at top law firms. To analyze the law, we deal with lots of interesting NLP and unstructured data problems, and to tackle them, we use some of the latest machine-learning tools. We're always looking for skilled developers with a passion in the intersection of law and technology. If that's you, please get in touch.
Updated on June 17, 2020Comments
-
speedplane almost 4 years
I am trying to create a download progress bar in python using the urllib2 http client. I've looked through the API (and on google) and it seems that urllib2 does not allow you to register progress hooks. However the older deprecated urllib does have this functionality.
Does anyone know how to create a progress bar or reporting hook using urllib2? Or are there some other hacks to get similar functionality?