Downloading contents of the web page


If I understood your question, the following script should be what you want:

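#!/usr/bin/env python3

import os
import re
import urllib.request

# Fetch the page and decode it so the regex runs on text rather than bytes.
# UTF-8 is an assumption here; undecodable bytes are replaced, not fatal.
with urllib.request.urlopen("http://www.adobe.com/support/security/") as response:
    page = response.read().decode("utf-8", errors="replace")

# Each match is a (href, link text) tuple for one anchor tag.
links = re.findall(r"<a.*?\s*href=\"(.*?)\".*?>(.*?)</a>", page)

# Write one href per line to the file 'content'.
with open("content", "w") as file_handle:
    for link in links:
        file_handle.write("%s\n" % link[0])

# Take the first three bulletin links, prefix the host, and save them to 'content1'.
os.system("grep -i '/support/security/bulletins/' content 2>/dev/null | head -n 3 | uniq | sed -e 's|^|http://www.adobe.com|' > content1")
# Download every URL listed in 'content1'.
os.system("wget -i content1")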
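
If you would rather not shell out to grep, head, uniq and sed, the same filtering can probably be done in Python itself. The following is only a sketch under a few assumptions: the function name select_bulletin_urls and the sample hrefs are made up for illustration, and the steps are meant to mirror the shell pipeline above (case-insensitive match, first three lines, drop adjacent duplicates, prefix the host).

```python
BASE_URL = "http://www.adobe.com"  # host prefix used in the sed step
BULLETIN_PATH = "/support/security/bulletins/"  # pattern used in the grep step


def select_bulletin_urls(hrefs, limit=3):
    """Hypothetical helper mirroring: grep -i | head -n 3 | uniq | sed."""
    # grep -i: keep hrefs that contain the bulletin path, ignoring case.
    matches = [h for h in hrefs if BULLETIN_PATH in h.lower()]
    # head -n: keep only the first `limit` matches.
    matches = matches[:limit]
    # uniq: drop an entry only when it repeats the entry directly before it.
    deduped = [h for i, h in enumerate(matches) if i == 0 or h != matches[i - 1]]
    # sed: prefix each remaining path with the host.
    return [BASE_URL + h for h in deduped]


if __name__ == "__main__":
    # Made-up hrefs, standing in for link[0] values from the script above.
    sample = [
        "/products/index.html",
        "/support/security/bulletins/example-1.html",
        "/support/security/bulletins/example-1.html",
        "/support/security/bulletins/example-2.html",
    ]
    for url in select_bulletin_urls(sample):
        print(url)
```

In the script you would pass it [link[0] for link in links] and write the result to content1, one URL per line, before calling wget -i content1.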


Author by

Naresh

Updated on September 18, 2022

Comments

  • Naresh
    Naresh over 1 year

    I am using TypeScript 1.7.5 and the latest jQuery type definition. The following call to $.getJSON() fails with "error TS2346: Supplied parameters do not match any signature of call target"

    let url: string = api + '/orgs/' + orgname + '/repos?per_page=100';
    $.getJSON(url, function(repos: Repo[]) {
        ...
    });
    

    Repo is defined as:

    export interface Repo {
        name: string;
        stargazers_count: number;
        forks_count: number;
    }
    

    The type definition for getJSON() is:

    getJSON(url: string, success?: (data: any, textStatus: string, jqXHR: JQueryXHR) => any): JQueryXHR;
    

    What am I missing?

    Update

    I found that the error is really coming from chaining a call to error(), which is perfectly legal in regular jQuery. If I remove this call to error() the error goes away. Any idea how I could handle the error from getJSON() in TypeScript?

    interface Repo {
        name: string;
        stargazers_count: number;
        forks_count: number;
    }
    var url = "/echo/json/";
    
    $.getJSON(url, (data: any, textStatus: string, jqXHR: JQueryXHR) => {
        var repos: Repo[] = data;
        //...
        alert(JSON.stringify(repos));
    })
    .error(function() {
        callback([]);
    });
    
  • Radu Rădeanu
    Radu Rădeanu over 10 years
    @Kummi_10 Can you clarify? First you say "each html file in another file", and then you say "ALL PAGES CONTENT IN ONE FILE". This is contradictory.
  • Radu Rădeanu
    Radu Rădeanu over 10 years
    @Kummi_10 You mean one directory? Then use wget -P directory -i content1. See man wget for more info.
  • Naresh
    Naresh over 8 years
    That's much better than what I had, but it is still giving the same error :-(
  • TSV
    TSV over 8 years
    I've created a jsfiddle (jsfiddle.net/q2bnhden); it compiles and works like a charm for me.
  • Naresh
    Naresh over 8 years
    Thanks, @TSV. Still no success. I am using tsc through an npm script. Must be something wrong with my setup. Still checking...
  • Naresh
    Naresh over 8 years
    Found it! The error is really coming from chaining a call to error(). Please see my update. Can you think of a solution?
  • Naresh
    Naresh over 8 years
    Awesome! Thanks for all your help, @TSV.