How can I scrape specific data from a website

12,416

Solution 1

You can use iMacros to automate this task. It is a tool to automate repetative tasks, it can be used to extract data from a site and save it as a CSV file.

As x in example.com/x follows a pattern (i.e. ascending), you can easily program (even without programming exeprience) it to visit each page and extract data from it.

Read more at iMacros Tutorial.

Alternatively you can also try ScraperWiki (some programming experience required).

Solution 2

You can paste this code in a module:

Option Explicit
Sub get_data()
Dim result As String
Dim myURL As String
Dim winHttpReq As Object
Set winHttpReq = CreateObject("WinHttp.WinHttpRequest.5.1")
myURL = "example.com/x format"

winHttpReq.Open "GET", myURL, False
winHttpReq.Send
result = winHttpReq.responseText
Application.ScreenUpdating = True

Range("Sheet1!A1").value = result
End sub

You can simply parse the long string in cell A1 to extract the data you need.

Share:
12,416

Related videos on Youtube

Stoney
Author by

Stoney

Updated on September 18, 2022

Comments

  • Stoney
    Stoney over 1 year

    I'm trying to scrape data from a website for research.

    The urls are nicely organized in an example.com/x format, with x as an ascending number and all of the pages are structured in the same way. I just need to grab certain headings and a few numbers which are always in the same locations. I'll then need to get this data into structured form for analysis in Excel.

    I have used wget before to download pages, but I can't figure out how to grab specific lines of text.

    Excel has a feature to grab data from the web (Data->From Web) but from what I can see it only allows me to download tables. Unfortunately, the data I need is not in tables.

    • Stoney
      Stoney over 11 years
      In the end, I created a .txt file with a list of all the urls I needed and had wget download all the pages in that file. I then used iMacros as suggested by @Lamb to extract data locally. I found the trial of the fulll version easiest to start out with. Some useful beginner features are not available in the firefox plugin gui, even though the same code will work.
  • Stoney
    Stoney over 11 years
    Thank you! iMacros appears to be the solution I need. After asking the original question, I had found ScraperWiki. It appears to be a powerful tool, but is a bit complicated programmingwise for me.
  • Baarn
    Baarn over 11 years
    It would help if you explain what the code does.