What are the alternatives now that the Google web search API has been deprecated?

Solution 1

You could just send the queries the way a browser does and then parse the HTML. That is what I have always done, even for things like YouTube.
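
A minimal sketch of the "parse the HTML" step, using only Python's standard library. The sample markup and the names LinkExtractor, extract_links and SAMPLE_PAGE are hypothetical; a real results page has different markup that changes without notice, and the comments below discuss whether fetching it automatically is permitted.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return all (href, text) pairs found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical markup standing in for a page that was already downloaded.
SAMPLE_PAGE = """
<div class="result"><a href="https://example.com/a">First <b>hit</b></a></div>
<div class="result"><a href="https://example.org/b">Second hit</a></div>
"""

for href, text in extract_links(SAMPLE_PAGE):
    print(href, text)
```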

Solution 2

Yes, Google Custom Search has now replaced the old Search API, but you can still use Google Custom Search to search the entire web, although the steps are not obvious from the Custom Search setup.

To create a Google Custom Search engine that searches the entire web:

  1. From the Google Custom Search homepage ( http://www.google.com/cse/ ), click Create a Custom Search Engine.
  2. Type a name and description for your search engine.
  3. Under Define your search engine, in the Sites to Search box, enter at least one valid URL. (For now, just put www.anyurl.com to get past this screen; it is deleted again in step 10.)
  4. Select the CSE edition you want and accept the Terms of Service, then click Next. Select the layout option you want, and then click Next.
  5. Click any of the links under the Next steps section to navigate to your Control panel.
  6. In the left-hand menu, under Control Panel, click Basics.
  7. In the Search Preferences section, select Search the entire web but emphasize included sites.
  8. Click Save Changes.
  9. In the left-hand menu, under Control Panel, click Sites.
  10. Delete the site you entered during the initial setup process.

Now your custom search engine will search the entire web.
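
Several commenters ask how to call the engine from code. The following is a rough sketch, assuming the Custom Search JSON API endpoint and its key, cx, q, start and num parameters as described in Google's documentation; the function names are made up for illustration, and the API key and engine ID are placeholders you would obtain from your own account.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint of the Custom Search JSON API (assumed from Google's docs).
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, engine_id, query, start=1, num=10):
    """Build the request URL; start is the 1-based index of the first result."""
    params = {"key": api_key, "cx": engine_id, "q": query,
              "start": start, "num": num}
    return API_ENDPOINT + "?" + urlencode(params)


def parse_results(payload):
    """Return (title, link) pairs from a decoded JSON response."""
    return [(item.get("title"), item.get("link"))
            for item in payload.get("items", [])]


def search(api_key, engine_id, query, start=1):
    """Perform one live request (needs network access and valid credentials)."""
    with urlopen(build_search_url(api_key, engine_id, query, start)) as response:
        return parse_results(json.load(response))
```

Calling search("YOUR_API_KEY", "YOUR_ENGINE_ID", "some query") would return the first page of results; as the comments below note, each request returns at most 10 results, so later pages are fetched by raising start (11, 21, and so on).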

Pricing

  • Google Custom Search gives you 100 queries per day for free.
  • After that you pay $5 per 1000 queries.
  • There is a maximum of 10,000 queries per day.

Source: https://developers.google.com/custom-search/json-api/v1/overview#Pricing
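
To make the numbers concrete, here is a small sketch of the daily cost under the figures quoted above. It assumes the 100 free queries are subtracted before billing and that the 10,000 cap counts all queries, free ones included; both are assumptions, so check the linked pricing page before relying on the result.

```python
FREE_QUERIES_PER_DAY = 100      # free tier quoted above
PRICE_PER_1000_USD = 5.0        # paid rate quoted above
MAX_QUERIES_PER_DAY = 10_000    # daily cap quoted above (assumed to include free queries)


def daily_cost_usd(queries):
    """Estimated cost in USD of running `queries` searches in one day."""
    if queries < 0 or queries > MAX_QUERIES_PER_DAY:
        raise ValueError("queries must be between 0 and the daily cap")
    billable = max(0, queries - FREE_QUERIES_PER_DAY)
    return billable * PRICE_PER_1000_USD / 1000


for n in (100, 1_100, 10_000):
    print(n, daily_cost_usd(n))
```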


  • The search quality is much lower than normal Google search (no synonyms, "intelligence" etc.)
  • It seems that Google is even planning to shut down this service completely.

Solution 3

Google Custom Search (as advocated in the top-rated answers) works well, but is very expensive compared to its competitors (below) or to other Google APIs. It has a small free tier (100 queries/day) and a very high price of $5 per 1000 queries.

They offer the option to upgrade to Site Search, which has slightly better prices, but that is meant for searching one site (your own), so it is really something quite different - not an upgrade.

The main alternatives seem to be:

Bing Search API
https://datamarket.azure.com/dataset/5BA839F1-12CE-4CCE-BF57-A49D98D29A44
It has a free tier of 5,000 queries/month, prices starting at 5 queries per penny, and no hard limit.

UPDATE: At the end of 2016 this API was shut down in favour of its Azure counterpart, "Cognitive Services Bing Search API":
https://azure.microsoft.com/en-us/services/cognitive-services/search/

See here for a pricing chart, which starts at US$3/m for 1,000 transactions. Unless I'm missing something it is quite expensive.

Yahoo BOSS Search API
http://developer.yahoo.com/boss/search/
With prices starting at about 12 queries/penny for whole web searches.
UPDATE: Was discontinued on March 31, 2016.
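
The prices in this answer are quoted in different units, so a quick conversion to US dollars per 1,000 queries makes them comparable. This is only arithmetic on the historical figures quoted above (5 and 12 queries per penny, $5 per 1000 queries); the function name is made up for illustration, and the figures are not current pricing.

```python
def usd_per_1000(queries_per_penny):
    """Convert a 'queries per penny' rate into US dollars per 1,000 queries."""
    pennies_per_1000 = 1000 / queries_per_penny
    return pennies_per_1000 / 100


quoted_rates = {
    "Google Custom Search": 5.0,                 # quoted directly as $5 per 1000
    "Bing Search API": usd_per_1000(5),          # 5 queries per penny
    "Yahoo BOSS Search API": usd_per_1000(12),   # about 12 queries per penny
}

for name, price in quoted_rates.items():
    print(f"{name}: ${price:.2f} per 1,000 queries")
```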

And some I haven't heard of before:

http://www.gigablast.com/searchfeed.html

http://www.faroo.com/hp/api/api.html

http://www.commoncrawl.org/

http://www.entireweb.com/search_api/implementation/
[discontinued - as pointed out below]

There is a bit of discussion of some of these on this SO post.
[got closed for being off-topic and is now gone]

Solution 4

There is an option at the bottom of the Custom Search Control Panel: under "Sites to search", you can choose "Search the entire web but emphasize included sites".

Solution 5

Faroo has a free Web Search API

Author by

Dan

Updated on July 08, 2022

Comments

  • Dan
    Dan almost 2 years

    Google Web Search API has been deprecated and replaced with Custom Search API (see http://code.google.com/apis/websearch/).

    I wanted to search the whole web but it looks like with the new API only custom sites can be searched.

    Is there a way to search the whole web programmatically? I was able to query the old API using JSON from a Java program.

  • Dan
    Dan over 13 years
    I really need a proper API call as I'm intending making many calls.
  • Steven A. Lowe
    Steven A. Lowe over 13 years
    I'm told that Google's terms of service forbid spidering...
  • ændrük
    ændrük over 13 years
    From the TOS: "You specifically agree not to access (or attempt to access) any of the Services through any automated means (including use of scripts or web crawlers)..."
  • Peter Kazazes
    Peter Kazazes over 12 years
    Shabby, wouldn't on any large scale. Maybe if the program is for personal use...
  • UpTheCreek
    UpTheCreek about 12 years
    And it's not free.... "$5 per 1000 queries"... very much not free!
  • Mazatec
    Mazatec almost 12 years
    Thanks for this. Hopefully this is a valid procedure and not a loophole waiting to be plugged by Google!
  • WhyNotHugo
    WhyNotHugo almost 12 years
    @Zimm3r Read the tooltip on the "downvote" button; that's why. Also, because the suggestion isn't allowed by google's TOS.
  • Farzher
    Farzher over 11 years
    Confirmed to be working. Results are slightly different than a live search though. Any ideas on that? Bing's API has the same problem.
  • Zimm3r
    Zimm3r over 11 years
    @Hugo the answer is useful because it does what was asked, and I AM STILL getting downvoted for an answer that was accepted, that works, that is useful. It is the asker's responsibility to decide on Google's TOS, not mine.
  • WhyNotHugo
    WhyNotHugo over 11 years
    @Zimm3r Whether the answer is useful or not is subjective. I did not find it useful, having the same question as the OP, since it's neither a clean solution nor something that the TOS allows.
  • Zimm3r
    Zimm3r over 11 years
    @Hugo no it isn't subjective or at least not in such a degree you suggest, it is useful if it answers the question in a viable way, TOS violations are something to be weighted but not something that makes something wholly useless.
  • WhyNotHugo
    WhyNotHugo over 11 years
    "Violate the terms of service with a service provider" is never good advice. Parsing webpages is something that breaks from one day to the next without warning. This is awful advice; that's the reason it was downvoted more than it was upvoted.
  • Zimm3r
    Zimm3r over 11 years
    I don't recall telling them to break the TOS, I gave them a valid answer that was accepted as the best and it is their choice to do what they want with that information.
  • spamguy
    spamguy over 11 years
    Thank you! This is possibly the only answer on the Internet that addressed my question. It's mind boggling why Google would end direct API support for their core service.
  • jimbo2087
    jimbo2087 over 11 years
    Yes it breaks the terms of service but personally I wouldn't worry about that. Google can handle a little bit of scraping, after all they have made a fortune scraping other peoples sites.
  • nawara
    nawara about 11 years
    but how to use it with json ?
  • Praesagus
    Praesagus about 11 years
    The results are a little different because of personalized and local search results.
  • Guillaume Lebourgeois
    Guillaume Lebourgeois over 10 years
    It has a limited index, refreshed about once a year. And it is finally quite expensive, as you have to plug into Amazon S3.
  • Server Overflow
    Server Overflow over 10 years
    Come on people. Don't be so naive. Google cannot force that ToS down your throat. In order to violate a ToS you must first agree with it (in writing, or by clicking a button like 'Yes, I accept the terms'). Think about this: I put a ToS on my web page saying that every person who visits that page has to give me $10000. Can I enforce this ToS on my visitors? Will they have to pay me immediately?
  • Deepanshu Goyal
    Deepanshu Goyal over 10 years
    Well, that's great, but the thing I hesitate at is: IS IT PAID??
  • Rob W
    Rob W over 10 years
    @Deepanshu You only get 100 queries per day for free (docs).
  • WGH
    WGH over 10 years
    @Altar they can still block your IP ;) Ever seen a captcha in Google search? Some people have.
  • WGH
    WGH over 10 years
    @Altar This is simply untrue. If your program is running on a dedicated server, it certainly has a static IP. Besides, having a dynamic address still means that you have to reconnect manually to obtain a new one.
  • MFARID
    MFARID about 10 years
    This is why Google claims that the search results are different support.google.com/customsearch/answer/141877?hl=en Mainly: Using specified sites (does not apply here), no social or personalized or real time results
  • Asad Saeeduddin
    Asad Saeeduddin about 10 years
    @Altar Just saying "come on" doesn't magically dispel all barriers. You have to stay within the limits of the law.
  • Server Overflow
    Server Overflow about 10 years
    @WGH Most routers today have an option to retrieve a new IP at midnight.
  • Chetan Pulate
    Chetan Pulate almost 10 years
    Any chance you can update this question to reflect the new layout? Can't seem to find half the stuff in your question.
  • Bangkokian
    Bangkokian almost 10 years
    Rippo -- I haven't been back in a while... but even if they've changed the layout the methodology is probably still sound: Create a search engine to search a specific site PLUS the entire web. Then delete that specific site. What you're left with should be a generic web search. They may have closed the loophole after all... but if it's still doable, this general advice may help. Good luck.
  • Bangkokian
    Bangkokian almost 10 years
    And... if they have closed the loophole and now force you to search at least one site, you might try creating a URL/site with zero content. Just a blank index.html page. The results should then be the same as a generic web search. Just a thought...
  • afro360
    afro360 over 9 years
    Their results seem limited but are a good starting point.
  • nanofarad
    nanofarad over 9 years
    This answer is now obsolete as the three years are up and 2014/09/29 has passed.
  • Dejell
    Dejell over 9 years
    CustomeSearchAPI is not in all the websites - it's for the user websites
  • Dejell
    Dejell over 9 years
    I tried it but it doesn't work now. I asked to look in the entire web for suunto ambit watch, but I got no results (I searched in the public URL that I got)
  • Dejell
    Dejell over 9 years
    does it still work for you?
  • Dejell
    Dejell over 9 years
    I didn't get it. Does it work for you?
  • Admin
    Admin over 9 years
    Yep, it still works.
  • Admin
    Admin over 9 years
    Note this only works for the free version support.google.com/customsearch/answer/2631040
  • mgutt
    mgutt over 9 years
    @MFARID It does not only miss social/live/etc data. It does not allow a search based on synonyms and it is completely missing intelligence. e.g. "john doe northpole" will not return a result if "john doe" is now living at the "southpole" and has changed this information on his website or removed the word "northpole" or he or you made a typo like "nortpole". In my eyes the custom search is nearly useless.
  • mgutt
    mgutt over 9 years
    @Altar You are right, but in the end you could infringe a law like en.wikipedia.org/wiki/Sui_generis_database_right or, in Germany, dejure.org/gesetze/StGB/303b.html So it depends on your country and its laws, and of course on the laws of the country where Google is located. But in the end it's much easier for Google to ban IPs. Of course you could reconnect and obtain a new IP as often as you want, but it is possible that Google uses geo databases to block your region much more often than others (e.g. if you search 10 times in 5 minutes).
  • Bryan Larsen
    Bryan Larsen about 9 years
    No, you can't enforce a ToS against random web surfers. However, creating a program to scrape a web page shows clear intent and the skill required to do so would put you in a higher class of "reasonable person". You might not lose a criminal lawsuit but probably would lose a civil lawsuit. IANAL. Ref: Aaron Swartz.
  • Bryan Larsen
    Bryan Larsen almost 9 years
    WARNING: we did development using the free version, but to upgrade to the paid version (to do more than 100 searches), google forces you to turn off the "search the entire web but emphasize included sites"
  • Bryan Larsen
    Bryan Larsen almost 9 years
    Google forces you to turn that option off when you upgrade to paid search. And free has a limit of 100 searches.
  • Pacerier
    Pacerier almost 9 years
    @BryanLarsen, It's still possible to use the old API that doesn't have the paltry 100/day limit right?
  • Pacerier
    Pacerier almost 9 years
    @Bangkokian, Why is there a hard limit of 10k queries/day? Assuming you can pay, How do you get above 10k queries/day then? Do you create multiple keys?
  • Pacerier
    Pacerier almost 9 years
    @Jack, Not heard of this before. Where do they get their search results from?
  • Pacerier
    Pacerier almost 9 years
    @Yishu, Why does the page https://support.google.com/customsearch/answer/141877?hl=en state "You cannot configure Google Site Search to search the entire web"?
  • Yishu Fang
    Yishu Fang almost 9 years
    @Pacerier, I have no idea about it. Maybe the policy has changed?
  • Sherwin Flight
    Sherwin Flight over 8 years
    -1 @Zimm3r, you said you provided a "valid answer", but I disagree. I don't consider it a valid answer when it requires the use of a web service, while specifically breaking their T.O.S. Your solution cannot be used without violating Google's Terms of Use, therefore is not really a valid answer in my opinion. It's like someone telling you they need money for groceries, and you suggesting they rob the bank. Sure, technically it is an option, but not one that is likely to work.
  • Uncaught Exception
    Uncaught Exception over 8 years
    Possible deal breaker for Faroo is that your API key is restricted to the IP address you specify during registration.
  • thdoan
    thdoan about 8 years
    I'm not sure how it was before, but now you have to set up a billing account regardless of whether you use the free or paid tier. Bummer.
  • mvark
    mvark about 8 years
    Bing Search API version 5 now allows up to 1,000 transactions per month across all Bing Search APIs (Web, Images, Video, News Search) - microsoft.com/cognitive-services/en-us/pricing . I put together some samples - mvark.blogspot.in/2016/06/…
  • Wessam El Mahdy
    Wessam El Mahdy almost 8 years
    entireweb.com has discontinued the service as seen here entireweb.com/services
  • Pacerier
    Pacerier almost 8 years
    @GuillaumeLebourgeois, Expensive? I don't think that's true. It's a nonprofit. The entire 102 TB of data is free for download.
  • gilad905
    gilad905 over 7 years
    on Dec 15, 2016 Bing Web Search API will move under Cognitive Services by Azure Marketplace (azure.microsoft.com/en-us/services/cognitive-services/searc‌​h), which require a phone + credit card verification for a subscription (even a free one).
  • Jake 1986
    Jake 1986 over 7 years
    This still works.
  • Paul Whelan
    Paul Whelan over 7 years
    Are these guys still operational? I've requested API keys and heard nothing.
  • Gajus
    Gajus over 7 years
    "On April 1, 2017, Google will discontinue sales of the Google Site Search. All new purchases and renewals must take place before this date. The product will be completely shut down by April 1, 2018."
  • Titou
    Titou about 7 years
    The usefulness of the answer does not mean 'always applicable'. The Google Terms of Service could change - they have already after all. If you need a small amount of files, you are not hurting big G.
  • Dmitri Zaitsev
    Dmitri Zaitsev about 7 years
    From Bing API: "DataMarket and Data Services are being retired and will stop accepting new orders after 12/31/2016. Existing subscriptions will be retired and cancelled starting 3/31/2017. Please reach out to your service provider for options if you want to continue service."
  • Tom
    Tom about 7 years
    Thanks for pointing out the change - I've updated answer accordingly.
  • nurettin
    nurettin almost 7 years
    Google custom search for the entire web works, but it won't give you more than 100 results per search query even if you are a paying customer.
  • Tina Lee
    Tina Lee almost 7 years
    The Google Custom Search homepage ( google.com/cse ) always returns 500 err... Is anyone facing the same problem?
  • ffeast
    ffeast over 6 years
    It's worth adding that besides such a low limit it also permits only 10 results per query
  • Jeyekomon
    Jeyekomon over 6 years
    Scraping the webpage has these disadvantages: (1) Google doesn't like it - you might face IP ban, captchas and other obstacles. (2) The HTML code of the webpage changes frequently - you will end up fixing your code again and again in your long-term projects. (3) The API can possibly give you more metadata about the search results than the webpage. I downvoted this answer. But I'm not any kind of law nazi. This approach is simply not good for the reasons above.
  • rustyx
    rustyx over 6 years
    @ændrük that part about automated means is gone from their TOS since March 2012.
  • tripleee
    tripleee about 6 years
    The cost is for connecting to AWS where you can access this. If you are a student, you are eligible for their free tier, but there could still be transfer costs etc; and if you are not in the free tier, there are running costs.
  • Andre Figueiredo
    Andre Figueiredo over 5 years
    @rustyx it still breaks the terms: "don’t interfere with our Services or try to access them using a method other than the interface and the instructions that we provide."
  • Jérémie
    Jérémie over 5 years
    BTW: The reason why Google is so adamant about preventing scraping is not the one you think. It is not because it might cost bandwidth, which is cheap. It is because one of Google's most valuable assets is its query log, one of the most potent insights into the collective consciousness. Being polluted by mechanized queries would make it worthless, so they are investing all their efforts in deterring scraping that would pollute that data set.
  • Jack
    Jack over 5 years
    Looks like common crawl is updated monthly now
  • Nathan B
    Nathan B over 4 years
    After we create the custom search engine, how do we invoke the API ?
  • Justin Skiles
    Justin Skiles over 4 years
    @TinaLee the correct URL is cse.google.com/cse
  • Jivan
    Jivan almost 4 years
    @AndreFigueiredo don't [...] access [our Services] using a method other than the interface and the instructions that we provide => a web crawler is using the interface and the instructions that they provide. It just does so by automated means instead of manually, so a web crawler is absolutely compliant with these ToS (at least, with this sentence you quoted).
  • Andre Figueiredo
    Andre Figueiredo almost 4 years
    @Jivan that's a fair point. I'm not savvy about laws and web crawlers, but my guess is that bots and raw HTTP requests would not count as using an accepted interface, versus Selenium for example :P. And the instructions they provide to access their services would not include automated requests, i.e. scraping. Correct me if I'm wrong.
  • Andre Figueiredo
    Andre Figueiredo almost 4 years
    That said, they have changed their entire TOS, new says: we reasonably [??] believe that your conduct causes harm or liability [??] to a user, third party, or Google — for example, by [...] scraping content that doesn’t belong to you. I honestly don't know what it exactly means to our case here.. We are doing no harm :P
  • Artemis
    Artemis almost 4 years
    Page has a "Coming Soon" banner now...
  • Ezekiel Victor
    Ezekiel Victor over 3 years
    @mopsyd while you are not compelled to "agree with" (whatever that means) the ToS, you are compelled to comply insofar as Google as a private entity can choose not to provide service to you, and obviously they are likely to do so if you are violating their ToS. Further, they will be able to recoup damages in a civil setting. "Opting out" doesn't make sense; no one is forcing you to use their services. And declaring that they can "suck it" definitely doesn't do anything for you. 😂
  • Ezekiel Victor
    Ezekiel Victor over 3 years
    @jungle_mole Google is not using your services so your hypothetical terms to them don't matter. So they are not breaking your terms. And even if they somehow were, you still wouldn't be justified in breaking theirs; that's not how contracts work. It doesn't really matter anyway because you are using their services in this case and you definitely have no particular right guaranteeing you access since as a private entity they have no obligation to serve you in the first place.
  • mopsyd
    mopsyd over 3 years
    @Ezikiel being able and doing are entirely different concepts. If you want to take the pedantic stance you can say someone has a rule somewhere about the thing. You can also take a practical stance that weighs the risk of a company retaliating or cutting off service, the likelihood that they care enough about a trivial infraction to waste time and money on an aggressive civil action (they don’t, unless your abuse is egregious), and decide whether or not tangential concerns like a ToS matter to your use case. I am certain that to one prone to pedantry and condescending emojis it probably does.
  • jungle_mole
    jungle_mole over 3 years
    @EzekielVictor as a matter of fact, they are using my services as a "targetable ad-watcher" or human form of clicker bot. We have a barter: from my side, time and cognitive function; from their side, a search window on my desktop. But it's they who are setting the rules. Nope, I have my guesses too. Since they are closed for discussion, I'll just do it my way, and if they don't agree, they are free to refuse to continue acquiring my services, as you said: no obligations. Anyway, it's a valid answer. If it was LawOverflow here, the answer could be considered arguable. (Sorry, I would not be justified from whose point of view?)
  • Ezekiel Victor
    Ezekiel Victor over 3 years
    @jungle_mole, when you refer to "free to refuse to continue acquiring" your services, you're referring to being IP banned. This thread has jumped the shark.
  • jungle_mole
    jungle_mole about 3 years
    @EzekielVictor yep, that's what i'm saying: any sanctions they see fit and able to impose. all the more so, they are not going astray from this "warpath" since forever, why succumb? when counterparty feels ok with their moral rights to take their advantage and utilize me, my hands are untied to make use of our reciprocity in any manner i see fit. they don't disclose their ways, neither they negotiate them, so why should i? every party seeks maximum benefit, but one with all its might tries to confine the other. preemptively, mind you. what's left is try and exploit the usurper, it won't starve
  • Kyle
    Kyle almost 3 years
    Looks like Bing's moved their service again - now it's on the Azure Marketplace docs.microsoft.com/en-us/bing/search-apis/bing-web-search/…
  • Big Ian
    Big Ian over 2 years
    Now redirects to seekstorm.com which is a paid for service
  • x-ray
    x-ray over 2 years
    At least currently (february 2022) the data can be downloaded from S3 for free. HTTP-links can be found on the commoncrawl website.