Solr and web site indexing to create a site search

Solution 1

Solr only indexes and searches text. It does not include a crawler, because crawling is outside the project's scope.

However, take a look at Nutch, which is a crawler and is not too hard to set up initially.

Nutch and Solr can be integrated, so Nutch does the crawling and you get Solr-specific features when searching the resulting index.

Solution 2

$ bin/solr create -c corename
$ bin/post -c corename https://siteurl.com -recursive 2 -delay 1

This builds a basic index of the site. It will not be the best index you could get, but if you want something simple, this is it.

I think this only works on Solr 5+.
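To check what actually got indexed, you can query the core over HTTP. Here is a minimal sketch, assuming Solr is running locally on its default port 8983 and the core is the `corename` created above. The `solr_select_url` helper and the `SOLR_URL` variable are hypothetical names made up for this example, not part of Solr.

```shell
#!/bin/sh
# Assumption: Solr runs locally on the default port; override SOLR_URL otherwise.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Hypothetical helper: print the select-handler URL for a core.
#   $1 = core name
#   $2 = query string (must already be URL-encoded)
solr_select_url() {
  printf '%s/%s/select?%s\n' "$SOLR_URL" "$1" "$2"
}

# Count the indexed pages (rows=0 returns only the total, no documents):
#   curl "$(solr_select_url corename 'q=*:*&rows=0')"
#
# Search for a term (which field a bare term is matched against
# depends on the default search field in the core's configuration):
#   curl "$(solr_select_url corename 'q=pricing&rows=5')"
```

The curl calls are left as comments so the snippet can be sourced without a running Solr. If the count comes back as zero, the crawl probably did not follow any links, and it may be worth rerunning the post command with a larger `-recursive` depth.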

Solution 3

Two other options you might want to look at are Crawl Anywhere and Heritrix.

Author by

feniix

Updated on June 17, 2022

Comments

  • feniix
    feniix almost 2 years

    I was trying to build a 'site search' on a simple HTTP site.

    I have a site, let's call it www.mycompany.com, that is pure HTML.

    Is there an easy way to index the entire site and build a full-text search with Solr as the engine?

    I googled for a bit and could not find anything specific of the type: Do A, Do B ... profit!

    Also let me know if I am a bit off about what Solr is for :P

    Thanks in advance.