What is the ideal self-hosted search engine?

Solution 1

Check out Lucene

It is written in Java, and a port is also available for the .NET framework (Lucene.NET).

Here's a CodeProject article that explains how it works and how it's used. http://www.codeproject.com/KB/library/IntroducingLucene.aspx
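
For readers who want a feel for the idea before following the link: Lucene is built around an inverted index, a map from each term to the documents that contain it. The snippet below is only a minimal sketch of that data structure in plain Python. It is not Lucene's API; the class name, the tokenizer, and the sample document ids are all made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Toy inverted index (hypothetical, for illustration only)."""

    def __init__(self):
        # term -> set of ids of the documents that contain the term
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every query term (AND)."""
        terms = tokenize(query)
        if not terms:
            return set()
        # Copy the first posting set so intersecting never mutates the index.
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


# Made-up documents standing in for a blog post, a PDF and a static page.
index = InvertedIndex()
index.add("blog/1", "Upgrading the intranet forum software")
index.add("docs/handbook.pdf", "Employee handbook: intranet access policy")
index.add("static/about.html", "About the team")

print(sorted(index.search("intranet")))
print(sorted(index.search("intranet policy")))
```

A real engine layers analysis (stemming, stop words), relevance scoring and on-disk storage on top of this, which is the part a library such as Lucene provides.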

Solution 2

SearchBlox, which is based on Lucene, may be able to meet your needs. It is free and comes with a crawler.

Solution 3

I've used Sphider before and have been quite impressed.

Author: nedruod

Updated on September 17, 2022

Comments

  • nedruod
    nedruod over 1 year

    I have an internal (intranet) site that is composed of several blogs and forums, hundreds of static pages, lots of PDF files and several other document types. It's been glued together loosely over the last couple of years and now it's my job to maintain it.

    I'm looking for a search engine that I can host myself that ideally:

    1. Allows for searching the Blog / Forum databases directly if given the database information and tables to search.

    2. Handles most text documents (PDF/DOC/ODF)

    3. Is open source, or allows access to the source code once purchased

    It doesn't matter to me what language or platform it is written in. Normally, I'd just use Google site search, but that's not an option for an intranet.
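
Requirement 1 above (searching the blog and forum databases directly, given the tables to look at) could be approached roughly as follows. This is a sketch under stated assumptions rather than a recommendation: it uses an in-memory SQLite database, and the table names, column names and the `SOURCES` layout are hypothetical. Extracting text from PDF/DOC/ODF files is not shown.

```python
import sqlite3

# Hypothetical description of which tables and columns hold searchable text.
SOURCES = [
    {"table": "blog_posts", "id": "id", "columns": ["title", "body"]},
    {"table": "forum_posts", "id": "id", "columns": ["subject", "message"]},
]


def iter_documents(conn, sources):
    """Yield (doc_id, text) pairs for every row in the configured tables."""
    for src in sources:
        cols = ", ".join([src["id"]] + src["columns"])
        # Table and column names come from trusted config, never user input.
        for row in conn.execute(f"SELECT {cols} FROM {src['table']}"):
            doc_id = f"{src['table']}/{row[0]}"
            # Join the text columns, skipping NULL values.
            text = " ".join(str(v) for v in row[1:] if v is not None)
            yield doc_id, text


# Made-up schema and rows standing in for the real blog and forum databases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog_posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.execute("CREATE TABLE forum_posts (id INTEGER PRIMARY KEY, subject TEXT, message TEXT)")
conn.execute("INSERT INTO blog_posts VALUES (1, 'Server move', 'The wiki moves on Friday')")
conn.execute("INSERT INTO forum_posts VALUES (7, 'VPN help', NULL)")

documents = list(iter_documents(conn, SOURCES))
for doc_id, text in documents:
    print(doc_id, "->", text)
```

Each `(doc_id, text)` pair can then be handed to whichever indexer is chosen, so database rows and crawled files end up in the same index.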

  • nedruod
    nedruod almost 14 years
    I looked at it. I really want something where I can have access to the code. It's also a little heavy in the budget department.
  • digit1001
    digit1001 over 13 years
    In addition to the Google Search Appliance, there is a similar product by "Thunderstone" that's competitive. I've used both in the past. While it may not work for you, thought I'd post for others who stumble on the question.
  • cweiske
    cweiske over 9 years
    free for 25k urls, which is not much for an intranet
  • Daniel
    Daniel about 7 years
    @cweiske Check out Ambar, it's based on ElasticSearch and free. github.com/RD17/ambar