What is the ideal self-hosted search engine?
Solution 1
Check out Lucene
It's written in Java, and a port is also available for the .NET framework.
Here's a CodeProject article that explains how it works and how it's used. http://www.codeproject.com/KB/library/IntroducingLucene.aspx
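To give a rough feel for the idea behind Lucene-style search, here is a toy inverted index in Python. It is only an illustrative sketch and not Lucene's API: the `InvertedIndex` class, the `tokenize` helper, and the sample document ids are all made up for this example, and a real engine adds ranking, stemming, document parsing, and on-disk storage on top.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Toy inverted index (hypothetical, not Lucene's API):
    maps each term to the set of document ids whose text contains it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Record the document id under every term found in its text.
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return the sorted ids of documents that contain every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return sorted(result)


# Made-up sample documents standing in for blog posts, files, and forum threads.
index = InvertedIndex()
index.add("blog/1", "Upgrading the intranet forum software")
index.add("docs/handbook.pdf", "Employee handbook: intranet access policy")
index.add("forum/42", "Forum thread about PDF export")

print(index.search("intranet"))
print(index.search("forum pdf"))
```

Queries are looked up term by term and the matching document sets are intersected, so the cost depends on how many documents contain the terms rather than on scanning every document.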
Solution 2
SearchBlox, which is based on Lucene, may be able to meet your needs. It is free and comes with a crawler.
Solution 3
I've used Sphider before and have been quite impressed.
nedruod
Updated on September 17, 2022
Comments
-
nedruod over 1 year
I have an internal (intranet) site that is made up of several blogs and forums, hundreds of static pages, lots of PDF files and several other document types. It's been glued together loosely over the last couple of years, and now it's my job to maintain it.
I'm looking for a search engine that I can host myself that ideally:
Allows for searching the Blog / Forum databases directly if given the database information and tables to search.
Handles most text documents (PDF/DOC/ODF)
Is open source, or allows access to the source code once purchased
It doesn't matter to me what language or platform it is written in. Normally, I'd just use Google site search, but that's not an option for an intranet.
-
nedruod, almost 14 years ago: I looked at it. I really want something where I can have access to the code. It's also a little heavy in the budget department.
-
digit1001, over 13 years ago: In addition to the Google Search Appliance, there is a similar, competitive product from Thunderstone. I've used both in the past. While it may not work for you, I thought I'd post it for others who stumble on the question.
-
cweiske, over 9 years ago: Free for 25k URLs, which is not much for an intranet.
-
Daniel, about 7 years ago: @cweiske Check out Ambar; it's based on ElasticSearch and free. github.com/RD17/ambar