Solr vs. ElasticSearch

Solution 1

Update

Now that the question scope has been corrected, I might add something in this regard as well:

There are many comparisons between Apache Solr and ElasticSearch available, so I'll reference those I found most useful myself, i.e. covering the most important aspects:

  • Bob Yoplait already linked kimchy's answer to "ElasticSearch, Sphinx, Lucene, Solr, Xapian. Which fits for which usage?", which summarizes the reasons why he went ahead and created ElasticSearch, which in his opinion provides a much superior distributed model and ease of use in comparison to Solr.

  • Ryan Sonnek's "Realtime Search: Solr vs Elasticsearch" provides an insightful analysis/comparison and explains why he switched from Solr to ElasticSearch despite already being a happy Solr user. He summarizes this as follows:

    Solr may be the weapon of choice when building standard search applications, but Elasticsearch takes it to the next level with an architecture for creating modern realtime search applications. Percolation is an exciting and innovative feature that singlehandedly blows Solr right out of the water. Elasticsearch is scalable, speedy and a dream to integrate with. Adios Solr, it was nice knowing you. [emphasis mine]

  • The Wikipedia article on ElasticSearch quotes a comparison from the reputable German iX magazine, listing advantages and disadvantages, which pretty much summarize what has been said above already:

    Advantages:

    • ElasticSearch is distributed. No separate project required. Replicas are near real-time too, which is called "Push replication".
    • ElasticSearch fully supports the near real-time search of Apache Lucene.
    • Handling multitenancy is not a special configuration, where with Solr a more advanced setup is necessary.
    • ElasticSearch introduces the concept of the Gateway, which makes full backups easier.

    Disadvantages:

    • Only one main developer [not applicable anymore according to the current elasticsearch GitHub organization, besides having a pretty active committer base in the first place]
    • No autowarming feature [not applicable anymore according to the new Index Warmup API]
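The percolation feature highlighted in Ryan Sonnek's quote above inverts the usual search flow: queries are stored up front, and each incoming document is matched against them. The following is only a toy, in-memory sketch of that idea. It does not use the Elasticsearch API, and the names `ToyPercolator`, `register`, and `percolate` are hypothetical, chosen for illustration.

```python
from typing import Dict, List, Set


class ToyPercolator:
    """Toy reverse search: store queries first, then match documents against them."""

    def __init__(self) -> None:
        # Maps a query id to the set of terms that must all appear in a document.
        self._queries: Dict[str, Set[str]] = {}

    def register(self, query_id: str, terms: List[str]) -> None:
        """Store a query as a set of lowercase required terms."""
        self._queries[query_id] = {t.lower() for t in terms}

    def percolate(self, document: str) -> List[str]:
        """Return the ids of all stored queries that the document satisfies."""
        tokens = set(document.lower().split())
        return sorted(qid for qid, terms in self._queries.items() if terms <= tokens)


percolator = ToyPercolator()
percolator.register("solr-news", ["solr"])
percolator.register("es-release", ["elasticsearch", "release"])
matches = percolator.percolate("New Elasticsearch release announced today")
print(matches)  # ['es-release']
```

A real percolator supports the full query language rather than plain term sets, but the flow (register queries, then feed documents) is the part that makes alerting-style use cases possible.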

Initial Answer

They are completely different technologies addressing completely different use cases, thus cannot be compared at all in any meaningful way:

  • Apache Solr - Apache Solr offers Lucene's capabilities in an easy to use, fast search server with additional features like faceting, scalability and much more

  • Amazon ElastiCache - Amazon ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud.

    • Please note that Amazon ElastiCache is protocol-compliant with Memcached, a widely adopted memory object caching system, so code, applications, and popular tools that you use today with existing Memcached environments will work seamlessly with the service (see Memcached for details).

[emphasis mine]

Maybe this has been confused with the following two related technologies one way or another:

  • ElasticSearch - It is an Open Source (Apache 2), Distributed, RESTful, Search Engine built on top of Apache Lucene.

  • Amazon CloudSearch - Amazon CloudSearch is a fully-managed search service in the cloud that allows customers to easily integrate fast and highly scalable search functionality into their applications.

The Solr and ElasticSearch offerings sound strikingly similar at first sight, and both use the same backend search engine, namely Apache Lucene.

While Solr is older, quite versatile and mature and widely used accordingly, ElasticSearch has been developed specifically to address Solr shortcomings with scalability requirements in modern cloud environments, which are hard(er) to address with Solr.
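To make the "strikingly similar" point concrete, here is a minimal sketch of what a basic search request looks like against each server's HTTP interface. It only builds the requests and does not send them. The host, the default ports, and the collection, index, and field names (`articles`, `title`) are assumptions for illustration, and the helper function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def solr_search_request(base_url: str, collection: str, text: str) -> Request:
    """Build (but do not send) a GET request for Solr's /select handler."""
    params = urlencode({"q": text, "wt": "json"})
    return Request(f"{base_url}/solr/{collection}/select?{params}", method="GET")


def es_search_request(base_url: str, index: str, field: str, text: str) -> Request:
    """Build (but do not send) a POST request for Elasticsearch's _search endpoint."""
    body = json.dumps({"query": {"match": {field: text}}}).encode("utf-8")
    return Request(
        f"{base_url}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local servers on their usual default ports.
solr_req = solr_search_request("http://localhost:8983", "articles", "title:lucene")
es_req = es_search_request("http://localhost:9200", "articles", "title", "lucene")
print(solr_req.get_method(), solr_req.full_url)
print(es_req.get_method(), es_req.full_url, es_req.data.decode("utf-8"))
```

Solr takes the query as URL parameters, while Elasticsearch takes a JSON query body; both ultimately hand the query to Lucene.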

As such it would probably be most useful to compare ElasticSearch with the recently introduced Amazon CloudSearch (see the introductory post Start Searching in One Hour for Less Than $100 / Month), because both claim to cover the same use cases in principle.

Solution 2

I see some of the above answers are now a bit out of date. From my perspective, and I work with both Solr (Cloud and non-Cloud) and ElasticSearch on a daily basis, here are some interesting differences:

  • Community: Solr has a bigger, more mature user, dev, and contributor community. ES has a smaller but active community of users and a growing community of contributors.
  • Maturity: Solr is more mature, but ES has grown rapidly and I consider it stable.
  • Performance: hard to judge. I/we have not done direct performance benchmarks. A person at LinkedIn did compare Solr vs. ES vs. Sensei once, but the initial results should be ignored because they used a non-expert setup for both Solr and ES.
  • Design: People love Solr. The Java API is somewhat verbose, but people like how it's put together. Solr code is unfortunately not always very pretty. Also, ES has sharding, real-time replication, document and routing built-in. While some of this exists in Solr, too, it feels a bit like an afterthought.
  • Support: there are companies providing tech and consulting support for both Solr and ElasticSearch. I think the only company that provides support for both is Sematext (disclosure: I'm Sematext founder).
  • Scalability: both can be scaled to very large clusters. ES is easier to scale than pre-4.0 versions of Solr, but with Solr 4.0 that's no longer the case.
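The built-in sharding and routing mentioned under Design boils down to deterministically mapping each document to one shard, typically by hashing a routing key (by default the document id) modulo the number of primary shards. The sketch below illustrates only that principle. The function name `pick_shard` is hypothetical, and `zlib.crc32` is a stand-in hash chosen for illustration, not the hash either server actually uses.

```python
import zlib
from collections import Counter


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (by default the document id) to a shard number."""
    if num_primary_shards <= 0:
        raise ValueError("num_primary_shards must be positive")
    # crc32 is a stand-in hash for illustration only.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


# Distribute some hypothetical document ids over 5 shards.
doc_ids = [f"doc-{i}" for i in range(1000)]
distribution = Counter(pick_shard(doc_id, 5) for doc_id in doc_ids)
print(dict(distribution))
```

Because the shard is derived from the key and the shard count, the same document always lands on the same shard, which is also why the number of primary shards cannot be changed freely after indexing.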

For more thorough coverage of the Solr vs. ElasticSearch topic, have a look at https://sematext.com/blog/solr-vs-elasticsearch-part-1-overview/ . This is the first post in a series of posts from Sematext doing a direct and neutral Solr vs. ElasticSearch comparison. Disclosure: I work at Sematext.

Solution 3

I see that a lot of folks here have answered this ElasticSearch vs Solr question in terms of features and functionality but I don't see much discussion here (or elsewhere) regarding how they compare in terms of performance.

That is why I decided to conduct my own investigation. I took an existing heterogeneous data source micro-service that already used Solr for term search. I swapped Solr for ElasticSearch, then ran both versions on AWS with an existing load test application and captured the performance metrics for subsequent analysis.

Here is what I found. ElasticSearch had 13% higher throughput when it came to indexing documents but Solr was ten times faster. When it came to querying for documents, Solr had five times more throughput and was five times faster than ElasticSearch.
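Note that throughput (requests completed per second) and speed (time per request, i.e. latency) are separate metrics, which is why the results above can report them differently. The following is a minimal sketch of how the two can be captured for any operation. It is not the load test application used for the numbers above; `measure` and `fake_search` are hypothetical names, and `fake_search` is a stand-in for a real indexing or query call.

```python
import time
from statistics import mean
from typing import Callable, Dict


def measure(operation: Callable[[], None], requests: int) -> Dict[str, float]:
    """Run `operation` sequentially and report throughput and mean latency."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    latencies = []
    started = time.perf_counter()
    for _ in range(requests):
        t0 = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - started
    return {
        "throughput_per_s": requests / elapsed,
        "mean_latency_ms": mean(latencies) * 1000.0,
    }


def fake_search() -> None:
    """Hypothetical stand-in for a real indexing or query call."""
    time.sleep(0.001)


stats = measure(fake_search, requests=50)
print(stats)
```

With a single sequential client the two numbers are roughly reciprocal; under concurrent load they can diverge, so a benchmark should state both.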

Solution 4

Given the long history of Apache Solr, I think one of its strengths is its ecosystem. There are many Solr plugins for different types of data and purposes.

[Figure: Solr stack]

The search platform consists of the following layers, from bottom to top:

  • Data
    • Purpose: Represent various data types and sources
  • Document building
    • Purpose: Build document information for indexing
  • Indexing and searching
    • Purpose: Build and query a document index
  • Logic enhancement
    • Purpose: Additional logic for processing search queries and results
  • Search platform service
    • Purpose: Add functionality on top of the search engine core to provide a service platform.
  • UI application
    • Purpose: End-user search interface or applications
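As a rough illustration of how the lower layers fit together, here is a toy sketch that walks from raw data through document building to indexing and searching, with a small piece of query logic on top. It uses no Solr code or plugins; the sample records and the function names (`build_document`, `build_index`, `search`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Data layer: raw records from some source (hypothetical sample data).
RAW_RECORDS = [
    {"id": "1", "title": "Solr in Action", "body": "Faceting with Solr"},
    {"id": "2", "title": "Elasticsearch Guide", "body": "Distributed search"},
]


def build_document(record: Dict[str, str]) -> Dict[str, str]:
    """Document building: flatten a raw record into an indexable document."""
    return {"id": record["id"], "text": f"{record['title']} {record['body']}".lower()}


def build_index(documents: List[Dict[str, str]]) -> Dict[str, Set[str]]:
    """Indexing: map each term to the ids of the documents containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc in documents:
        for term in doc["text"].split():
            index[term].add(doc["id"])
    return index


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Searching plus simple query logic: require every query term to match."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(hits)


index = build_index([build_document(r) for r in RAW_RECORDS])
print(search(index, "solr faceting"))  # ['1']
```

In a real deployment each of these functions corresponds to a whole layer, with plugins supplying the data connectors, analyzers, and query logic.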

Reference article: Enterprise search

Solution 5

I have created a table of the major differences between Elasticsearch, Solr, and Splunk; you can use it as a 2016 update: [image: comparison table]

Author by

Ben ODay

Software Consultant and owner of Initek Consulting, a firm that specializes in vehicle telematics, IoT, and cloud-based integration solutions.

Updated on March 13, 2020

Comments

  • Ben ODay
    Ben ODay about 4 years

    What are the core architectural differences between these technologies?

    Also, what use cases are generally more appropriate for each?

  • Steffen Opel
    Steffen Opel about 12 years
    @boday: Sounds like they might be using Lucene based elasticsearch indeed.
  • Otis Gospodnetic
    Otis Gospodnetic over 11 years
    @Rubytastic - you may want to comment on the post to get the author's attention and get some memory consumption coverage. But the blog.sematext.com/2012/05/17/elasticsearch-cache-usage post may already have what you are looking for.
  • javanna
    javanna over 11 years
    Now that there's a company behind elasticsearch the one main developer disadvantage should be gone.
  • unludo
    unludo over 11 years
    It seems autowarming is addressed by ElasticSearch now. See github.com/elasticsearch/elasticsearch/issues/1913
  • Steffen Opel
    Steffen Opel over 11 years
    @unludo - I've adjusted the answer regarding the new index warmup API, thanks for pointing this out.
  • MattMcKnight
    MattMcKnight over 10 years
    All of the advantages of ElasticSearch listed in the iX magazine section are now also wrong. 1) SolrCloud is no longer a separate project. Indeed, Solr and Lucene are now part of the same project. 2) Solr supports NRT. 3) Solr handles multiple collections in a single cluster. 4) Solr has also added a replication feature which makes backups easier.
  • Mark Giaconia
    Mark Giaconia almost 10 years
    Don't forget about the aggregations ElasticSearch provides for those requiring OLAP-like functionality. SolrCloud has only limited faceting. And if you need alerts on aggregations, ES percolation delivers.
  • Mark Giaconia
    Mark Giaconia almost 10 years
    Even though SOLR Cloud and SOLR are not separate projects anymore, they still do not have the same capabilities (specifically Pivot faceting is not supported in Cloud as of now). I also don't think it's fair to compare SOLR with ES, we should be comparing SOLR Cloud with ES.
  • user
    user over 9 years
    Thank you for sharing a well written first hand opinion & blog posts. It's been 2 years since this post. I think the community would benefit if you could share more insights you gathered along the way. Something that can help people decide which amongst solr/elasticSearch is better for them.
  • KingOfHypocrites
    KingOfHypocrites almost 9 years
    I would add that with DataStax you get near real-time replication with Solr.
  • iMysak
    iMysak about 8 years
    As far as I know, pivot faceting was added to SolrCloud in 4.10: issues.apache.org/jira/browse/SOLR-2894
  • forsberg
    forsberg over 7 years
    Why do you recommend Elastic for new projects?
  • forsberg
    forsberg over 7 years
    So, no cons, only pros for Elastic Search now? Or did you leave out something worth mentioning about the disadvantages?
  • Behzad Qureshi
    Behzad Qureshi over 7 years
    Elastic search is new so it is using latest technologies/architecture.
  • Gus
    Gus almost 7 years
    The data schema row is a bit misleading... Elastic has Mappings which are essentially a schema (but not required by default). Solr ships such that one has to install configuration before it will work, there are several supplied example configurations that you can choose from immediately and one is schemaless, though carefully controlled schemas are probably more common when using solr.
  • Jan Sommer
    Jan Sommer almost 7 years
    I could also create something new but just because I use new technology or a different architecture, it doesn't mean it's better than what's already on the market.
  • Behzad Qureshi
    Behzad Qureshi almost 7 years
    Agreed but as an architect, you will definitely go for better than what's already in the market. My 2 cents :)
  • whomer
    whomer over 6 years
    The Solr Streaming API provides MapReduce capabilities
  • David Thomas
    David Thomas about 6 years
    Interesting, I've just been evaluating Solr and Elasticsearch and found indexing the same set of 1M documents took twice as long for Elasticsearch compared to Solr.
  • lucaswxp
    lucaswxp about 3 years
    3 years later, does this still hold true? 10 times seems like an awful lot, like the kind of thing that could be addressed by better customization?
  • Glenn
    Glenn about 3 years
    I have not re-run these tests recently. Everything is in github.com/gengstrand/clojure-news-feed so feel free to spin it all up and test for yourself. If you do, then perhaps you could share your results here?