Logging Search Keywords in Solr / Lucene

Solution 1

I think it depends on what you are looking to log. Are you just looking to record the queries users are submitting, or the results as well? If it's just "what are folks searching for", then you already have that data in the q parameter, which is logged by the servlet container. If you are using the default Jetty setup, look at ./logs/*request.log. You will see lines like:

0:0:0:0:0:0:0:1%0 -  -  [21/01/2010:15:08:29 +0000] "GET /solr/select/?q=*:*&qt=geo&lat=45&long=15&radius=10 HTTP/1.1" 200 197 

In this case, you can parse out that the user was doing a q=*:* search. Use a tool like AWStats to parse your logs and do the analysis. It's at least a quick and easy way to get some information!
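If you would rather script it than set up a log analyzer, pulling the q parameter out of each request line takes only a few lines of code. The following is a minimal sketch in Python, assuming the log lines look like the Jetty example above; the function names `extract_query` and `count_queries` are made up for illustration and are not part of Solr or Jetty.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Matches the quoted request part of an access log line,
# e.g. "GET /solr/select/?q=*:*&qt=geo HTTP/1.1", and captures the URL.
REQUEST_RE = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')


def extract_query(line):
    """Return the URL-decoded q parameter of one log line, or None."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    params = parse_qs(urlsplit(match.group(1)).query)
    values = params.get("q")
    return values[0] if values else None


def count_queries(lines):
    """Count how often each q value occurs in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        q = extract_query(line)
        if q is not None:
            counts[q] += 1
    return counts


# The sample line from the answer above.
sample = (
    '0:0:0:0:0:0:0:1%0 -  -  [21/01/2010:15:08:29 +0000] '
    '"GET /solr/select/?q=*:*&qt=geo&lat=45&long=15&radius=10 HTTP/1.1" 200 197'
)
print(extract_query(sample))
```

To run it over a real file, pass the open file object to `count_queries` and call `.most_common(20)` on the result to see the top searches. Queries sent in a POST body will not show up this way, since only the request URL is written to the access log.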

Solution 2

Months later ... maybe someone is interested:

http://karussell.wordpress.com/2010/10/27/feeding-solr-with-its-own-logs/

(you'll need to adapt the log parser if you are not using the default solr output format)
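As a rough idea of what adapting the parser involves, here is a small Python sketch. It assumes Solr's own log contains request lines shaped like the hypothetical `sample` below (`path=... params={...} hits=... status=... QTime=...`); check your actual log output first, because the layout depends on the Solr version and the logging configuration. The name `parse_solr_line` is made up for this example.

```python
import re
from urllib.parse import parse_qs

# Assumed layout of a Solr request log line; adjust the pattern to your output.
SOLR_LINE_RE = re.compile(
    r'path=(\S+) params=\{(.*)\} hits=(\d+) status=(\d+) QTime=(\d+)'
)


def parse_solr_line(line):
    """Return a dict describing one Solr request log line, or None."""
    match = SOLR_LINE_RE.search(line)
    if match is None:
        return None
    path, raw_params, hits, status, qtime = match.groups()
    # Treats the text inside params={...} as a URL query string.
    params = parse_qs(raw_params)
    return {
        "path": path,
        "q": params.get("q", [None])[0],
        "hits": int(hits),
        "status": int(status),
        "qtime_ms": int(qtime),
    }


# Hypothetical line, written for this example rather than copied from a real log.
sample = (
    'INFO: [] webapp=/solr path=/select '
    'params={q=solr+logging&rows=10} hits=3 status=0 QTime=12'
)
print(parse_solr_line(sample))
```

Compared with the servlet container's request log, a line like this also carries the hit count, so zero-result searches can be spotted by filtering on `hits == 0`.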

Solution 3

You can look at something like logstash to parse your log data.

Solution 4

The SolrLogging wiki page says you can use JDK logging (in Solr 1.0 to 1.3) or SLF4J logging (in Solr 1.4). As for writing your own Solr analyzer, it depends on your needs. In many cases a custom analyzer helps with specific retrieval requirements.

Author by

Ryall

Updated on June 03, 2022

Comments

  • Ryall
    Ryall almost 2 years

    I'm new to Solr and am looking for a way to record searches (or keywords) to a log file or database so that I can then analyse them for data visualisation.

    • Can Solr do this already?
    • Is this data accessible via a Solr query?

    Thanks.


    Update 1

    I'm starting to think I might need to write my own Solr analyzer?

  • memnoch_proxy
    memnoch_proxy over 14 years
    This would be very quick to write a script for, if one is not inclined to write Java code (re: @yuval's answer).
  • memnoch_proxy
    memnoch_proxy over 14 years
    This would be very appropriate because you might want to use jdbc to record those findings. Depending on the ferocity of the traffic, making a synchronous jdbc call during a search might be a bottleneck, though. Might be better to log to a file and parse in a separate process.
  • RngTng
    RngTng over 12 years
    What are the last two numbers (Here: '200 197') of the log entry? - Thx
  • Eric Pugh
    Eric Pugh over 12 years
    RngTng, the first number is the response code, HTTP 200, which means all okay. I think 197 is the time in milliseconds, but I'm not sure; in the standard NCSA common log format that field is the size of the response in bytes.
  • sunskin
    sunskin over 10 years
    @EricPugh Found the request.log but that file is empty. Please advise.
  • cheffe
    cheffe almost 7 years