Logging Search Keywords in Solr / Lucene
Solution 1
It depends on what you want to log. Are you just looking to record the queries users submit, along with the results? If it's just "what are folks searching for", that data is already in the q parameter, which the servlet container logs. If you are using the default Jetty setup, look at ./logs/*request.log. You will see lines like:
0:0:0:0:0:0:0:1%0 - - [21/01/2010:15:08:29 +0000] "GET /solr/select/?q=*:*&qt=geo&lat=45&long=15&radius=10 HTTP/1.1" 200 197
In this case, you can parse out that the user was doing a q=*:* search. Use a tool like AWStats to parse your logs and do the analysis. It's at least a quick and easy way to get some information!
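If you would rather script it than set up a log analyzer, here is a minimal Python sketch of pulling the q parameter out of request-log lines like the one above. It assumes the NCSA-style layout shown in the sample line; the names extract_query and count_queries are made up for illustration, and the regular expression would need adjusting for a different log format.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Matches the quoted request part of an NCSA-style line: "GET <url> HTTP/1.1"
# (assumption: the log uses the layout shown in the sample line).
REQUEST_RE = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')


def extract_query(line):
    """Return the value of the q parameter in a request-log line, or None."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    # parse_qs also URL-decodes values, e.g. q=foo%20bar becomes "foo bar".
    params = parse_qs(urlsplit(match.group(1)).query)
    values = params.get("q")
    return values[0] if values else None


def count_queries(lines):
    """Tally how often each q value appears across the given log lines."""
    return Counter(q for q in map(extract_query, lines) if q is not None)


sample = ('0:0:0:0:0:0:0:1%0 - - [21/01/2010:15:08:29 +0000] '
          '"GET /solr/select/?q=*:*&qt=geo&lat=45&long=15&radius=10 HTTP/1.1" 200 197')
print(extract_query(sample))
```

To run it over a real file, pass the open file object to count_queries and call most_common() on the result to see the most frequent searches.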
Solution 2
Months later ... maybe someone is interested:
http://karussell.wordpress.com/2010/10/27/feeding-solr-with-its-own-logs/
(you'll need to adapt the log parser if you are not using the default solr output format)
Solution 3
You can look at something like logstash to parse your log data.
Solution 4
The SolrLogging wiki page says you can use JDK logging (in Solr 1.0 to 1.3) or SLF4J logging in Solr 1.4. As for writing your own Solr analyzer: it depends on your needs. In many cases, a custom analyzer helps with specific retrieval requirements.
Comments
-
Ryall almost 2 years
I'm new to Solr and am looking for a way to record searches (or keywords) to a log file or database so that I can then analyse them for data visualisation.
- Can Solr do this already?
- Is this data accessible via a Solr query?
Thanks.
Update 1
I'm starting to think I might need to write my own Solr analyzer?
-
memnoch_proxy over 14 years
This would be very quick to write a script for, if one is not inclined to write Java code (wrt @yuval's answer).
-
memnoch_proxy over 14 years
This would be very appropriate because you might want to use jdbc to record those findings. Depending on the ferocity of the traffic, making a synchronous jdbc call during a search might be a bottleneck, though. Might be better to log to a file and parse in a separate process.
-
RngTng over 12 years
What are the last two numbers (Here: '200 197') of the log entry? - Thx
-
Eric Pugh over 12 years
RngTng, the first number is the response code, HTTP 200, which means all okay, and 197 is the time (I think) in milliseconds. It might be the size of the results instead, not sure.
-
sunskin over 10 years
@EricPugh Found the request.log but that file is empty. Please advise.
-
cheffe almost 7 years
Here is a good write-up: blog.comperiosearch.com/blog/2015/09/21/solr-logstash-analysis