logstash output to elasticsearch with document_id; what to do when I don't have a document_id?
Solution 1
You're close with the conditional idea, but a conditional can't be placed inside a plugin block. Wrap the plugin blocks in the conditional instead:
output {
  if [document_id] {
    elasticsearch_http {
      host => "127.0.0.1"
      document_id => "%{document_id}"
    }
  } else {
    elasticsearch_http {
      host => "127.0.0.1"
    }
  }
}
(But the suggestion in one of the other answers to use the uuid filter is good too.)
Solution 2
One way to solve this is to make sure a document_id is always available. You can achieve this by adding a uuid filter in the filter section that creates the document_id field if it is not present.
filter {
  # Generate an ID only when the event has no document_id field.
  if ![document_id] {
    uuid {
      target => "document_id"
    }
  }
}
Edited per Magnus Bäck's suggestion. Thanks!
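For reference, here is a minimal sketch of how the filter and the output might fit together once every event has a document_id. It reuses only the plugins and options that appear in this thread (uuid with target, elasticsearch_http with host and document_id). It assumes that ![document_id] evaluates to true when the field is missing, so treat it as an illustration rather than a tested configuration.

```
filter {
  # Assumption: ![document_id] is true when the event has no document_id field.
  if ![document_id] {
    uuid {
      target => "document_id"
    }
  }
}

output {
  # Every event now carries a document_id, so a single plugin block is enough.
  elasticsearch_http {
    host => "127.0.0.1"
    document_id => "%{document_id}"
  }
}
```

With this layout, events that arrive with their own document_id are still deduplicated on it, while events without one each get a fresh UUID and are therefore never treated as duplicates of each other.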
Comments
tedder42 almost 2 years
I have some logstash input where I use the document_id to remove duplicates. However, most input doesn't have a document_id. The following plumbs the actual document_id through, but if it doesn't exist, it gets accepted as literally %{document_id}, which means most documents are seen as a duplicate of each other. Here's what my output block looks like:
output {
  elasticsearch_http {
    host => "127.0.0.1"
    document_id => "%{document_id}"
  }
}
I thought I might be able to use a conditional in the output. It fails, and the error is given below the code.
output {
  elasticsearch_http {
    host => "127.0.0.1"
    if document_id {
      document_id => "%{document_id}"
    }
  }
}
Error: Expected one of #, => at line 101, column 8 (byte 3103) after output { elasticsearch_http { host => "127.0.0.1" if
I tried a few "if" statements and they all fail, which is why I assume the problem is having a conditional of any sort in that block. Here are the alternatives I tried:
if document_id <> "" {
if [document_id] <> "" {
if [document_id] {
if "hello" <> "" {