logstash output to elasticsearch with document_id; what to do when I don't have a document_id?

Solution 1

You're close with the conditional idea, but a conditional can't go inside a plugin block. Wrap the plugin blocks in the conditional instead:

output {
  if [document_id] {
    elasticsearch_http {
      host => "127.0.0.1"
      document_id => "%{document_id}"
    } 
  } else {
    elasticsearch_http {
      host => "127.0.0.1"
    } 
  }
}

(But the suggestion in one of the other answers to use the uuid filter is good too.)

Solution 2

One way to solve this is to make sure a document_id is always available. You can achieve this by adding a uuid filter in the filter section that creates the document_id field if it is not present.

filter {
    # Only generate an ID when the event has no document_id field.
    if ![document_id] {
        uuid {
            target => "document_id"
        }
    }
}

Edited per Magnus Bäck's suggestion. Thanks!
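
Putting the two pieces together, a minimal sketch of the whole pipeline might look like the following. It reuses the elasticsearch_http output and host from the question, and it assumes that events arriving without a document_id do not need deduplication, since each of them receives a fresh random UUID.

```
filter {
  # True when the event carries no document_id field at all.
  if ![document_id] {
    uuid {
      target => "document_id"
    }
  }
}

output {
  # Every event now has a document_id, so no conditional is needed here.
  elasticsearch_http {
    host => "127.0.0.1"
    document_id => "%{document_id}"
  }
}
```

With this layout the output section stays a single block, at the cost of storing a generated ID on events that never had one.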

Author: tedder42

Updated on June 11, 2022

Comments

  • tedder42, almost 2 years ago

    I have some logstash input where I use the document_id to remove duplicates. However, most input doesn't have a document_id. The following plumbs the actual document_id through, but if the field doesn't exist, the literal string %{document_id} is used as the ID, which means most documents are seen as duplicates of one another. Here's what my output block looks like:

    output {
            elasticsearch_http {
                host => "127.0.0.1"
                document_id => "%{document_id}"
            }
    }
    

    I thought I might be able to use a conditional in the output. It fails, and the error is given below the code.

    output {
            elasticsearch_http {
                host => "127.0.0.1"
                if document_id {
                    document_id => "%{document_id}"
                } 
            }
    }
    
    Error: Expected one of #, => at line 101, column 8 (byte 3103) after output {
            elasticsearch_http {
        host => "127.0.0.1"
        if 
    

    I tried a few "if" statements and they all fail, which is why I assume the problem is having a conditional of any sort in that block. Here are the alternatives I tried:

    if document_id <> "" {
    if [document_id] <> "" {
    if [document_id] {
    if "hello" <> "" {
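
    For comparison, here is a hedged sketch of how such a test could be written in a place where conditionals are allowed, that is, around plugin blocks rather than inside them. Logstash conditionals write inequality as != (there is no <> operator) and reference fields with square brackets. The host value is copied from the output block above.

    ```
    output {
      # Field exists and is not the empty string.
      if [document_id] and [document_id] != "" {
        elasticsearch_http {
          host => "127.0.0.1"
          document_id => "%{document_id}"
        }
      } else {
        # No usable ID: let Elasticsearch assign one.
        elasticsearch_http {
          host => "127.0.0.1"
        }
      }
    }
    ```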