Elasticsearch - Aggregation returns single terms in key, not the complete field value. How can I get the full field returned?

16,955

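A terms aggregation buckets on the terms stored in the index, and an analyzed string field is indexed as individual words. You need untokenized copies of the values in the index, so in your mapping use multi-fields to add a not_analyzed sub-field to each string field: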

{
    "test": {
        "mappings": {
            "book": {
                "properties": {                
                    "author": {
                        "type": "string",
                        "fields": {
                            "untouched": {
                                "type": "string",
                                "index": "not_analyzed"
                            }
                        }
                    },
                    "title": {
                        "type": "string",
                        "fields": {
                            "untouched": {
                                "type": "string",
                                "index": "not_analyzed"
                            }
                        }
                    },
                    "docType": {
                        "type": "string",
                        "fields": {
                            "untouched": {
                                "type": "string",
                                "index": "not_analyzed"
                            }
                        }
                    }
                }
            }
        }
    }
}
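To see why the sub-field matters, here is a minimal sketch in plain Python rather than Elasticsearch itself. It assumes a crude stand-in for analysis (lowercase, split on whitespace), and the sample titles and helper names are made up for illustration. It counts documents per term the way a terms aggregation reports doc_count, once with tokenized values and once with the untouched values:

```python
from collections import Counter

# Hypothetical sample titles, not taken from a real index.
titles = [
    "Test document for Stack Overflow",
    "Test document for Stack Overflow",
    "This is a pptx",
]

def analyzed_terms(value):
    # Rough stand-in for an analyzed string field: lowercase, split on whitespace.
    return value.lower().split()

def untouched_terms(value):
    # Stand-in for a not_analyzed field: the whole value is a single term.
    return [value]

def terms_buckets(docs, tokenize):
    # Count each distinct term at most once per document, like doc_count.
    counts = Counter()
    for doc in docs:
        counts.update(set(tokenize(doc)))
    return counts

analyzed = terms_buckets(titles, analyzed_terms)
untouched = terms_buckets(titles, untouched_terms)

print(analyzed.most_common(3))   # single-word keys
print(untouched.most_common(3))  # full-title keys
```

With the tokenized values the bucket keys are single words; with the untouched values each key is a complete title, which is what the not_analyzed sub-field gives you.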

In your aggregation query, reference the untokenized sub-fields:

"aggs" : {
    "author" : {
         "terms" : { 
            "field" : "author.untouched", 
            "size": 20,
            "order" : { "_term" : "asc" }
        }
     },
    "title" : {
        "terms" : { 
          "field" : "title.untouched", 
          "size": 20
        }
    },
    "contentType" : {
        "terms" : { 
           "field" : "docType.untouched", 
           "size": 20
        }
    }
}
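If you build the request in code, the same aggregations can be assembled as a plain dictionary and serialized to JSON. This is only a sketch: the helper names terms_agg and build_search_body are hypothetical, and the top-level "size": 0 is an assumption added here so that only aggregation results come back, not search hits.

```python
import json

def terms_agg(field, size=20, order=None):
    # Build one terms aggregation clause on the given field.
    terms = {"field": field, "size": size}
    if order is not None:
        terms["order"] = order
    return {"terms": terms}

def build_search_body():
    # "size": 0 is an assumption: skip the hits, return aggregations only.
    return {
        "size": 0,
        "aggs": {
            "author": terms_agg("author.untouched", order={"_term": "asc"}),
            "title": terms_agg("title.untouched"),
            "contentType": terms_agg("docType.untouched"),
        },
    }

body = build_search_body()
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request against the index, with whatever HTTP client you already use.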
Author: dev123

Updated on June 13, 2022

Comments

  • dev123 (about 2 years ago)

    In my Elasticsearch implementation, I have a few simple terms aggregations on a few fields, as shown below:

    "aggs" : {
        "author" : {
            "terms" : {
                "field" : "author",
                "size": 20,
                "order" : { "_term" : "asc" }
            }
        },
        "title" : {
            "terms" : {
                "field" : "title",
                "size": 20
            }
        },
        "contentType" : {
            "terms" : {
                "field" : "docType",
                "size": 20
            }
        }
    }
    

    The aggregations work and return results, but for the title field (or any other multi-word field) each bucket key is a single word. I need the full title as the bucket key rather than just one word, which doesn't make much sense on its own. How can I get that?

    Current results (just a snippet):

    "title": {
         "buckets": [
            {
               "key": "test",
               "doc_count": 1716
            },
            {
               "key": "pptx",
               "doc_count": 1247
            },
            {
               "key": "and",
               "doc_count": 661
            },
            {
               "key": "for",
               "doc_count": 489
            },
            {
               "key": "mobile",
               "doc_count": 487
            },
            {
               "key": "docx",
               "doc_count": 486
            },
            {
               "key": "pdf",
               "doc_count": 450
            },
            {
               "key": "2012",
               "doc_count": 397
            } ] }
    

    Expected results:

    "title": {
             "buckets": [
                {
                   "key": "test document for stack overflow ",
                   "doc_count": 1716
                },
                {
                   "key": "this is a pptx",
                   "doc_count": 1247
                },
                {
                   "key": "its another document and so on",
                   "doc_count": 661
                },
                {
                   "key": "for",
                   "doc_count": 489
                },
                {
                   "key": "mobile",
                   "doc_count": 487
                },
                {
                   "key": "docx",
                   "doc_count": 486
                },
                {
                   "key": "pdf",
                   "doc_count": 450
                },
                {
                   "key": "2012",
                   "doc_count": 397
                } ] }
    

    I went through a lot of documentation. It explains different ways to aggregate results, but I couldn't find how to get the full text of a field as the key in the result. Please advise how I can achieve this.