Elasticsearch alphabetical sorting based on first character


Solution 1

I had a similar issue, and the other answer didn't quite solve it for me. I referred to the documentation instead and was able to solve it with a mapping like this:

"name": { 
    "type":     "string",
    "analyzer": "english",
    "fields": {
        "raw": { 
            "type":  "string",
            "index": "not_analyzed"
        }
    }
}

and then querying and sorting like this:

{
    "query": {
        "match": {
            "name": "dhoni"
        }
    },
    "sort": {
        "name.raw": {
            "order": "asc"
        }
    }
}
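The mapping above uses the older `string` type with `not_analyzed`. As a rough sketch, assuming Elasticsearch 5.x or later (where `string` was split into `text` and `keyword`), the same multi-field idea could be expressed as below. The request bodies are built as plain Python dictionaries and only printed; no client library or running cluster is assumed.

```python
import json

# Sketch only: equivalent of the mapping above for Elasticsearch 5.x+,
# where "string" + "not_analyzed" became the "keyword" type.
mapping = {
    "properties": {
        "name": {
            "type": "text",          # analyzed, used for full-text search
            "analyzer": "english",
            "fields": {
                "raw": {"type": "keyword"}  # stored as one untokenized value, used for sorting
            },
        }
    }
}

# Search on the analyzed field, sort on the untokenized sub-field.
search = {
    "query": {"match": {"name": "dhoni"}},
    "sort": [{"name.raw": {"order": "asc"}}],
}

print(json.dumps(mapping, indent=2))
print(json.dumps(search, indent=2))
```

The query still matches against `name`, while `name.raw` supplies the whole original value as the sort key.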

Solution 2

I think the problem is that your string is analyzed when it is written to Elasticsearch. By default it uses the standard analyzer, which is built from the Standard Tokenizer with the Standard Token Filter, Lower Case Token Filter, and Stop Token Filter.

What does this mean? Suppose you are using a field "name" with the default mapping (standard analyzer).

When you index, the values are split into tokens:

team dhoni --> team, dhoni

dhoni1 --> dhoni1

dibeesh 200 --> dibeesh, 200

and so on.

So when you sort in ascending order, "dibeesh 200" comes first, because it is sorted by the token 200, not by dibeesh.
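To make the effect concrete, here is a small Python simulation. It is only an approximation: the hypothetical `approx_standard_tokens` helper lowercases and splits on non-alphanumeric characters, which is much simpler than the real standard analyzer, and it assumes that an ascending sort on a multi-token field uses the smallest token of each document.

```python
import re

def approx_standard_tokens(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())

def analyzed_sort_key(text):
    """Assumed ascending-sort behavior: the smallest token represents the document."""
    return min(approx_standard_tokens(text))

names = ["team dhoni", "dhoni1", "dibeesh 200", "bb vineesh", "devan"]

# Sorting on the analyzed field: each name is ranked by its smallest token.
analyzed_order = sorted(names, key=analyzed_sort_key)

# Sorting on an untokenized copy: each name is ranked by its full value.
raw_order = sorted(names)

print(analyzed_order)
print(raw_order)
```

Under these assumptions, `analyzed_order` reproduces the order reported in the question ("dibeesh 200" first, because "200" sorts before any letter), while `raw_order` gives the alphabetical order that was wanted.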

So you can either leave the string not analyzed (in which case upper case and lower case sort differently), use the simple analyzer (so that sorting considers letters only and ignores case), or use a multi-field to keep both an analyzed and a not_analyzed version.

Here is a way to do that,

POST x2/x3/_mapping
{
    "x3":{
        "properties": {
            "name" :{
                "type" :"string",
                "fields" :{
                    "raw" :{
                        "type": "string",
                        "index_analyzer": "simple"
                    }
                }
            }
        }
    }
}

And here is the query,

POST x2/x3/_search
{
    "sort": [
       {
          "name.raw": {
             "order": "asc"
          }
       }
    ]
} 

This works as expected. Hope this helps!!

Solution 3

The keyword analyzer helped me:

first_name: {
     type: "text",
     analyzer: "keyword"
}


Solution 4

Uppercase and lowercase letters have different ASCII values, so names that start with an uppercase letter sort differently from names that start with a lowercase letter. One solution (a trick) is to store a lowercase copy of the data you want to sort in a separate field and sort on that field.

This is not a perfect approach, but it helps when sorting data for things like drop-down menus.
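As an illustration of this trick, the sketch below adds a lowercase copy of the name before the document would be indexed. The field name `first_name_sort` and the helper `with_sort_field` are made up for this example, and Python's own `sorted` stands in for the sort Elasticsearch would perform on an untokenized field.

```python
def with_sort_field(doc, source="first_name", target="first_name_sort"):
    """Return a copy of doc with a lowercase duplicate of the source field."""
    enriched = dict(doc)
    enriched[target] = doc[source].lower()
    return enriched

docs = [{"first_name": n} for n in ["devan", "Team Dhoni", "bb vineesh", "Dhoni1"]]

# Sorting on the original values: uppercase letters come before lowercase ones.
by_raw = sorted(d["first_name"] for d in docs)

# Sorting on the lowercase copy: case no longer affects the order.
enriched = [with_sort_field(d) for d in docs]
by_lower = [d["first_name"] for d in sorted(enriched, key=lambda d: d["first_name_sort"])]

print(by_raw)
print(by_lower)
```

The original `first_name` value is left untouched for display; only the extra field is used as the sort key.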

Author: Dibish, Software Engineer, AMT

Updated on June 05, 2022

Comments

  • Dibish almost 2 years

    I have a collection of first names.

    team dhoni
    dhoni1
    dibeesh 200
    bb vineesh
    devan
    

    I want to sort them in ascending alphabetical order (A - Z), like this:

    bb vineesh
    devan
    dhoni1
    dibeesh 200
    team dhoni
    

    Mapping

     "first_name": {
          "type": "string",
          "store": "true"
    },
    

    I have tried

    {
      "sort": [
        {
          "first_name": {
            "order": "asc"
    
          }
        }
      ], 
     "query": {
        "match_all": {
        }
      }
    }
    

    When I run this query, I get the names in the following order:

    dibeesh 200
    bb vineesh
    devan
    team dhoni
    dhoni1
    

    Elasticsearch seems to give first preference to first names that contain a number.

    How can I prevent this?