Migrate field type from text to keyword on Elasticsearch


Solution 1

OK, I finally found in the documentation that it is not possible to change the data type of an existing field:

Updating existing mappings

Other than where documented, existing type and field mappings cannot be updated. Changing the mapping would mean invalidating already indexed documents. Instead, you should create a new index with the correct mappings and reindex your data into that index.

So the only solution is to:

  • Create a new index with the correct data types
  • Reindex the data into it with the Reindex API
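The restriction quoted above only bites when a field that already exists would get a different type. As a rough sketch, the helper below compares two `properties` dictionaries and lists the fields that would force a new index. The function name `fields_requiring_reindex` and the `id` and `label` fields are hypothetical, and only top-level fields are compared.

```python
def fields_requiring_reindex(current_properties, desired_properties):
    """Return the names of existing fields whose type would change."""
    conflicts = []
    for field, desired in desired_properties.items():
        current = current_properties.get(field)
        if current is None:
            # A brand-new field can be added to an existing mapping.
            continue
        # A field mapped without an explicit "type" is an object field.
        current_type = current.get("type", "object")
        desired_type = desired.get("type", "object")
        if current_type != desired_type:
            conflicts.append(field)
    return sorted(conflicts)


# Hypothetical mappings: only nom_categorie comes from the question.
current = {"nom_categorie": {"type": "text"}, "id": {"type": "long"}}
desired = {"nom_categorie": {"type": "keyword"}, "label": {"type": "keyword"}}
print(fields_requiring_reindex(current, desired))
```

Here only `nom_categorie` is reported, since `label` is new and `id` is not touched.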

Solution 2

Changing the data type (mapping) of a field in an existing index is not supported. Instead, create a new index with the correct mappings and reindex your data into it.

Elastic has a blog post about doing this, which recommends the best-practice approach of aliasing your index.


If you DO NOT 🔴 have an aliased index

These are the required steps. This first migration involves a short downtime; once the alias is in place, the next one can be done without any.

  1. Get current mapping of StatCateg
  2. Create a new index StatCateg_v1 with the correct mappings
  3. Reindex from StatCateg to StatCateg_v1
  4. Delete the old index StatCateg
  5. Create an alias StatCateg that points to StatCateg_v1 (so that the next migration can be done without downtime)

Example snippet (in Python):

import requests

# Names follow the question; note that Elasticsearch only accepts lowercase index names
current_index_name = 'StatCateg'
new_index_name = 'StatCateg_v1'
base_url = 'https://...'
mapping_changes = {
    "nom_categorie": {"type": "keyword"}
}

# ------------------------------------------------
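# Get current mapping and apply the type change
# (assumes typeless mappings as in Elasticsearch 7+, where 'properties' sits directly under 'mappings')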
r = requests.get('{base_url}/{index_name}'.format(base_url=base_url, index_name=current_index_name))
r.raise_for_status()
content = r.json()
mappings = content[current_index_name]['mappings']
mappings['properties'].update(mapping_changes)

# ------------------------------------------------
# Create a new index with the correct mappings
r = requests.put('{base_url}/{index_name}'.format(base_url=base_url, index_name=new_index_name), json={
    'mappings': mappings
})
r.raise_for_status()

# ------------------------------------------------
# Reindex
r = requests.post('{base_url}/_reindex'.format(base_url=base_url), json={
    "source": {
        "index": current_index_name
    },
    "dest": {
        "index": new_index_name
    }
})
r.raise_for_status()

# ------------------------------------------------
# Delete the old index
r = requests.delete('{base_url}/{index_name}'.format(base_url=base_url, index_name=current_index_name))
r.raise_for_status()

# ------------------------------------------------
# Create an alias (so that the next migration can be done without downtime)
r = requests.post('{base_url}/_aliases'.format(base_url=base_url), json={
    "actions": [
        {"add": {
            "alias": current_index_name,
            "index": new_index_name
        }}
    ]
})
r.raise_for_status()
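Since the snippet above deletes the old index right after reindexing, it may be worth inspecting the `_reindex` response body first. The following is a minimal sketch, not part of the original answer: the helper name `reindex_looks_complete` is hypothetical, and it assumes the source document count was fetched beforehand (for example with `GET /{index}/_count`).

```python
def reindex_looks_complete(reindex_body, source_doc_count):
    """Check a _reindex response body before deleting the source index."""
    if reindex_body.get("timed_out"):
        return False
    if reindex_body.get("failures"):
        return False
    if reindex_body.get("version_conflicts", 0):
        return False
    # Every source document should have been written to the destination.
    copied = reindex_body.get("created", 0) + reindex_body.get("updated", 0)
    return copied == source_doc_count


# Hypothetical response body with the fields the Reindex API reports.
sample = {
    "timed_out": False,
    "total": 3,
    "created": 3,
    "updated": 0,
    "version_conflicts": 0,
    "failures": [],
}
print(reindex_looks_complete(sample, 3))
```

With the live script, one would call it as `reindex_looks_complete(r.json(), count)` and skip the delete step when it returns `False`.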

If you DO ✅ have an aliased index

These are the required steps, with no downtime:

  1. Get current mapping of StatCateg_v1
  2. Create a new index StatCateg_v2 with the correct mappings
  3. Reindex from StatCateg_v1 to StatCateg_v2
  4. Swap the alias StatCateg so that it points to StatCateg_v2 instead of StatCateg_v1
  5. Delete the old index StatCateg_v1

Example snippet (in Python):

import requests

index_name = 'StatCateg'
current_index_name = 'StatCateg_v1'
next_index_name = 'StatCateg_v2'
base_url = 'https://...'
mapping_changes = {
    "nom_categorie": {"type": "keyword"}
}

# ------------------------------------------------
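# Get current mapping and apply the type change
# (assumes typeless mappings as in Elasticsearch 7+, where 'properties' sits directly under 'mappings')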
r = requests.get('{base_url}/{index_name}'.format(base_url=base_url, index_name=current_index_name))
r.raise_for_status()
content = r.json()
mappings = content[current_index_name]['mappings']
mappings['properties'].update(mapping_changes)

# ------------------------------------------------
# Create a new index with the correct mappings
r = requests.put('{base_url}/{index_name}'.format(base_url=base_url, index_name=next_index_name), json={
    'mappings': mappings
})
r.raise_for_status()

# ------------------------------------------------
# Reindex
r = requests.post('{base_url}/_reindex'.format(base_url=base_url), json={
    "source": {
        "index": current_index_name
    },
    "dest": {
        "index": next_index_name
    }
})
r.raise_for_status()

# ------------------------------------------------
# Move the alias from the old index to the new one (both actions are applied atomically)
r = requests.post('{base_url}/_aliases'.format(base_url=base_url), json={
    "actions": [
        {"remove": {
            "alias": index_name,
            "index": current_index_name
        }},
        {"add": {
            "alias": index_name,
            "index": next_index_name
        }}
    ]
})
r.raise_for_status()

# ------------------------------------------------
# Delete the old index
r = requests.delete('{base_url}/{index_name}'.format(base_url=base_url, index_name=current_index_name))
r.raise_for_status()
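The `_v1`, `_v2` suffixes used in this answer are only a naming convention. If the migration is scripted repeatedly, the next name could be derived instead of hard-coded. A small sketch, assuming that convention; the helper name `next_versioned_name` is hypothetical:

```python
import re


def next_versioned_name(index_name):
    """Derive the next index name under the '<base>_v<N>' convention."""
    match = re.fullmatch(r"(.+)_v(\d+)", index_name)
    if match is None:
        # No version suffix yet: start at _v1.
        return f"{index_name}_v1"
    base, version = match.group(1), int(match.group(2))
    return f"{base}_v{version + 1}"


print(next_versioned_name("StatCateg_v1"))
print(next_versioned_name("StatCateg"))
```

This prints `StatCateg_v2` and then `StatCateg_v1`, matching the names used in the two scenarios above.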

Author: C.Rouillon

Updated on July 05, 2022

Comments

  • C.Rouillon (almost 2 years)

    When I try to change the type of a field from text to keyword with this command:

    PUT indexStat/_mapping/StatCateg
    {
      "StatCateg":{
        "properties": {
          "nom_categorie": {
            "type":"keyword","index": true
          }
        }
      }
    }
    

    I get this error message:

    {
      "error": {
        "root_cause": [
          {
            "type": "illegal_argument_exception",
            "reason": "mapper [nom_categorie] of different type, current_type [text], merged_type [keyword]"
          }
        ],
        "type": "illegal_argument_exception",
        "reason": "mapper [nom_categorie] of different type, current_type [text], merged_type [keyword]"
      },
      "status": 400
    }