How to store data in Elasticsearch _source but not index it?
By default, the _source of the document is stored regardless of which fields you choose to index. The _source is used to return the document in the search results, whereas the indexed fields are used for searching.
You can't set index: no on an object to prevent all of its fields from being indexed, but you can get the same effect with Dynamic Templates, using the path_match property to apply the index: no setting to every field within an object. Here is a simple example.
Create an index with a mapping that includes dynamic templates for the author object and the nested categories object:
POST /shop
{
  "mappings": {
    "book": {
      "dynamic_templates": [
        {
          "author_object_template": {
            "path_match": "author.*",
            "mapping": {
              "index": "no"
            }
          }
        },
        {
          "categories_object_template": {
            "path_match": "categories.*",
            "mapping": {
              "index": "no"
            }
          }
        }
      ],
      "properties": {
        "categories": {
          "type": "nested"
        }
      }
    }
  }
}
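If several objects need the same treatment, the templates can be generated rather than written by hand. The following Python sketch assembles the same request body as the mapping above. The helper name build_unindexed_mapping and its parameters are hypothetical, and the code only builds JSON; it does not contact a cluster.

```python
import json


def build_unindexed_mapping(doc_type, object_paths, nested_paths=()):
    """Build a create-index body whose dynamic templates set "index": "no"
    for every field under each of the given object paths.

    Hypothetical helper: it only assembles a dict and never talks to Elasticsearch.
    """
    templates = [
        {
            f"{path}_object_template": {
                "path_match": f"{path}.*",
                "mapping": {"index": "no"},
            }
        }
        for path in object_paths
    ]
    # Nested objects still need an explicit "type": "nested" property.
    properties = {path: {"type": "nested"} for path in nested_paths}
    return {
        "mappings": {
            doc_type: {
                "dynamic_templates": templates,
                "properties": properties,
            }
        }
    }


body = build_unindexed_mapping(
    "book", ["author", "categories"], nested_paths=["categories"]
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the POST /shop request shown above.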
Index a document:
POST /shop/book/1
{
  "title": "book one",
  "author": {
    "first_name": "jon",
    "last_name": "doe"
  },
  "categories": [
    {
      "cat_id": 1,
      "cat_name": "category one"
    },
    {
      "cat_id": 2,
      "cat_name": "category two"
    }
  ]
}
If you search on the title field with the search term book, the document will be returned. If you search on author.first_name or author.last_name, there won't be a match because these fields were not indexed:
POST /shop/book/_search
{
  "query": {
    "match": {
      "author.first_name": "jon"
    }
  }
}
The same would be the case for a nested query on the category fields:
POST /shop/book/_search
{
  "query": {
    "nested": {
      "path": "categories",
      "query": {
        "match": {
          "categories.cat_name": "category"
        }
      }
    }
  }
}
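To see which fields of the sample document fall under the two templates, the matching can be approximated offline. This is only a rough sketch: it assumes path_match behaves like a simple * wildcard applied to the full dotted field path, and it uses Python's fnmatch as a stand-in rather than Elasticsearch's own matcher. The names field_paths and is_indexed are hypothetical.

```python
from fnmatch import fnmatchcase

# path_match values taken from the mapping above.
PATTERNS = ["author.*", "categories.*"]

DOC = {
    "title": "book one",
    "author": {"first_name": "jon", "last_name": "doe"},
    "categories": [
        {"cat_id": 1, "cat_name": "category one"},
        {"cat_id": 2, "cat_name": "category two"},
    ],
}


def field_paths(value, prefix=""):
    """Yield the dotted path of every leaf field; arrays add no path segment."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from field_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for item in value:
            yield from field_paths(item, prefix)
    else:
        yield prefix


def is_indexed(path):
    """Assumption: a field stays searchable only if no template pattern matches it."""
    return not any(fnmatchcase(path, pattern) for pattern in PATTERNS)


paths = sorted(set(field_paths(DOC)))
for path in paths:
    print(path, "indexed" if is_indexed(path) else "not indexed")
```

Under these assumptions only title is reported as indexed, which is consistent with the two searches above finding no match.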
You can also use the Luke tool to inspect the Lucene index and see which fields have been indexed.
pinkeen
Updated on October 09, 2022
Comments
-
pinkeen about 1 year
I am searching by only a couple of fields, but I want to store the whole document in ES in order to avoid additional DB (MySQL) queries.
I tried adding index: no, store: no to whole objects/properties in the mapping, but I'm still not sure whether the fields are being indexed and adding unnecessary overhead.
Let's say I've got books and each has an author. I want to search only by book title, but I want to be able to retrieve the whole document.
Is this okay:
mappings:
  properties:
    title:
      type: string
      index: analyzed
    author:
      type: object
      index: no
      store: no
      properties:
        first_name:
          type: string
        last_name:
          type: string
Or should I rather do:
mappings:
  properties:
    title:
      type: string
      index: analyzed
    author:
      type: object
      properties:
        first_name:
          index: no
          store: no
          type: string
        last_name:
          index: no
          store: no
          type: string
Or maybe I am doing it completely wrong? And what about nested properties that should not be indexed?
-
pinkeen over 8 years
Does "index": "no" imply "store": "no"? I've read that store means storing the original property's _source in Lucene, but I'm not sure how it is related to index. And just to make sure: I don't have to provide mappings for the non-indexed fields? ES won't throw errors if I put a document with property X that is an int and then a document with the same property but as a string?
-
Dan Tuffery over 8 years
No, the setting for index does not determine the setting for store. The default for store is no, which is fine in your use case because the _source is enabled. If you disabled the _source field and selected the fields you want to store, the stored fields would only be returned in the search results when there is a match. You have to provide a mapping for non-indexed fields in order to tell Elasticsearch not to index them; otherwise Elasticsearch will use the default analyzer (Standard Analyzer) to index the field.
-
Dan Tuffery over 8 years
However, in the above example the dynamic templates provide the mappings for the non-indexed fields. If you don't have a mapping for a field, an error won't be returned if you change the type of a property.