Lab 4.2: Enriching data
Objective:
In this lab, you will add a new field to the blogs_fixed2 index. This field is populated by using the enrich processor to perform a lookup of data in a separate index.
-
Run a terms aggregation on the category field of the blogs_fixed2 index. Notice that you get 5 values that are not human-friendly. They are unique IDs designed to map to data in another data source.
Solution
GET blogs_fixed2/_search
{
  "size": 0,
  "aggs": {
    "NAME": {
      "terms": {
        "field": "category",
        "size": 10
      }
    }
  }
}
-
Let's create a new index that maps each ID to its actual category name. Run the following _bulk command, which creates a new index named categories:
POST categories/_bulk
{"create":{}}
{"uid": "blt26ff0a1ade01f60d","title":"User Stories"}
{"create":{}}
{"uid": "bltfaae4466058cc7d6","title": "Releases"}
{"create":{}}
{"uid": "bltc253e0851420b088","title": "Culture"}
{"create":{}}
{"uid": "blt0c9f31df4f2a7a2b","title": "News"}
{"create":{}}
{"uid": "blt1d90b8e0edce3ea9","title": "Engineering"}
-
EXAM PREP: Create an enrich policy that satisfies the following requirements:
- the name of the policy is categories_policy
- the match field is the uid field of the categories index
- the enrich field is the title field
Solution
PUT _enrich/policy/categories_policy
{
  "match": {
    "indices": "categories",
    "match_field": "uid",
    "enrich_fields": ["title"]
  }
}
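Conceptually, a match policy describes a lookup table: once the policy is executed, each value of the match field points to a record that holds the match field plus the enrich fields. The following is a minimal Python sketch of that idea, not how Elasticsearch implements it. The function build_lookup and the in-memory list standing in for the categories index are assumptions made for illustration.

```python
# Hypothetical in-memory stand-in for the documents in the categories index.
categories = [
    {"uid": "blt26ff0a1ade01f60d", "title": "User Stories"},
    {"uid": "bltfaae4466058cc7d6", "title": "Releases"},
    {"uid": "bltc253e0851420b088", "title": "Culture"},
    {"uid": "blt0c9f31df4f2a7a2b", "title": "News"},
    {"uid": "blt1d90b8e0edce3ea9", "title": "Engineering"},
]


def build_lookup(docs, match_field, enrich_fields):
    """Key each source document by its match field, keeping only the policy's fields."""
    lookup = {}
    for doc in docs:
        if match_field not in doc:
            continue  # a document without the match field cannot be looked up
        kept = {match_field: doc[match_field]}
        for field in enrich_fields:
            if field in doc:
                kept[field] = doc[field]
        lookup[doc[match_field]] = kept
    return lookup


# Mirrors "match_field": "uid" and "enrich_fields": ["title"] from the policy.
lookup = build_lookup(categories, "uid", ["title"])
print(lookup["bltc253e0851420b088"])
# prints {'uid': 'bltc253e0851420b088', 'title': 'Culture'}
```

Each record keeps both uid and title, which is why the category_title object mapped later in this lab has two subfields.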
-
Execute the policy to create an enrich index.
Solution
POST _enrich/policy/categories_policy/_execute
-
EXAM PREP: Create a new ingest pipeline that satisfies the following requirements:
- the name of the pipeline is categories_pipeline
- it uses an enrich processor with the categories_policy policy, matching on the existing category field and writing the enrichment data to a new field named category_title
- it removes the original category field
- both the enrich processor and the remove processor ignore documents that don't have a category field
Solution
Use the Ingest Node Pipelines UI in Kibana to define the pipeline. If you want to skip that step, you can copy and paste the following PUT command into Console:
PUT _ingest/pipeline/categories_pipeline
{
  "processors": [
    {
      "enrich": {
        "field": "category",
        "policy_name": "categories_policy",
        "target_field": "category_title",
        "ignore_missing": true
      }
    },
    {
      "remove": {
        "field": "category",
        "ignore_missing": true
      }
    }
  ]
}
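To make the effect of the two processors concrete, here is a rough Python sketch of what happens to a single document as it passes through the pipeline. It is an illustration only: the LOOKUP table (two entries standing in for the executed policy), the helper functions, and the sample documents are all hypothetical.

```python
# Hypothetical lookup table standing in for the executed categories_policy.
LOOKUP = {
    "blt1d90b8e0edce3ea9": {"uid": "blt1d90b8e0edce3ea9", "title": "Engineering"},
    "blt0c9f31df4f2a7a2b": {"uid": "blt0c9f31df4f2a7a2b", "title": "News"},
}


def enrich(doc, field, target_field, lookup):
    """Mimic the enrich processor with ignore_missing: true."""
    if field not in doc:
        return doc  # ignore_missing: leave the document untouched
    match = lookup.get(doc[field])
    if match is not None:
        doc[target_field] = dict(match)  # copy the matching record into the target
    return doc


def remove(doc, field):
    """Mimic the remove processor with ignore_missing: true."""
    doc.pop(field, None)  # no error when the field is absent
    return doc


def run_pipeline(doc):
    doc = dict(doc)  # work on a copy so the caller's document is unchanged
    enrich(doc, "category", "category_title", LOOKUP)
    remove(doc, "category")
    return doc


print(run_pipeline({"title": "A post", "category": "blt1d90b8e0edce3ea9"}))
# prints {'title': 'A post', 'category_title': {'uid': 'blt1d90b8e0edce3ea9', 'title': 'Engineering'}}
print(run_pipeline({"title": "No category"}))
# prints {'title': 'No category'}
```

The second call shows why both processors need ignore_missing: a document without a category field passes through unchanged instead of failing.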
-
Add an object field named category_title, with the subfields title and uid (both of type keyword), to the blogs_fixed2 index mapping.
Solution
PUT blogs_fixed2/_mapping
{
  "properties": {
    "category_title": {
      "properties": {
        "title": {
          "type": "keyword"
        },
        "uid": {
          "type": "keyword"
        }
      }
    }
  }
}
-
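Using _update_by_query, run all the documents in blogs_fixed2 through your categories_pipeline. Because wait_for_completion=false is set, the request returns a task ID immediately and the update continues in the background, so give it time to finish before verifying the result.
POST blogs_fixed2/_update_by_query?pipeline=categories_pipeline&wait_for_completion=false
-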
Run a terms aggregation on the category_title.title field and verify that the index was enriched.
GET blogs_fixed2/_search
{
  "size": 0,
  "aggs": {
    "blogs_by_category": {
      "terms": {
        "field": "category_title.title",
        "size": 10
      }
    }
  }
}
Summary:
In this lab, you learned how to enrich an index with data from another index using an enrich policy and the enrich processor.