Lab 2.1: Strings in Elasticsearch

Objective:

In this lab, you will learn about the differences between the text and keyword field types.

  1. Write a match query on the blogs index that finds blogs whose authors.first_name is Steve. You should get 135 hits.

    Solution
    GET blogs/_search
    {
      "query": {
        "match": {
          "authors.first_name": "Steve"
        }
      }
    }
    
  2. Update the previous query to search for steve instead of Steve. How many hits do you expect?

    Solution

    You should get exactly the same number of hits (135).

    GET blogs/_search
    {
      "query": {
        "match": {
          "authors.first_name": "steve"
        }
      }
    }
    

  3. Why is this match query case-insensitive?

    Solution

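    The previous query ran on a text field, and text fields are analyzed. By default, text analysis lowercases the content of text fields. The match query analyzes the search term in the same way, so Steve and steve both become the token steve and match the same documents.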
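
    To see the analysis step directly, you can ask Elasticsearch which tokens it produces for a given value. The request below is a sketch that uses the _analyze API with the field parameter, so it applies whatever analyzer the blogs index has configured for authors.first_name (assumed here to be the default, as described above).

    ```console
    # Show the tokens produced when "Steve" is analyzed as authors.first_name
    GET blogs/_analyze
    {
      "field": "authors.first_name",
      "text": "Steve"
    }
    ```

    If the field uses the default analysis, the response should contain a single token, steve, in lowercase.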

  4. Update the query to use the authors.first_name.keyword field instead of the authors.first_name field.

    Solution
    GET blogs/_search
    {
      "query": {
        "match": {
          "authors.first_name.keyword": "steve"
        }
      }
    }
    
  5. Why do you get zero hits?

    Solution

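    The query now runs on a keyword field, and keyword fields are NOT analyzed, so the value is indexed exactly as written (Steve). A query on a keyword field matches only when the search term is identical to the indexed value, including case.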
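
    For comparison, the sketch below runs the same value through the built-in keyword analyzer. This is an approximation, on the assumption that authors.first_name.keyword has no normalizer configured: the keyword analyzer emits the whole input as one unchanged token, which mirrors how such a keyword field stores its value.

    ```console
    # The keyword analyzer returns the input as a single, unmodified token
    GET _analyze
    {
      "analyzer": "keyword",
      "text": "Steve"
    }
    ```

    The response should contain the single token Steve with its capital letter intact, which is why searching for steve finds nothing.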

  6. Using the keyword field, update the query to get the same blogs as the first query.

    Solution

    Capitalize the name so that it matches the indexed value exactly: Steve.

    GET blogs/_search
    {
      "query": {
        "match": {
          "authors.first_name.keyword": "Steve"
        }
      }
    }
    

  7. Next, open Discover to visualize the blogs dataset. Set the data view to blogs and the time filter to the last 13 years. You should see 4,719 hits.

  8. On the left-hand side, find the field authors.first_name and click "Add field as column".

  9. Notice that you can sort by publish_date, but the same option is not available for authors.first_name. This is because text fields are analyzed and, by default, cannot be used to sort results.

  10. Remove the authors.first_name field and add the keyword version of this field. To find it, first click the name of the field and look for its multi-field.
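
    The multi-field can also be inspected from Console. As a sketch, the get field mapping API below asks for every mapping whose name starts with authors.first_name; the wildcard is used so that any sub-fields are included as well.

    ```console
    # Show how authors.first_name and its sub-fields are mapped
    GET blogs/_mapping/field/authors.first_name*
    ```

    Based on the behavior seen in this lab, the response is expected to show authors.first_name mapped as text, with a keyword sub-field named keyword.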

  11. You can now sort by the keyword version of authors.first_name. Remove the sorting from publish_date (by clicking the small arrow next to the name of the field) and sort by the keyword field instead.
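
    The same sorting can be reproduced in Console. The request below is a minimal sketch: the size of 5 and the _source filter are arbitrary choices made only to keep the response short.

    ```console
    # Sort blogs alphabetically by author first name using the keyword multi-field
    GET blogs/_search
    {
      "size": 5,
      "_source": ["authors.first_name", "publish_date"],
      "sort": [
        { "authors.first_name.keyword": "asc" }
      ]
    }
    ```

    If you replace authors.first_name.keyword with authors.first_name in the sort clause, the request should be rejected with an error, because text fields do not support sorting by default.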

Summary:

In this lab, you wrote several queries in Console and used Discover to see the difference between text and keyword fields.