Lab 3.3: Developing search applications

Objective:

In this lab, you will see how you can use search templates to decouple client applications from the queries they need to execute frequently. You will also run an asynchronous search.

  1. Let's start with the query for which we want to create a template. Write a search on the blogs_fixed2 index that finds the blogs for a one-week period starting April 1, 2021. (Hint: In date math, append || to the anchor date to separate it from the math expression that follows, for example +1w.)

    Solution
    GET blogs_fixed2/_search
    {
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "publish_date": {
                  "gte": "2021-04-01",
                  "lt": "2021-04-01||+1w"
                }
              }
            }
          ]
        }
      }
    }
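
To see which dates the range in Step 1 covers, here is a minimal Python sketch of the same arithmetic. It is only an illustration: Elasticsearch resolves "2021-04-01||+1w" on the server, and the function name one_week_range is hypothetical. Time zones and date rounding are not modeled.

```python
from datetime import date, timedelta
from typing import Tuple


def one_week_range(start: str) -> Tuple[str, str]:
    """Return (gte, lt) bounds for a one-week window starting at `start` (YYYY-MM-DD)."""
    start_day = date.fromisoformat(start)
    # Adding one week mirrors what the "||+1w" date math expression resolves to.
    end_day = start_day + timedelta(weeks=1)
    return start_day.isoformat(), end_day.isoformat()


gte, lt = one_week_range("2021-04-01")
print(gte, lt)  # 2021-04-01 2021-04-08
```

Because the query uses gte and lt, the window is half-open: April 1 is included and April 8 is excluded.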
    
  2. EXAM PREP: Create a search template called weekly_blogs that satisfies the following requirement:

    • the starting date for the one-week period is a parameter named start_date

    Solution
    PUT _scripts/weekly_blogs
    {
      "script": {
        "lang": "mustache",
        "source": {
          "query": {
            "bool": {
              "filter": [
                {
                  "range": {
                    "publish_date": {
                      "gte": "{{start_date}}",
                      "lt": "{{start_date}}||+1w"
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
    
  3. Verify that your search template works by writing a query that returns the blogs for the week of April 1, 2021. You should get the same response as the query from Step 1.

    Solution
    GET blogs_fixed2/_search/template
    {
      "id": "weekly_blogs",
      "params": {
        "start_date": "2021-04-01"
      }
    }
    
  4. EXAM PREP: Define a new search template named top_blogs similar to the weekly_blogs template that satisfies the following requirements:

    • the date range is set by the start_date and end_date parameters
    • if no end_date parameter is provided in the search, the template searches for one week of blogs starting at start_date

    Solution
    PUT _scripts/top_blogs
    {
      "script": {
        "lang": "mustache",
        "source": {
          "query": {
            "bool": {
              "filter": [
                {
                  "range": {
                    "publish_date": {
                      "gte": "{{start_date}}",
                      "lt": "{{end_date}}{{^end_date}}{{start_date}}||+1w{{/end_date}}"
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
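
The "lt" value in top_blogs combines a plain variable with a Mustache inverted section ({{^end_date}} ... {{/end_date}}), which is rendered only when end_date is missing. The Python sketch below mimics that fallback for the two cases used in this lab: end_date present with a value, or absent. The function name render_lt is hypothetical, and other falsy values (such as an empty string) are not modeled.

```python
from typing import Dict


def render_lt(params: Dict[str, str]) -> str:
    """Mimic "{{end_date}}{{^end_date}}{{start_date}}||+1w{{/end_date}}"."""
    end_date = params.get("end_date")
    if end_date is not None:
        # {{end_date}} prints the value and the inverted section is skipped.
        return end_date
    # {{end_date}} prints nothing, so the inverted section supplies the default.
    return params["start_date"] + "||+1w"


print(render_lt({"start_date": "2021-04-10", "end_date": "2021-04-15"}))  # 2021-04-15
print(render_lt({"start_date": "2021-04-10"}))  # 2021-04-10||+1w
```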
    
  5. Verify your template is working by writing a query using the top_blogs template that returns the blogs from April 10, 2021, to April 15, 2021.

    Solution
    GET blogs_fixed2/_search/template
    {
      "id": "top_blogs",
      "params": {
        "start_date": "2021-04-10",
        "end_date": "2021-04-15"
      }
    }
    
  6. Verify that you can send a query without an end_date by removing the end_date parameter from your previous query to get the blogs for the week of April 10, 2021.

    Solution
    GET blogs_fixed2/_search/template
    {
      "id": "top_blogs",
      "params": {
        "start_date": "2021-04-10"
      }
    }
    
  7. Write a query that searches for the term security in the title field. Highlight the matching term by surrounding it with <strong> and </strong> tags.

    Solution
    GET blogs_fixed2/_search
    {
      "query": {
        "match": {
          "title": "security"
        }
      },
      "highlight": {
        "fields": {
          "title": {}
        },
        "pre_tags": [
          "<strong>"
        ],
        "post_tags": [
          "</strong>"
        ]
      }
    }
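
To picture what the highlight section returns, here is a rough Python sketch of the effect of pre_tags and post_tags. It is not how Elasticsearch implements highlighting: it assumes simple whole-word, case-insensitive matching and ignores analyzers. The highlight function and the sample title are hypothetical.

```python
import re


def highlight(text: str, term: str, pre: str = "<strong>", post: str = "</strong>") -> str:
    """Wrap each whole-word, case-insensitive occurrence of `term` in pre/post tags."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    # Keep the original casing of the matched text inside the tags.
    return pattern.sub(lambda m: pre + m.group(0) + post, text)


print(highlight("Elastic Security 7.12 released", "security"))
# Elastic <strong>Security</strong> 7.12 released
```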
    
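  8. Now, let's work toward running an asynchronous search. First, run the following function_score query as a regular search. (A function_score query lets you customize how Elasticsearch computes the _score for each document. Here, the script runs a loop for every matching document, which makes the query slow.)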

    GET blogs_fixed2/_search
    {
      "query": {
        "function_score": {
          "query": {
            "match": {
              "content": "to the blog and your query: you are both enjoying being on Elasticsearch "
            }
          },
          "script_score": {
            "script": """
            int m = 1; 
            double u = 1.0;
            for (int x = 0; x < m; ++x) 
              for (int y = 0; y < 10000; ++y) 
                u=Math.log(y);
            return u
            """
          }
        }
      }
    }
    

  9. Look at the took value in the response after executing this query. The took time grows roughly in proportion to m. For example, if m = 1 takes about 1000 ms, then m = 30 should take about 30000 ms, or 30 seconds. Figure out what value of m you need for this query to take approximately 30 seconds.
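
The estimate in Step 9 is a simple proportion. The sketch below assumes that took grows linearly with m, which is only approximately true on a real cluster. The function name estimate_m is hypothetical, and the input is the took value you measured with m = 1.

```python
def estimate_m(took_ms_at_m1: float, target_ms: float = 30_000) -> int:
    """Estimate the loop count m, assuming took grows linearly with m."""
    if took_ms_at_m1 <= 0:
        raise ValueError("took must be positive")
    # Never return less than 1, because the script needs at least one pass.
    return max(1, round(target_ms / took_ms_at_m1))


print(estimate_m(1000))  # 30
print(estimate_m(250))   # 120
```

Treat the result as a starting point and adjust m after you see the actual took time.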

  10. Run an asynchronous search request on this query so that you can let the query run in the background and retrieve its results later.

    POST blogs_fixed2/_async_search
    {
      "query": {
        "function_score": {
          "query": {
            "match": {
              "content": "to the blog and your query: you are both enjoying being on Elasticsearch "
            }
          },
          "script_score": {
            "script": """
            int m = 30; 
            double u = 1.0;
            for (int x = 0; x < m; ++x) 
              for (int y = 0; y < 10000; ++y) 
                u=Math.log(y);
            return u
            """
          }
        }
      }
    }
    

  11. Copy the id value from the response and retrieve the search results.

    Solution

    The command will be similar to the following:

    GET _async_search/<your_search_ID_here>
    

  12. If you send the request before the query has finished running, you will notice that hits.hits in the response is still empty. Wait about 30 seconds and send the request again. Eventually, the is_running value will be false, and hits.hits will contain some documents.
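
A client application would usually automate the "wait and retry" in Step 12. Here is a minimal polling sketch. It assumes a hypothetical fetch callable that sends GET _async_search/<your_search_ID_here> and returns the parsed JSON response; the only response field it relies on is is_running. The demo at the end uses canned responses instead of a real cluster.

```python
import time
from typing import Any, Callable, Dict


def wait_for_async_search(
    fetch: Callable[[], Dict[str, Any]],
    interval_s: float = 5.0,
    max_attempts: int = 12,
) -> Dict[str, Any]:
    """Call fetch() until the response reports is_running == false."""
    for attempt in range(max_attempts):
        response = fetch()
        if not response.get("is_running", False):
            return response
        # Sleep between attempts, but not after the last one.
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError("async search still running after %d attempts" % max_attempts)


# Demo with canned responses standing in for the cluster (hypothetical data).
fake_responses = iter([{"is_running": True}, {"is_running": False}])
result = wait_for_async_search(lambda: next(fake_responses), interval_s=0)
print(result["is_running"])  # False
```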

  13. Wait another 10 seconds and retrieve the search results again. You can still see the results! This is because the results are stored on your cluster until the time given by expiration_time_in_millis, or until you delete them manually. To free the disk space they occupy, delete the results of this search.

    DELETE _async_search/<your_search_ID_here>
    

Summary:

In this lab, you learned how to create search templates and run asynchronous searches.