Lab 3.3: Developing search applications
Objective:
In this lab, you will see how you can use search templates to decouple client applications from the queries they need to execute frequently. You will also run an asynchronous search.
- Let's start with the query for which we want to create a template. Write a search on the blogs_fixed2 index that finds the blogs for a one-week period starting April 1, 2021. (Hint: When anchoring dates, a || needs to be appended as a separator between the anchor and the range.)

  Solution

  GET blogs_fixed2/_search
  {
    "query": {
      "bool": {
        "filter": [
          {
            "range": {
              "publish_date": {
                "gte": "2021-04-01",
                "lt": "2021-04-01||+1w"
              }
            }
          }
        ]
      }
    }
  }
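To make the date math concrete, here is a minimal Python sketch of what the two bounds in the range filter describe. The helper name one_week_range is hypothetical and the code only mirrors the arithmetic locally; it does not talk to Elasticsearch.

```python
from datetime import date, timedelta


def one_week_range(start: str) -> tuple[str, str]:
    """Return the (gte, lt) dates that "start" and "start||+1w" describe."""
    begin = date.fromisoformat(start)
    # "||" separates the anchor date from the math; "+1w" adds one week to it.
    end = begin + timedelta(weeks=1)
    return begin.isoformat(), end.isoformat()


print(one_week_range("2021-04-01"))  # ('2021-04-01', '2021-04-08')
```

Because the upper bound uses lt rather than lte, blogs published on April 8 fall outside the week.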
- EXAM PREP: Create a search template called weekly_blogs that satisfies the following requirement:
  - the starting date for the one-week period is a parameter named start_date

  Solution

  PUT _scripts/weekly_blogs
  {
    "script": {
      "lang": "mustache",
      "source": {
        "query": {
          "bool": {
            "filter": [
              {
                "range": {
                  "publish_date": {
                    "gte": "{{start_date}}",
                    "lt": "{{start_date}}||+1w"
                  }
                }
              }
            ]
          }
        }
      }
    }
  }
- Verify that your search template works by writing a query that returns the blogs for the week of April 1, 2021. You should get the same response as the query from Step 1.

  Solution

  GET blogs_fixed2/_search/template
  {
    "id": "weekly_blogs",
    "params": {
      "start_date": "2021-04-01"
    }
  }
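The point of the template is that a client application only needs to know the template id and its parameters, not the query itself. A minimal sketch of the client side, assuming a hypothetical helper named weekly_blogs_request that builds the JSON body for the _search/template request (sending it over HTTP is left out):

```python
import json


def weekly_blogs_request(start_date: str) -> dict:
    """Build the body for GET blogs_fixed2/_search/template."""
    # The query structure lives in the stored template; the client sends only
    # the template id and the parameter values.
    return {"id": "weekly_blogs", "params": {"start_date": start_date}}


print(json.dumps(weekly_blogs_request("2021-04-01"), indent=2))
```

If the stored template changes later (for example, to add a sort), this client code would not need to change as long as the parameter names stay the same.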
- EXAM PREP: Define a new search template named top_blogs similar to the weekly_blogs template that satisfies the following requirements:
  - the date range is flexible using start_date and end_date parameters
  - if an end_date parameter is not provided in the search, then search for one week of blogs

  Solution

  PUT _scripts/top_blogs
  {
    "script": {
      "lang": "mustache",
      "source": {
        "query": {
          "bool": {
            "filter": [
              {
                "range": {
                  "publish_date": {
                    "gte": "{{start_date}}",
                    "lt": "{{end_date}}{{^end_date}}{{start_date}}||+1w{{/end_date}}"
                  }
                }
              }
            ]
          }
        }
      }
    }
  }
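The lt value combines a plain variable with a Mustache inverted section ({{^end_date}} ... {{/end_date}}), which is emitted only when end_date is missing. The sketch below imitates that behavior with a hand-rolled renderer for just these two constructs. It is an illustration of the idea, not the Mustache engine Elasticsearch uses, and the function name render is hypothetical.

```python
import re


def render(template: str, params: dict) -> str:
    """Render a tiny Mustache subset: {{^name}}...{{/name}} and {{name}}."""

    def inverted(match: re.Match) -> str:
        name, inner = match.group(1), match.group(2)
        # An inverted section keeps its content only if the value is absent or falsy.
        return "" if params.get(name) else inner

    out = re.sub(r"\{\{\^(\w+)\}\}(.*?)\{\{/\1\}\}", inverted, template, flags=re.S)
    # Plain variables: a missing parameter renders as an empty string.
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(params.get(m.group(1), "")), out)


LT = "{{end_date}}{{^end_date}}{{start_date}}||+1w{{/end_date}}"

print(render(LT, {"start_date": "2021-04-10", "end_date": "2021-04-15"}))  # 2021-04-15
print(render(LT, {"start_date": "2021-04-10"}))  # 2021-04-10||+1w
```

With end_date supplied, only the first variable produces output; without it, the first variable is empty and the inverted section supplies the one-week default.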
- Verify your template is working by writing a query using the top_blogs template that returns the blogs from April 10, 2021, to April 15, 2021.

  Solution

  GET blogs_fixed2/_search/template
  {
    "id": "top_blogs",
    "params": {
      "start_date": "2021-04-10",
      "end_date": "2021-04-15"
    }
  }
- Verify that you can send a query without an end_date by removing the end_date parameter from your previous query to get the blogs for the week of April 10, 2021.

  Solution

  GET blogs_fixed2/_search/template
  {
    "id": "top_blogs",
    "params": {
      "start_date": "2021-04-10"
    }
  }
- Write a query that searches for the term security in the title. Highlight the matching term and surround it with <strong> and </strong> tags.

  Solution

  GET blogs_fixed2/_search
  {
    "query": {
      "match": {
        "title": "security"
      }
    },
    "highlight": {
      "fields": {
        "title": {}
      },
      "pre_tags": [
        "<strong>"
      ],
      "post_tags": [
        "</strong>"
      ]
    }
  }
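In the response, each hit that matched carries a highlight object that maps the field name to a list of fragments with the tags already inserted. A minimal sketch of how a client might collect them follows; the helper name highlighted_titles is hypothetical, and the sample response is invented for illustration and trimmed to the keys used here.

```python
def highlighted_titles(response: dict) -> list[str]:
    """Collect the highlighted title fragments from a search response."""
    fragments: list[str] = []
    for hit in response.get("hits", {}).get("hits", []):
        # "highlight" maps each highlighted field to a list of fragments.
        fragments.extend(hit.get("highlight", {}).get("title", []))
    return fragments


# Invented sample, not real output from the blogs_fixed2 index.
sample = {
    "hits": {
        "hits": [
            {"_id": "1", "highlight": {"title": ["An example <strong>security</strong> title"]}}
        ]
    }
}

print(highlighted_titles(sample))
```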
- Now, let's try to run an asynchronous search. Run the following function_score query. (Use function_score to customize how Elasticsearch computes the _score for each document.)

  GET blogs_fixed2/_search
  {
    "query": {
      "function_score": {
        "query": {
          "match": {
            "content": "to the blog and your query: you are both enjoying being on Elasticsearch "
          }
        },
        "script_score": {
          "script": """
            int m = 1;
            double u = 1.0;
            for (int x = 0; x < m; ++x)
              for (int y = 0; y < 10000; ++y)
                u=Math.log(y);
            return u
          """
        }
      }
    }
  }
- Take a look at the took time after executing this query. Increasing the value of m increases the took time roughly in proportion. So if m = 1 takes about 1000 ms, then m = 30 should take about 30000 ms, or 30 seconds. Figure out what m value you would need for this query to take approximately 30 seconds.
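The estimate is a simple proportion. A minimal sketch, assuming took scales linearly with m as the step describes (the function name estimate_m is hypothetical, and real timings will vary between runs):

```python
def estimate_m(took_ms_at_m1: float, target_ms: float = 30_000) -> int:
    """Estimate the m needed to reach target_ms, given the took time for m = 1."""
    # Linear assumption: took(m) is about m * took(1), so m is target / took(1).
    return max(1, round(target_ms / took_ms_at_m1))


print(estimate_m(1000))  # 30
```

For example, if your cluster reports a took time of about 250 ms for m = 1, the same proportion suggests m = 120.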
- Run an asynchronous search request on this query so that you can let the query run in the background and retrieve its results later.

  POST blogs_fixed2/_async_search
  {
    "query": {
      "function_score": {
        "query": {
          "match": {
            "content": "to the blog and your query: you are both enjoying being on Elasticsearch "
          }
        },
        "script_score": {
          "script": """
            int m = 30;
            double u = 1.0;
            for (int x = 0; x < m; ++x)
              for (int y = 0; y < 10000; ++y)
                u=Math.log(y);
            return u
          """
        }
      }
    }
  }
- Copy the id value from the response and retrieve the search results.

  Solution

  The command will be similar to the following:

  GET _async_search/<your_search_ID_here>
- If you send the request before the query has finished running, you will notice that hits.hits in the response is still empty. Wait about 30 seconds and send the request again. Eventually, the is_running value will be false, and hits.hits will contain some documents.
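A client would typically automate this "ask again until is_running is false" loop. The sketch below shows one way to do that. The fetch argument is a stand-in for whatever function issues GET _async_search/<id> and returns the parsed JSON; here it is replaced by canned replies so the example runs without a cluster. The names wait_for_async_search, interval_s, and max_polls are hypothetical, and the stub bodies are trimmed to the fields used.

```python
import time
from typing import Callable


def wait_for_async_search(
    fetch: Callable[[], dict], interval_s: float = 5.0, max_polls: int = 12
) -> dict:
    """Call fetch() until the returned body reports is_running as false."""
    for _ in range(max_polls):
        body = fetch()
        if not body.get("is_running", True):
            return body
        time.sleep(interval_s)  # wait before asking again
    raise TimeoutError("async search still running after max_polls attempts")


# Stand-in for GET _async_search/<id>: still running twice, then finished.
_replies = iter(
    [
        {"is_running": True},
        {"is_running": True},
        {"is_running": False, "response": {"hits": {"hits": []}}},
    ]
)
final = wait_for_async_search(lambda: next(_replies), interval_s=0)
print(final["is_running"])  # False
```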
- Wait another 10 seconds and retrieve the search results again. You can still see the results! This is because the results are stored on your cluster until the time given by expiration_time_in_millis or until you delete them manually. To free the disk space they take up, delete the results of this search.

  DELETE _async_search/<your_search_ID_here>
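The expiration_time_in_millis value is a timestamp in milliseconds since the Unix epoch, which is hard to read at a glance. A minimal sketch of converting it to a readable UTC time follows; the helper name expiration_as_utc is hypothetical, and the sample value is chosen for illustration rather than taken from a real response.

```python
from datetime import datetime, timezone


def expiration_as_utc(expiration_time_in_millis: int) -> str:
    """Convert an epoch-milliseconds timestamp to an ISO 8601 UTC string."""
    seconds = expiration_time_in_millis / 1000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


print(expiration_as_utc(1617235200000))  # 2021-04-01T00:00:00+00:00
```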
Summary:
In this lab, you learned how to create search templates, highlight matching terms, and use async searches.