Lab 8.3: Optimizing search performance
Objective:
In this lab, you will examine ways to improve the performance of searches using the Profile API. You will also enhance the relevance of your queries.
Start by running a slow query in Console:

    GET blogs_fixed2/_search
    {
      "profile": true,
      "_source": [""],
      "query": {
        "function_score": {
          "query": {
            "match_all": {}
          },
          "script_score": {
            "script": """
              void slow() {
                for (int x = 0; x < 10000; ++x)
                  Math.log(x);
              }
              for (int x = 0; x < 3; ++x)
                slow();
            """
          }
        }
      }
    }

The profile section appears below the hits section of the response. Notice that it is somewhat complicated to read. Let's look at a clearer view with the Search Profiler.
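If you want to inspect the raw profile output outside Kibana, one option is to flatten it with a short script. The following is a minimal Python sketch, assuming the response has been saved as a dictionary and follows the usual Profile API layout (profile.shards[].searches[].query[] and profile.shards[].aggregations[], each node carrying type, description, time_in_nanos, and optional children). The function names and the sample response are hypothetical, and the timings in the sample are invented placeholders, not measurements.

```python
from typing import Any, Dict, Iterator, List, Tuple

Row = Tuple[int, str, str, int]  # (depth, type, description, time_in_nanos)


def walk(node: Dict[str, Any], depth: int = 0) -> Iterator[Row]:
    """Yield one row for a profile node, then for each of its children."""
    yield depth, node["type"], node.get("description", ""), node["time_in_nanos"]
    for child in node.get("children", []):
        yield from walk(child, depth + 1)


def slowest_components(response: Dict[str, Any], top: int = 5) -> List[Row]:
    """Collect query and aggregation nodes from every shard, slowest first."""
    rows: List[Row] = []
    for shard in response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            for root in search.get("query", []):
                rows.extend(walk(root))
        for agg in shard.get("aggregations", []):
            rows.extend(walk(agg))
    return sorted(rows, key=lambda row: row[3], reverse=True)[:top]


# Hypothetical, heavily trimmed response: descriptions and timings are placeholders.
sample = {
    "profile": {
        "shards": [
            {
                "searches": [
                    {
                        "query": [
                            {
                                "type": "FunctionScoreQuery",
                                "description": "placeholder description",
                                "time_in_nanos": 900,
                                "children": [
                                    {
                                        "type": "MatchAllDocsQuery",
                                        "description": "*:*",
                                        "time_in_nanos": 100,
                                    }
                                ],
                            }
                        ]
                    }
                ],
                "aggregations": [],
            }
        ]
    }
}

for depth, kind, description, nanos in slowest_components(sample):
    print(f"{nanos:>12,d} ns  {'  ' * depth}{kind}  {description}")
```

Keep in mind that a parent node's time includes the time of its children, so the top-level node will normally sort first. The interesting rows are the children that account for most of the parent's time.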
View the Search Profiler page in Kibana (click the tab to the right of Console in Dev Tools). Set the index to blogs_fixed2 and set the body of the query to the following, then click the Profile button:

    {
      "query": {
        "function_score": {
          "query": {
            "match_all": {}
          },
          "script_score": {
            "script": """
              void slow() {
                for (int x = 0; x < 10000; ++x)
                  Math.log(x);
              }
              for (int x = 0; x < 3; ++x)
                slow();
            """
          }
        }
      }
    }

Using the Search Profiler, see if you can determine the "slow" part of the query (which we already know is the long for loop).
Let's profile another search, this time one that combines aggregations and queries. Set the index to blogs_fixed2 and set the body of the query to the following:

    {
      "query": {
        "bool": {
          "must": [
            { "match": { "title": "logstash" } }
          ],
          "must_not": [
            { "match": { "search_tags": "kibana" } }
          ],
          "should": [
            { "match_phrase": { "content": "dead letter queue" } }
          ]
        }
      },
      "aggs": {
        "author": {
          "terms": {
            "field": "authors.last_name"
          }
        }
      }
    }

Each section of the search is broken down into the component tasks in the output. Which part of this search is taking the longest? How might this be improved?
Solution
Chances are, it's the "dead letter queue" phrase query. Since it's in a should clause, we could replace it with match and probably get similar results. Recall that, because this bool query also has a must clause, the should clause only impacts the score and therefore the order of the results, not which results are actually returned. Try the change and see!
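To try the change without retyping the request, the variant body can be generated from the original. The following is a minimal Python sketch: the helper name relax_should_phrases is hypothetical, and it assumes each match_phrase clause uses the short form shown in this lab (field name mapped directly to the query text), so extra phrase-only parameters such as slop are not handled.

```python
import copy
import json
from typing import Any, Dict


def relax_should_phrases(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the body with match_phrase in bool.should replaced by match."""
    relaxed = copy.deepcopy(body)  # leave the original body untouched
    should = relaxed["query"]["bool"].get("should", [])
    for index, clause in enumerate(should):
        if "match_phrase" in clause:
            should[index] = {"match": clause["match_phrase"]}
    return relaxed


# The query body from the lab step above.
original = {
    "query": {
        "bool": {
            "must": [{"match": {"title": "logstash"}}],
            "must_not": [{"match": {"search_tags": "kibana"}}],
            "should": [{"match_phrase": {"content": "dead letter queue"}}],
        }
    },
    "aggs": {"author": {"terms": {"field": "authors.last_name"}}},
}

# Paste the printed JSON into the Search Profiler to compare against the original.
print(json.dumps(relax_should_phrases(original), indent=2))
```

Profiling both bodies against blogs_fixed2 lets you compare the time spent in the should clause directly. Expect the ordering of hits to shift slightly, since match scores the terms independently of their position.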
Next, run the following query, which searches for the term boosting in the title and content fields:

    GET blogs_fixed2/_search
    {
      "_source": [ "title" ],
      "query": {
        "multi_match": {
          "query": "boosting",
          "fields": [ "content", "title" ]
        }
      }
    }

Analyze the results closely. You should notice that the blogs with the term boosting in their title don't always appear first.
Update the previous query to give the title field a higher weight (1.4).
Solution
The three blogs with the term boosting in their title should rank higher.

    GET blogs_fixed2/_search
    {
      "_source": [ "title" ],
      "query": {
        "multi_match": {
          "query": "boosting",
          "fields": [ "content", "title^1.4" ]
        }
      }
    }

EXAM PREP: By default, Elasticsearch uses the maximum score from the two fields to compute the final score. Update the previous query to use the sum of the field scores instead of using the default best_fields.
Solution

    GET blogs_fixed2/_search
    {
      "_source": [ "title" ],
      "query": {
        "multi_match": {
          "type": "most_fields",
          "query": "boosting",
          "fields": [ "content", "title^1.4" ]
        }
      }
    }

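To see why the boost and the multi_match type change the ranking, the arithmetic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Elasticsearch's scoring code: the per-field scores are invented placeholders, the function names are hypothetical, a boost is modelled as a multiplier on the field's score, best_fields as the maximum field score (plus an optional tie_breaker share of the others), and most_fields as the sum of the field scores.

```python
from typing import Dict, List


def weighted_scores(scores: Dict[str, float], boosts: Dict[str, float]) -> List[float]:
    """Multiply each field's score by its boost (1.0 when no boost is given)."""
    return [score * boosts.get(field, 1.0) for field, score in scores.items()]


def best_fields(scores: Dict[str, float], boosts: Dict[str, float],
                tie_breaker: float = 0.0) -> float:
    """Maximum weighted field score, plus tie_breaker times the remaining scores."""
    weighted = weighted_scores(scores, boosts)
    if not weighted:
        return 0.0
    top = max(weighted)
    return top + tie_breaker * (sum(weighted) - top)


def most_fields(scores: Dict[str, float], boosts: Dict[str, float]) -> float:
    """Sum of the weighted field scores."""
    return sum(weighted_scores(scores, boosts))


# Placeholder per-field scores for two imaginary blogs (not real output).
content_only = {"content": 4.0}                 # term appears only in the content
title_match = {"content": 1.0, "title": 3.0}    # term appears in the title too

no_boost: Dict[str, float] = {}
title_boost = {"title": 1.4}

for label, boosts in (("no boost", no_boost), ("title^1.4", title_boost)):
    print(label)
    for name, doc in (("content_only", content_only), ("title_match", title_match)):
        print(f"  {name}: best_fields={best_fields(doc, boosts):.2f} "
              f"most_fields={most_fields(doc, boosts):.2f}")
```

With these placeholder numbers, the content-only blog wins under the default best_fields without a boost, which mirrors what you observed earlier. The title boost is enough to flip the order, and most_fields widens the gap because the title blog also collects its content score.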
Summary:
In this lab, you saw how to profile inefficient searches to identify where to improve the search efficiency. You also learned how to improve the relevance of your queries.