Lab 5.2: Combining aggregations

Objective:

In this lab, you will learn how to combine aggregations with queries and with other aggregations.

  1. In the previous lesson, you learned how to get the number of weekly requests.

    GET web_traffic/_search
    {
      "size": 0,
      "aggs": {
        "logs_by_week": {
          "date_histogram": {
            "field": "@timestamp",
            "calendar_interval": "week"
          }
        }
      }
    }
    

  2. Change this request to get the weekly requests that returned a 404 response.

    Solution
    GET web_traffic/_search
    {
      "size": 0,
      "query": {
        "term": {
          "http.response.status_code": {
            "value": 404
          }
        }
      },
      "aggs": {
        "logs_by_week": {
          "date_histogram": {
            "field": "@timestamp",
            "calendar_interval": "week"
          }
        }
      }
    }
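
The effect of adding a query can be sketched outside Elasticsearch: the query first narrows the set of documents, and the aggregation then only sees what is left. The following is a minimal Python illustration under stated assumptions. The records in LOGS and the helpers week_start and weekly_counts are hypothetical and are not taken from the web_traffic index.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical sample records (not taken from the web_traffic index).
LOGS = [
    {"day": date(2024, 1, 1), "status": 200},
    {"day": date(2024, 1, 2), "status": 404},
    {"day": date(2024, 1, 4), "status": 404},
    {"day": date(2024, 1, 8), "status": 404},
    {"day": date(2024, 1, 9), "status": 200},
]


def week_start(day):
    """Return the Monday of the week that contains `day`."""
    return day - timedelta(days=day.weekday())


def weekly_counts(logs, status=None):
    """Count records per week, optionally keeping a single status code first."""
    if status is not None:
        # Plays the role of the "query" part: narrow the document set.
        logs = [record for record in logs if record["status"] == status]
    # Plays the role of the "aggs" part: bucket what is left by week.
    return Counter(week_start(record["day"]) for record in logs)


all_requests = weekly_counts(LOGS)
not_found = weekly_counts(LOGS, status=404)
```

With this sample, the week buckets are the same in both results, but the counts in not_found only include the 404 records.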
    
  3. In a single request, calculate both the median and the average of the runtime_ms field.

    Solution
    GET web_traffic/_search
    {
      "size": 0,
      "aggs": {
        "request_time_stats": {
          "stats": {
            "field": "runtime_ms"
          }
        },
        "median": {
          "percentiles": {
            "field": "runtime_ms",
            "percents": [
              50
            ]
          }
        }
      }
    }
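
Because both aggregations sit side by side under "aggs", their results come back side by side under "aggregations" in the response. The sketch below shows one way to read the two values in Python. The numbers in SAMPLE_RESPONSE are made up for illustration, and the "50.0" key is an assumption about how the percentile is labeled in the response, so check it against the output of your own cluster.

```python
# Hypothetical response body; the values are invented for illustration.
SAMPLE_RESPONSE = {
    "aggregations": {
        "request_time_stats": {
            "count": 4,
            "min": 100.0,
            "max": 900.0,
            "avg": 350.0,
            "sum": 1400.0,
        },
        # Assumption: the 50th percentile is keyed as "50.0".
        "median": {"values": {"50.0": 200.0}},
    }
}


def average_and_median(response):
    """Return (average, median) from the two sibling aggregations."""
    aggs = response["aggregations"]
    average = aggs["request_time_stats"]["avg"]
    median = aggs["median"]["values"]["50.0"]
    return average, median


average, median = average_and_median(SAMPLE_RESPONSE)
```

In this made-up sample a single slow request pulls the average well above the median, which is one reason to look at both values.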
    
  4. What is the median runtime for each response code?

    Solution
    GET web_traffic/_search
    {
      "size": 0,
      "aggs": {
        "status_code_buckets": {
          "terms": {
            "field": "http.response.status_code"
          },
          "aggs": {
            "runtime": {
              "percentiles": {
                "field": "runtime_ms",
                "percents": [
                  50
                ]
              }
            }
          }
        }
      }
    }
    
  5. Sort the result of the previous query to find the response code with the lowest median runtime.

    Solution

    Requests that return a 503 response code have the lowest median runtime.

    GET web_traffic/_search
    {
      "size": 0,
      "aggs": {
        "status_code_buckets": {
          "terms": {
            "field": "http.response.status_code",
            "order": {
              "runtime.50": "asc"
            }
          },
          "aggs": {
            "runtime": {
              "percentiles": {
                "field": "runtime_ms",
                "percents": [
                  50
                ]
              }
            }
          }
        }
      }
    }
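
The idea behind ordering buckets by a sub-aggregation can be sketched in plain Python: group the runtimes by status code, compute one median per group, then sort the groups by that median. This is only an illustration. The REQUESTS sample is hypothetical, and statistics.median computes an exact median, whereas the value returned by a percentiles aggregation may be an approximation.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (status_code, runtime_ms) pairs, invented for illustration.
REQUESTS = [
    (200, 120), (200, 180), (200, 240),
    (404, 60), (404, 90),
    (503, 20), (503, 30), (503, 40),
]


def median_runtime_by_status(requests):
    """Return (status, median runtime) pairs sorted by ascending median."""
    grouped = defaultdict(list)
    for status, runtime_ms in requests:
        grouped[status].append(runtime_ms)  # one bucket per status code
    medians = {status: median(values) for status, values in grouped.items()}
    # Same role as "order": {"runtime.50": "asc"} in the request above.
    return sorted(medians.items(), key=lambda item: item[1])


ranked = median_runtime_by_status(REQUESTS)
fastest_status = ranked[0][0]
```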
    

  6. For each week, break down the number of requests by the user agent's OS name. Use the field user_agent.os.name.keyword.

    Solution
    GET web_traffic/_search
    {
      "size": 0,
      "aggs": {
        "logs_by_week": {
          "date_histogram": {
            "field": "@timestamp",
            "calendar_interval": "week"
          },
          "aggs": {
            "os_name": {
              "terms": {
                "field": "user_agent.os.name.keyword"
              }
            }
          }
        }
      }
    }
    
  7. What are the top 3 URLs accessed from each of the top 5 operating systems? Analyze the results closely and notice that most operating systems share a common set of URLs.

    Solution
    GET web_traffic/_search
    {
      "size": 0,
      "aggs": {
        "top_OS": {
          "terms": {
            "field": "user_agent.os.name.keyword",
            "size": 5
          },
          "aggs": {
            "top_urls": {
              "terms": {
                "field": "url.original",
                "size": 3
              }
            }
          }
        }
      }
    }
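
Nested aggregations produce nested buckets: each bucket of top_OS carries its own top_urls buckets. The sketch below walks that structure in Python and computes the URLs shared by every OS, which is one way to check for a common set. The response content in SAMPLE_RESPONSE is invented for illustration, and the helpers urls_by_os and common_urls are hypothetical names.

```python
# Hypothetical response body; keys and counts are invented for illustration.
SAMPLE_RESPONSE = {
    "aggregations": {
        "top_OS": {
            "buckets": [
                {
                    "key": "Windows",
                    "doc_count": 50,
                    "top_urls": {"buckets": [
                        {"key": "/", "doc_count": 20},
                        {"key": "/blog", "doc_count": 10},
                        {"key": "/downloads", "doc_count": 5},
                    ]},
                },
                {
                    "key": "Linux",
                    "doc_count": 30,
                    "top_urls": {"buckets": [
                        {"key": "/", "doc_count": 12},
                        {"key": "/blog", "doc_count": 8},
                        {"key": "/docs", "doc_count": 4},
                    ]},
                },
            ]
        }
    }
}


def urls_by_os(response):
    """Map each OS bucket key to the list of its top URL keys."""
    os_buckets = response["aggregations"]["top_OS"]["buckets"]
    return {
        os_bucket["key"]: [url["key"] for url in os_bucket["top_urls"]["buckets"]]
        for os_bucket in os_buckets
    }


def common_urls(per_os):
    """Return the URLs that appear in the top list of every OS."""
    url_sets = [set(urls) for urls in per_os.values()]
    return set.intersection(*url_sets) if url_sets else set()


per_os = urls_by_os(SAMPLE_RESPONSE)
shared = common_urls(per_os)
```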
    

Summary:

In this lab, you combined an aggregation with a query to reduce the scope of the aggregation. You also combined multiple aggregations in a single request. Finally, you saw how to nest sub-aggregations inside buckets and how to use their metrics to sort the buckets.