Lab 6.1: Understanding shards

Objective:

In this lab, you will become familiar with shard distribution. You will create a new index with a specific number of shards and replicas and analyze its shard allocation.

  1. Run the following _cat command, which returns information about the nodes in the cluster:

    GET _cat/nodes?v
    
    How many nodes do you have in your cluster?

    Solution

    You have three nodes.

  2. Run the following _cat command, which returns information about the indices in the cluster:

    GET _cat/indices?v
    

  3. Notice that, for each index, the output shows the number of primary shards, the number of replicas configured per primary, the document count, and the deleted document count. You also get the size on disk for the primary shards alone and for the total (primary shards + replica shards). Finally, notice that every index has a name and a UUID.

  4. Run the following _cat command, which returns information about the shards in the cluster, such as the number of documents, the size on disk, and the node each shard has been allocated to. How many shards does the index web_traffic have, and on which nodes are they allocated?

    GET _cat/shards?v
    

  5. If you are interested in a specific index, add the index name after /shards in the _cat command. View the shard details for just the web_traffic index.

    Solution

    GET _cat/shards/web_traffic?v
    
  6. To better understand shard allocation, create a new index named test1 with two primary shards and three replicas. How many shards will be needed for test1 on the cluster?

    Solution

    PUT test1
    {
      "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 3
      }
    }
    
    The cluster will need eight shards: two primaries and six replicas (three replicas of each primary).

  7. View the shard allocation of your test1 index. How many primary and replica shards were started on the cluster? Why are there two unassigned shards?

    Solution

    You should see 2 STARTED primary shards, 4 STARTED replica shards, and 2 UNASSIGNED replica shards. Elasticsearch never allocates a replica to the same node as its primary or as another copy of the same shard, so a 3-node cluster can hold at most one primary and two replicas of each shard. The third replica of each shard remains unassigned until more nodes are added to the cluster.

    GET _cat/shards/test1?v
    

  8. Run the following command, which returns the same shard details but sorts them by shard number (0, then 1) and, within each shard, by primary or replica:

    GET _cat/shards/test1?v&s=shard,prirep
    

  9. Update the test1 index to have two replicas (instead of three).

    Solution

    PUT test1/_settings
    {
      "number_of_replicas": 2
    }
    
  10. Check the shard allocation using the following command, which sorts the results by node, then shard:

    GET _cat/shards/test1?v&s=node,shard
    

  11. You are done experimenting with shards, so you can go ahead and delete the test1 index:

    DELETE test1
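
The shard arithmetic from steps 6, 7, and 9 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of Elasticsearch: the function name shard_plan and its return keys are made up for this lab, and it assumes that every node is a data node and that no other allocation rules (such as disk watermarks or allocation filtering) prevent a shard from being assigned.

```python
def shard_plan(primaries: int, replicas: int, data_nodes: int) -> dict:
    """Estimate shard counts for one index (hypothetical helper, not an Elasticsearch API).

    Assumption: copies of the same shard never share a node, so each shard
    can have at most `data_nodes` copies assigned (one primary plus replicas).
    """
    if primaries < 1 or replicas < 0 or data_nodes < 1:
        raise ValueError("need at least one primary, zero or more replicas, and one data node")

    # Total shards requested: each primary plus its replicas.
    total = primaries * (1 + replicas)

    # One node per shard is taken by the primary; the rest can hold replicas.
    assignable_per_shard = min(replicas, data_nodes - 1)
    started_replicas = primaries * assignable_per_shard
    unassigned_replicas = primaries * replicas - started_replicas

    return {
        "total": total,
        "primaries": primaries,
        "started_replicas": started_replicas,
        "unassigned_replicas": unassigned_replicas,
    }


if __name__ == "__main__":
    # Step 6 and 7: two primaries, three replicas, three nodes.
    print(shard_plan(primaries=2, replicas=3, data_nodes=3))
    # Step 9: after reducing to two replicas.
    print(shard_plan(primaries=2, replicas=2, data_nodes=3))
```

With the values from this lab, the first call reports eight shards in total with two unassigned replicas, and the second reports six shards with none unassigned, which should match what the _cat/shards output showed before and after step 9.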
    

Summary:

In this lab, you became familiar with shard distribution. You created a new index with a specific number of shards and replicas and analyzed shard allocation.