Lab 1.2: Data in

Objective:

In this lab, you will first index a document using the Elasticsearch API. Next, you will index the blogs published by Elastic on our website at www.elastic.co/blog. You will index blogs on cluster2 using Kibana's file upload mechanism. Finally, you will index blogs on cluster1 using the Bulk API.

  1. From Kibana's main menu, select Dev Tools to open Console if it is not already open.

  2. Using the Elasticsearch API, index a document that meets these requirements:

    • is indexed into an index called my_index
    • has an ID of 1
    • contains one field called my_field
    • has one value for the my_field field: Hello world!
    Solution

    PUT my_index/_doc/1
    {
      "my_field": "Hello world!"
    }
    
    Note that you do not need to create the index first. Elasticsearch will create the index for you if it does not already exist.
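
Outside Console, the same request can be sent by any HTTP client. The following is a minimal Python sketch, assuming a hypothetical unsecured cluster at http://localhost:9200 (the lab clusters have their own addresses and credentials). It only builds the request and does not send it.

```python
import json
import urllib.request

# Assumption: placeholder address, not the lab cluster's real endpoint.
ES_URL = "http://localhost:9200"


def build_index_request(index, doc_id, document, base_url=ES_URL):
    """Build (but do not send) the HTTP request behind PUT <index>/_doc/<id>."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_index_request("my_index", 1, {"my_field": "Hello world!"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request object to urllib.request.urlopen would send it to the cluster.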

  3. Use the Elasticsearch Get by ID API to retrieve the document you have just indexed.

    Solution
    GET my_index/_doc/1
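
The response wraps the original document in metadata and returns the document itself under the _source field. The Python sketch below uses a hand-written sample response (the metadata values are illustrative, not output captured from the lab) to show how the document can be pulled back out.

```python
import json

# Illustrative response body for GET my_index/_doc/1; the metadata values
# are assumptions, only the overall shape matters here.
sample_response = """
{
  "_index": "my_index",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {"my_field": "Hello world!"}
}
"""


def extract_source(response_text):
    """Return the stored document, or None if the ID was not found."""
    body = json.loads(response_text)
    if not body.get("found", False):
        return None
    return body["_source"]


document = extract_source(sample_response)
print(document["my_field"])
```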
    
  4. Next you will see how you can use Kibana to upload data as files. Start by downloading this newblogs.json file to your local computer. It is a small text file that contains 7 recent blogs from Elastic.

  5. First, you will upload newblogs.json into cluster2. Within the Strigo lab environment, click the Kibana2 button and log in. The username is training and the password is nonprodpwd.

  6. From Kibana's Home, click Upload a file.

  7. Click Select or drag and drop a file.

  8. Select newblogs.json from your local computer. The file will be uploaded and analyzed by Kibana.

  9. Click Import.

  10. Enter blogs for the Index name and make sure the box is checked for Create data view.

  11. Click Import and the blogs should be indexed fairly quickly. Scroll down to the bottom of the page and click View index in Discover.

  12. Discover shows a date histogram of the blogs, automatically setting the time range from the first document to the last. You should see 7 hits, from April 26-29, 2021.

    You have successfully uploaded 7 documents into a new index named blogs on cluster2. Elastic has written a lot more than 7 blogs! Next, you will index the other blogs into cluster1 using a script that performs _bulk inserts.

  13. Click the Terminal button in the Strigo lab environment.

  14. Change directories to the ~/datasets folder and view its contents:

    cd datasets
    ls -la
    

  15. Notice there is a JSON file named blogs.json which contains thousands of blogs published at www.elastic.co/blog. Each row in the text file is a single JSON document representing a single blog. Run the load_blogs.sh script to index the JSON file into cluster1:

    ./load_blogs.sh
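
The contents of load_blogs.sh are not shown in this lab, so the following Python sketch illustrates only the request format the Bulk API expects, not the script's actual implementation. Each document line is preceded by an action line, and the body must end with a newline. The sample rows and their field values are made up for illustration.

```python
import json


def build_bulk_body(json_lines, index):
    """Turn one-document-per-line JSON into a _bulk request body (NDJSON)."""
    action = json.dumps({"index": {"_index": index}})
    parts = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        json.loads(line)  # fail early if a row is not valid JSON
        parts.append(action)
        parts.append(line)
    # The Bulk API requires the body to end with a newline.
    return "\n".join(parts) + "\n"


# Hypothetical rows standing in for blogs.json.
rows = [
    '{"title": "First blog", "publish_date": "2021-04-26"}',
    '{"title": "Second blog", "publish_date": "2021-04-27"}',
]
body = build_bulk_body(rows, "blogs")
print(body, end="")
```

The resulting text would be sent as the body of a POST _bulk request with the Content-Type application/x-ndjson.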
    

  16. Earlier, when you uploaded a file in Kibana, it automatically created a blogs data view on cluster2. Now, because you're using the Elasticsearch API, you will need to define a data view yourself. In the Strigo lab environment, click Kibana1. From the main menu, select Stack Management (under Management). Next, select Data Views under Kibana.

  17. Click Create data view.

  18. Enter blogs for the Name (remove the asterisk) and select publish_date as the Timestamp field.

  19. Click Create data view.

  20. Go to Discover and change the selected data view to blogs. You will not see any hits because you are only viewing the last 15 minutes of data, and the blogs are older than that.

  21. Change the time filter to view the last 13 years of blogs, and you should see 4,719 hits.
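
The time filter corresponds roughly to a range query on the data view's timestamp field. As an approximation (the request Discover actually sends contains more than this), the Python sketch below builds such a query with Elasticsearch date math, where now-15m means 15 minutes ago and now-13y means 13 years ago.

```python
import json


def time_filter_query(field, span):
    """Build a range query matching documents from `span` ago until now."""
    return {"query": {"range": {field: {"gte": f"now-{span}", "lte": "now"}}}}


# Default Discover window: matches none of the blogs.
print(json.dumps(time_filter_query("publish_date", "15m")))
# Widened window used in this step.
print(json.dumps(time_filter_query("publish_date", "13y"), indent=2))
```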

  22. Click the small arrow to the left of the publish_date column of the first document to expand it and display its details.

  23. The table view of the expanded document is a great way to view the fields and values of the document. Notice you can also view the raw JSON of the document by selecting the JSON tab.

Summary:

In this lab, you first indexed a single document using the Elasticsearch API. You then used Kibana to upload a small sample of the blogs from the www.elastic.co/blog website. Next, you ran a script that performs _bulk inserts to index the full dataset. Most of the blogs are on cluster1, but 7 of them are on cluster2. (Later in the course you will learn how to search both indices across the two clusters.) You also created a data view, which enables you to work with the documents in Kibana.