Elasticsearch: Get Number of Shards Per Node Using Java Rest Client

Image by Allen_Henderson from Pixabay

The number of shards per node is one of the important things to monitor while monitoring the health of the Elasticsearch cluster. In this post, we will see how to get it programmatically.

Introduction

Elasticsearch stores documents in indices. Each index is made up of one or more shards. Each shard is an instance of Lucene and can be considered as a self-contained index itself with a subset of the data. There are two types of shards: primary and replica shards. Primary shards are the ones where writing happens and replica shards contribute to the data redundancy and handling search queries. 

There is no hard limit imposed by Elasticsearch on the number of shards per node. The number of shards on a node is proportional to the amount of heap memory. If you want to check the current heap memory you can use the following command:

GET _cat/nodes?v=true&h=heap.current

One of the good rules of thumb is to aim for 20 shards per GB of heap memory. So for example, if a node has 10 GB heap memory, then it should have a maximum of 200 shards. To check the shards per node we can use the shard API as shown below:

GET _cat/shards?v=true&format=json
// Result of the above command is as following:
[
  {
    "index": ".apm-custom-link",
    "shard": "0",
    "prirep": "p",
    "state": "STARTED",
    "docs": "0",
    "store": "208b",
    "ip": "172.17.0.2",
    "node": "618c33cb6a53"
  },
  {
    "index": "accounts",
    "shard": "0",
    "prirep": "p",
    "state": "STARTED",
    "docs": "1000",
    "store": "382.3kb",
    "ip": "172.17.0.2",
    "node": "618c33cb6a53"
  },
  {
    "index": "accounts",
    "shard": "0",
    "prirep": "r",
    "state": "UNASSIGNED",
    "docs": null,
    "store": null,
    "ip": null,
    "node": null
  },
  {
    "index": "posts",
    "shard": "0",
    "prirep": "p",
    "state": "STARTED",
    "docs": "1",
    "store": "4.4kb",
    "ip": "172.17.0.2",
    "node": "618c33cb6a53"
  },
  {
    "index": "posts",
    "shard": "0",
    "prirep": "r",
    "state": "UNASSIGNED",
    "docs": null,
    "store": null,
    "ip": null,
    "node": null
  },
  {
    "index": ".kibana_1",
    "shard": "0",
    "prirep": "p",
    "state": "STARTED",
    "docs": "50",
    "store": "10.4mb",
    "ip": "172.17.0.2",
    "node": "618c33cb6a53"
  },
  {
    "index": "accounts_test",
    "shard": "0",
    "prirep": "p",
    "state": "STARTED",
    "docs": "5000",
    "store": "1.7mb",
    "ip": "172.17.0.2",
    "node": "618c33cb6a53"
  },
  {
    "index": "accounts_test",
    "shard": "0",
    "prirep": "r",
    "state": "UNASSIGNED",
    "docs": null,
    "store": null,
    "ip": null,
    "node": null
  },
  {
    "index": "ilm-history-2-000001",
    "shard": "0",
    "prirep": "p",
    "state": "STARTED",
    "docs": null,
    "store": null,
    "ip": "172.17.0.2",
    "node": "618c33cb6a53"
  },
  {
    "index": ".kibana_task_manager_1",
    "shard": "0",
    "prirep": "p",
    "state": "STARTED",
    "docs": "6",
    "store": "417.9kb",
    "ip": "172.17.0.2",
    "node": "618c33cb6a53"
  },
  {
    "index": ".apm-agent-configuration",
    "shard": "0",
    "prirep": "p",
    "state": "STARTED",
    "docs": "0",
    "store": "208b",
    "ip": "172.17.0.2",
    "node": "618c33cb6a53"
  },
  {
    "index": ".kibana-event-log-7.9.3-000001",
    "shard": "0",
    "prirep": "p",
    "state": "STARTED",
    "docs": "4",
    "store": "21.6kb",
    "ip": "172.17.0.2",
    "node": "618c33cb6a53"
  }
]

we can then count the number of shards per node.

Using High Level Rest Client for Shards Count

The following code shows how we can achieve the same using the Java High Level Rest client. Please note that we are using the low level rest client here to fetch the details about the shards as I could not find support in high level client. Please have a look into this for the completeness of high level client.

public Map<String, Integer> getNumberOfShardsPerNode() {
    try {
      final Response response = restHighLevelClient.getLowLevelClient()
          .performRequest(new Request("GET", "/_cat/shards?v&format=json"));
      final var string = EntityUtils.toString(response.getEntity());
      final var shardDetails = new ObjectMapper()
          .readValue(string, new TypeReference<List<NodeDetails>>() {
          });
      final var shardsPerNode = shardDetails.stream()
          .filter(nodeDetails -> !"UNASSIGNED".equalsIgnoreCase(nodeDetails.getState()))
          .collect(Collectors.groupingBy(NodeDetails::getNode, Collectors.summingInt(value -> 1)));
      shardsPerNode.forEach((node, shards) -> log.info("Node : {}, Shard count: {}", node, shards));
      return shardsPerNode;

    } catch (IOException ioException) {
      ioException.printStackTrace();
    }
    return Collections.emptyMap();
  }

The above code prints number of shards per node as below:

Node : 34_b6Yiq, Shard count: 360
Node : as0aDaPl_, Shard count: 360
Node : 32_wkiqAf5, Shard count: 360

Conclusion

In this short post, we saw how to calculate the number of shards per node in the Elasticsearch cluster. Happy learning 😊.

References

  • https://pixabay.com/illustrations/ice-shards-frozen-winter-cold-1194102/
  • https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html
  • https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
  • https://github.com/elastic/elasticsearch/issues/27205

Comments