Kubernetes v1.36: Server-Side Sharded List and Watch for Massive Cluster Efficiency

As Kubernetes clusters scale to tens of thousands of nodes, controllers that watch high-cardinality resources like Pods face a significant scaling challenge. Each replica of a horizontally scaled controller receives the full event stream from the API server, consuming CPU, memory, and network bandwidth to deserialize every event, only to discard objects it doesn't own. Scaling out the controller actually increases the total cost. Kubernetes v1.36 addresses this with a new alpha feature: server-side sharded list and watch (KEP-5866). This feature moves filtering upstream to the API server, so each replica receives only the subset of events it is responsible for, dramatically reducing resource waste.

1. What is the core problem that server-side sharded list and watch solves?

Large Kubernetes clusters often run controllers that need to watch high-cardinality resources like Pods. When these controllers are horizontally scaled, each replica traditionally receives the full stream of events from the API server. Every replica must therefore deserialize and process every event, even though it may only act on a fraction of them. The result is substantial waste: CPU cycles spent on deserialization, memory for temporary storage, and network bandwidth multiplied by the number of replicas. A 10-replica controller in a 10,000-node cluster, for example, collectively transfers and processes roughly 10x the data it actually needs. Server-side sharded list and watch solves this by having the API server filter events at the source, sending only the events that belong to each replica's assigned shard, which reduces both per-replica workload and total cluster resource consumption.

2. How does server-side sharding differ from client-side sharding?

Client-side sharding is already used by some controllers like kube-state-metrics. In that approach, each replica is assigned a portion of the keyspace and discards objects that don't belong to it after receiving the full event stream. While this works functionally, it does not reduce data flow from the API server. The API server still sends all events to every replica. The network bandwidth scales linearly with the number of replicas, not with shard size. CPU spent on deserialization is wasted for the discarded fraction. In contrast, server-side sharding moves the filtering logic into the API server itself. Each replica tells the API server which hash range it owns via a new shardSelector field in ListOptions. The API server computes a deterministic hash and sends only matching events. This cuts both network traffic and processing overhead, making scaling much more efficient.
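
To make the contrast concrete, here is a minimal sketch of the client-side pattern (a generic hash-modulo scheme for illustration; kube-state-metrics' actual shard-assignment details may differ). The key point is that the replica has already paid to receive and decode the event before this check runs:

import "hash/fnv"

// ownsObject is the per-event filter a client-side-sharded replica applies
// after deserializing an event; any event for which it returns false was
// received, decoded, and then thrown away.
func ownsObject(uid string, shardIndex, totalShards uint64) bool {
    h := fnv.New64a()
    h.Write([]byte(uid))
    return h.Sum64()%totalShards == shardIndex
}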

3. How does the shardSelector work technically?

The feature introduces a shardSelector field in ListOptions. Clients specify a hash range using the shardRange() function, for example: shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000'). The API server computes a deterministic 64-bit FNV-1a hash of the specified field—currently supported fields are object.metadata.uid and object.metadata.namespace. It then returns only objects whose hash falls within the range [start, end). This applies to both list responses and watch event streams. Because the hash function produces the same result across all API server instances, the feature is safe to use with multiple API server replicas. The hash space is large (2^64), allowing fine-grained sharding. For a two-replica deployment, you might split the space into two equal halves: [0x0000…, 0x8000…) and [0x8000…, 0xFFFF…).
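
As a sketch of the hash-and-range check described above (the exact byte encoding the server hashes is not spelled out here, so treat this as illustrative rather than the server's actual code):

import "hash/fnv"

// inShard reports whether a field value falls in the half-open range
// [start, end), mirroring the server-side filter.
func inShard(fieldValue string, start, end uint64) bool {
    h := fnv.New64a()
    h.Write([]byte(fieldValue))
    sum := h.Sum64()
    return sum >= start && sum < end
}

// Example: does this UID land in the lower half of the hash space?
// inShard("7f9c24e8-3b12-4c8a-9e2d-1a6b3c4d5e6f", 0x0, 0x8000000000000000)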

4. How can controllers implement server-side sharding using informers?

Controllers that use client-go informers can enable sharding by injecting the shardSelector into the ListOptions used by their informers via WithTweakListOptions. Here's an example in Go:

import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/informers"
)

// This replica owns the lower half of the 64-bit hash space.
shardSelector := "shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000')"
factory := informers.NewSharedInformerFactoryWithOptions(client, resyncPeriod,
    informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
        opts.ShardSelector = shardSelector // new alpha field in ListOptions
    }),
)

Each replica uses a unique shard range that covers a non-overlapping portion of the hash space. For a deployment with N replicas, you divide the 64-bit hash space into N equal segments and assign one to each replica. The informer will then list and watch only objects whose UID hash falls within that range. This ensures that each replica handles a distinct subset of resources without needing to manually filter events client-side.
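
A small helper makes that division explicit. This is a sketch of the arithmetic only, not a client-go API; the helper name is this article's own, and the last shard's upper bound follows the '0xFFFFFFFFFFFFFFFF' convention used earlier:

import (
    "fmt"
    "math"
)

// shardSelectorFor splits the 64-bit hash space into `total` equal,
// non-overlapping ranges and returns the selector for 0-based replica `index`.
func shardSelectorFor(index, total uint64) string {
    width := math.MaxUint64/total + 1 // segment size
    start := index * width
    end := uint64(math.MaxUint64) // top shard runs to the end of the space
    if index < total-1 {
        end = start + width
    }
    return fmt.Sprintf(
        "shardRange(object.metadata.uid, '0x%016X', '0x%016X')",
        start, end,
    )
}

For two replicas, shardSelectorFor(0, 2) and shardSelectorFor(1, 2) reproduce the two halves shown above; each replica passes its own result into WithTweakListOptions.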

5. What are the main benefits of using server-side sharded list and watch?

The primary benefit is reduced resource consumption across the cluster:

- Network bandwidth: each replica receives only its shard's events, so total traffic stops multiplying with the replica count.
- CPU: replicas no longer deserialize events they would immediately discard.
- Memory: each replica's informer caches hold only the subset of objects it owns.

Additionally, the feature enables finer-grained horizontal scaling for controllers that previously hit scaling walls, allowing clusters to grow beyond tens of thousands of nodes with stable per-replica costs.

6. What fields can be used for sharding and is the hash deterministic?

Currently, only two field paths are supported: object.metadata.uid and object.metadata.namespace. The feature uses a deterministic 64-bit FNV-1a hash, which ensures that the same object always maps to the same hash value across all API server instances. This consistency is crucial for correctness—without it, different replica shards might miss events or receive duplicates. The deterministic nature also allows safe use with multiple API server replicas (common in production). The hash range is specified as two 64-bit hexadecimal strings (e.g., '0x0000000000000000' to '0x8000000000000000'). Future versions may support additional field paths like labels or annotations, and could allow custom hash functions or more flexible range specifications.
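
For example, a one-off list sharded by namespace might look like the following (the clientset and ctx variables are assumed to exist; ShardSelector is the alpha ListOptions field described above):

// List only pods in namespaces that hash into this replica's range.
pods, err := clientset.CoreV1().Pods(metav1.NamespaceAll).List(ctx, metav1.ListOptions{
    ShardSelector: "shardRange(object.metadata.namespace, '0x0000000000000000', '0x8000000000000000')",
})

Sharding by namespace has a useful property: every object in a namespace hashes to the same value, so related objects always land on the same replica.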

7. Is server-side sharding currently stable and how can I enable it?

In Kubernetes v1.36, server-side sharded list and watch is an alpha feature, meaning it is disabled by default and subject to change. To use it, you must enable the feature gate ServerSideShardedListWatch on the API server. Additionally, your controllers must be updated to use the shardSelector field in ListOptions. Since it's alpha, it's not recommended for production use yet, but it's perfect for testing and providing feedback. The Kubernetes community plans to gather experience from early adopters before promoting it to beta and eventually stable. If you run into issues or have suggestions, consider contributing to the KEP-5866 discussion.
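
Concretely, enabling the gate means adding it to the API server's --feature-gates flag (shown here on the command line; how you set this depends on how your control plane is deployed):

kube-apiserver --feature-gates=ServerSideShardedListWatch=true ...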
