
How to Auto-Scale Kinesis Data Streams on Kubernetes

Managing real-time data efficiently is critical for modern applications. Learning how to auto-scale Kinesis Data Streams consumer applications on Kubernetes can help reduce costs while improving performance. This guide provides a clear, step-by-step approach for building scalable and resilient data pipelines.

Kubernetes features, such as the Horizontal Pod Autoscaler combined with KEDA, enable applications to adjust dynamically to changing workloads. As a result, resource management becomes simpler and more cost-effective.

Auto-scale Kinesis Data Streams on Kubernetes using KEDA and KCL

What Are Amazon Kinesis and Kinesis Data Streams?

Amazon Kinesis is a platform designed for real-time data ingestion, processing, and analysis. Kinesis Data Streams is a serverless streaming service that scales elastically to handle varying workloads. Together with Kinesis Data Firehose, Kinesis Video Streams, and Kinesis Data Analytics, it powers real-time dashboards and analytics applications.

A data stream consists of multiple shards, each containing an ordered sequence of records. Producers push data into these shards, and consumers process it in real time. Each record carries a partition key, which determines the shard it is assigned to. Consumer applications, often built with the Kinesis Client Library (KCL), handle record processing, checkpointing, and shard balancing automatically.
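The partition-key-to-shard mapping can be sketched locally: Kinesis takes the MD5 hash of the partition key as a 128-bit integer and routes the record to the shard whose hash-key range contains it. The key `user-42` below is purely illustrative:

```shell
# Kinesis hashes each partition key with MD5 and treats the digest as a
# 128-bit integer; the record goes to the shard whose hash-key range
# contains that value. With 2 shards splitting the range evenly, the
# first hex digit of the digest is enough to tell the two halves apart.
pk="user-42"                                         # hypothetical partition key
digest=$(printf '%s' "$pk" | md5sum | cut -d' ' -f1)
case "$(printf '%s' "$digest" | cut -c1)" in
  [0-7]) echo "partition key '$pk' -> lower hash-key range (e.g. shard 0)" ;;
  *)     echo "partition key '$pk' -> upper hash-key range (e.g. shard 1)" ;;
esac
```

Because the hash is deterministic, all records with the same partition key land on the same shard, which is why a hot key can create a hot shard.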

How to Auto-Scale Kinesis Data Streams Using KCL

The Kinesis Client Library ensures that every shard has a dedicated record processor. When more workers are added or shards are split, KCL redistributes shard assignments automatically. Consequently, horizontal scaling is straightforward.

However, manual scaling can be tedious. To address this, Kubernetes Event-Driven Autoscaling (KEDA) monitors workloads and adjusts pod counts based on the number of events needing processing.
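The scaling decision itself reduces to simple arithmetic, sketched here under the assumption that the scaler targets a fixed number of shards per consumer pod (the numbers are hypothetical):

```shell
# Target-based autoscaling boils down to:
#   desiredReplicas = ceil(openShards / targetShardsPerPod)
shards=6          # open shards in the stream
target=2          # shards each consumer pod should handle
replicas=$(( (shards + target - 1) / target ))   # integer ceiling division
echo "desired replicas: $replicas"               # prints 3 for 6 shards / 2 per pod
```

Splitting the stream from 6 to 8 shards would push the result to 4, which is the scale-out behaviour described below.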

Setting Up KEDA to Auto-Scale Kinesis Streams

KEDA is an open-source CNCF project built on top of Kubernetes' native autoscaling features. It can scale a deployment anywhere from zero to many pods based on event-driven metrics. For Kinesis streams, KEDA observes the stream's shard count and triggers scaling events accordingly.

Key KEDA components include:

  1. keda-operator-metrics-apiserver – exposes metrics for the Horizontal Pod Autoscaler.
  2. KEDA Scaler – connects to Kinesis to fetch shard-based metrics.
  3. keda-operator – activates or deactivates deployments automatically.

This integration ensures that Kinesis consumers scale automatically according to the workload.

Prerequisites to Auto-Scale Kinesis Streams

Before proceeding, ensure you have the following:

  • AWS account with required permissions
  • AWS CLI, kubectl, Docker, Java 11, Maven installed
  • An Amazon EKS cluster, DynamoDB table, and Kinesis Data Stream

Create the EKS cluster with eksctl for convenience:

eksctl create cluster --name demo-cluster --region us-east-1

Then, set up DynamoDB and a Kinesis stream:

aws dynamodb create-table \
  --table-name users \
  --attribute-definitions AttributeName=email,AttributeType=S \
  --key-schema AttributeName=email,KeyType=HASH \
  --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

aws kinesis create-stream --stream-name kinesis-keda-demo --shard-count 2

Clone the sample KCL application repository:

git clone https://github.com/abhirockzz/kinesis-keda-autoscaling
cd kinesis-keda-autoscaling

Deploy and Configure KEDA to Auto-Scale Kinesis Streams

Install KEDA using YAML:

kubectl apply -f https://github.com/kedacore/keda/releases/download/v2.8.2/keda-2.8.2.yaml

Verify installation:

kubectl get crd
kubectl get deployment -n keda
kubectl logs -f $(kubectl get pod -l=app=keda-operator -o jsonpath='{.items[0].metadata.name}' -n keda) -n keda

Configuring IAM Roles for Auto-Scale Kinesis Data Streams

Both KEDA and the KCL consumer application require AWS permissions via IAM Roles for Service Accounts (IRSA).

  • KEDA operator – access to the Kinesis DescribeStreamSummary API to monitor shard counts.
  • KCL application – permissions for Kinesis read operations and for the DynamoDB table KCL uses for checkpointing and lease coordination.
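As a sketch, the KEDA operator's policy can be as narrow as the single read-only call it needs (the account ID below is a placeholder):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "kinesis:DescribeStreamSummary",
      "Resource": "arn:aws:kinesis:us-east-1:111122223333:stream/kinesis-keda-demo"
    }
  ]
}
```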

After creating and attaching the roles, annotate the Kubernetes service accounts and restart the KEDA operator to apply changes.
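For example, the KEDA operator's annotated service account might look like this sketch (the role ARN is a placeholder for the IRSA role you created):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keda-operator
  namespace: keda
  annotations:
    # Placeholder ARN - substitute the IRSA role created for KEDA
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/keda-operator-kinesis-role
```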

Monitoring Auto-Scale Kinesis Data Streams in Action

Once deployed, define a ScaledObject for the KCL deployment. KEDA will automatically adjust the number of pods based on shard count. Increasing shards triggers scale-out, while reducing shards scales pods down.
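A minimal ScaledObject for this setup might look like the following sketch; names and thresholds are illustrative, while the trigger type and its metadata fields follow KEDA's aws-kinesis-stream scaler:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kcl-consumer-scaler          # hypothetical name
spec:
  scaleTargetRef:
    name: kcl-consumer               # the KCL consumer Deployment
  pollingInterval: 30                # check shard count every 30s
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: aws-kinesis-stream
      metadata:
        streamName: kinesis-keda-demo
        awsRegion: us-east-1
        shardCount: "1"              # target shards per pod
        identityOwner: operator      # use the KEDA operator's IRSA credentials
```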

You can verify processing by observing DynamoDB records and the processed_by attribute, which shows how KCL distributes workloads across pods.

Benefits of Using ZippyOPS to Auto-Scale Kinesis Data Streams

Organizations can streamline auto-scaling and operational complexity with ZippyOPS. We provide consulting, implementation, and managed services in DevOps, DevSecOps, DataOps, Cloud, Automated Ops, AIOps, MLOps, Microservices, Infrastructure, and Security.

With ZippyOPS, you can ensure enterprise-grade reliability for real-time data pipelines while keeping costs efficient.

Conclusion

Learning to auto-scale Kinesis Data Streams on Kubernetes improves performance, efficiency, and cost management. KEDA and KCL work together to provide seamless scaling. By leveraging ZippyOPS-managed services, organizations can deploy robust, scalable streaming pipelines with professional guidance.

For expert support or implementation services, contact sales@zippyops.com today.
