Master Request Rate Limiting in Kubernetes with NGINX
In modern cloud environments, web applications often face sudden surges in incoming traffic. Unmanaged request volumes can overload servers, cause downtime, and open security gaps. Request rate limiting in Kubernetes is therefore essential to maintain performance, ensure stability, and protect your applications from malicious traffic such as DDoS or brute-force attacks.
In this guide, we will walk you through setting up request rate limiting using NGINX Ingress, testing it with Locust, and integrating best practices for real-world Kubernetes deployments.
Moreover, if you’re looking to optimize and automate these processes, ZippyOPS provides consulting, implementation, and managed services across DevOps, DevSecOps, DataOps, Cloud, Automated Ops, Microservices, Infrastructure, Security, AIOps, and MLOps.

Why Request Rate Limiting in Kubernetes Matters
Rate limiting controls the number of requests a server can process within a given time frame. Consequently, it:
- Prevents abuse and resource exhaustion
- Protects against DDoS attacks
- Ensures fair distribution of resources
- Maintains application reliability during traffic spikes
For example, Kubernetes clusters running multiple services benefit from precise rate-limiting rules to keep one noisy client from slowing everyone down. As the NGINX documentation notes, a well-tuned configuration keeps latency predictable under load while reducing exposure to abusive traffic.
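Conceptually, NGINX's request limiting is based on the leaky-bucket algorithm: incoming requests "fill" a bucket that drains at the configured rate, and anything that would overflow is rejected. The Python sketch below illustrates the idea only; it is not NGINX's actual implementation.

```python
import time

class LeakyBucket:
    """Illustrative leaky-bucket limiter (not NGINX code)."""

    def __init__(self, rate_per_sec, burst=0):
        self.rate = rate_per_sec       # how fast the bucket drains
        self.capacity = burst          # extra requests allowed to queue
        self.level = 0.0               # current "water level"
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Drain the bucket in proportion to the elapsed time.
        self.level = max(0.0, self.level - (now - self.last) * self.rate)
        self.last = now
        if self.level <= self.capacity:
            self.level += 1            # admit this request
            return True
        return False                   # reject (NGINX would answer 503)

# Five back-to-back requests against a 10 req/s limit with no burst:
bucket = LeakyBucket(rate_per_sec=10, burst=0)
results = [bucket.allow() for _ in range(5)]
```

With no burst allowance, only the first of the back-to-back requests is admitted; the rest arrive before the bucket has drained and are rejected, which is exactly the behavior the Ingress annotations below configure at the edge.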
Step 1: Deploy a Sample NGINX Application
Before implementing rate limits, deploy a simple NGINX application in Kubernetes. This serves as a test environment. Create a file named nginx-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
Apply the deployment:
kubectl apply -f nginx-deployment.yaml
The NGINX pod is now running and ready to be exposed through a Kubernetes Service.
Step 2: Expose NGINX with a Kubernetes Service
To make NGINX accessible within the cluster, create a service. Save this as nginx-service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: ClusterIP
Apply the service:
kubectl apply -f nginx-service.yaml
The Service also provides internal load balancing across any replicas of the deployment.
Step 3: Install NGINX Ingress Controller
The NGINX Ingress Controller manages external traffic entering the cluster. Install it with Helm:
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install my-nginx ingress-nginx/ingress-nginx
This controller routes external requests to backend Services such as nginx-service and enforces the rate limits defined on Ingress resources. For installing Helm itself, follow the official Helm documentation.
Step 4: Configure Request Rate Limiting via Ingress
Create an Ingress resource named rate-limit-ingress.yaml to define rate limits:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rate-limit-ingress
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "10"
    nginx.ingress.kubernetes.io/limit-rpm: "100"
    nginx.ingress.kubernetes.io/limit-connections: "100"
spec:
  ingressClassName: nginx
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx-service
            port:
              number: 80
Apply the Ingress:
kubectl apply -f rate-limit-ingress.yaml
As a result, NGINX Ingress will enforce the configured limits, throttling traffic per second, per minute, and by concurrent connections. Requests that exceed a limit are rejected with HTTP 503 by default.
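Before running a full load test, you can sanity-check the limit with a quick burst of requests. The sketch below fires sequential GETs and tallies the resulting status codes; the ingress address is a placeholder you must replace with your controller's external IP or hostname, and rejected requests surface as HTTP 503 (ingress-nginx's default rejection status).

```python
from collections import Counter
from urllib import request, error

def tally(statuses):
    """Summarize a list of HTTP status codes, e.g. {200: 8, 503: 12}."""
    return dict(Counter(statuses))

def burst(url, n=20):
    """Fire n sequential GETs and collect status codes.
    Rate-limited requests raise HTTPError with code 503."""
    statuses = []
    for _ in range(n):
        try:
            with request.urlopen(url, timeout=5) as resp:
                statuses.append(resp.status)
        except error.HTTPError as exc:
            statuses.append(exc.code)
    return tally(statuses)

# Replace the placeholder with your ingress controller's address:
# print(burst("http://<ingress-ip>/"))
```

If the limits are active, the summary should show a mix of 200s and 503s once the burst outpaces the configured rate.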
Step 5: Test Rate Limits with Locust
Locust is a load-testing tool that simulates multiple users accessing your service simultaneously. Install Locust locally:
pip install locust
You can also deploy Locust in Kubernetes for realistic cluster testing. Save this as locust-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: locust-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: locust
  template:
    metadata:
      labels:
        app: locust
    spec:
      containers:
      - name: locust
        image: locustio/locust:latest
        command:
        - locust
        args:
        - -f
        - /locust-tasks/tasks.py
        - --host
        - http://nginx-service.default.svc.cluster.local
        ports:
        - containerPort: 8089
Deploy Locust:
kubectl apply -f locust-deployment.yaml
kubectl port-forward deployment/locust-deployment 8089:8089
Next, open http://localhost:8089 in a browser, set the number of users and spawn rate, and start the test; the target host is already set via the --host flag. As traffic exceeds the configured limits, you will see requests fail with HTTP 503, confirming that the rate limits are enforced.
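Note that the deployment above expects a test script at /locust-tasks/tasks.py but does not define one. One way to supply it is via a ConfigMap; the manifest below is a minimal example (the ConfigMap name and user class are illustrative, not prescribed by Locust):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: locust-tasks
data:
  tasks.py: |
    from locust import HttpUser, task, between

    class NginxUser(HttpUser):
        # Each simulated user waits 0.5-2s between requests.
        wait_time = between(0.5, 2)

        @task
        def index(self):
            # Requests beyond the configured limits return HTTP 503.
            self.client.get("/")
```

For the pod to see this file, the Locust deployment also needs a volume referencing the locust-tasks ConfigMap and a matching volumeMount at /locust-tasks in the container spec.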
Step 6: Optimize Kubernetes Operations
Implementing request rate limiting in Kubernetes is just one part of a stable cloud strategy. ZippyOPS supports businesses with consulting, implementation, and managed services across DevOps, DevSecOps, DataOps, Cloud, Automated Ops, Microservices, Infrastructure, Security, AIOps, and MLOps. These services help teams implement rate limiting, enhance cloud performance, and maintain security at scale.
Conclusion for Request Rate Limiting in Kubernetes
In summary, request rate limiting in Kubernetes safeguards your applications against high traffic and malicious attacks while preserving performance. Using NGINX Ingress and Locust, you can configure, test, and monitor rate limits effectively.
At the same time, pairing these best practices with ZippyOPS consulting services brings automation, security, and scalability to modern cloud-native applications.
For personalized guidance, contact sales@zippyops.com to discuss how ZippyOPS can optimize your Kubernetes and cloud infrastructure.



