This guide demonstrates how to configure global rate limiting for L4 TCP connections destined to a target host that is a part of an OSM managed service mesh.
## Prerequisites
- Kubernetes cluster running Kubernetes v1.22.9 or greater.
- Have OSM installed.
- Have `kubectl` available to interact with the API server.
- Have `osm` CLI available for managing the service mesh.
- OSM version >= v1.3.0.
## Demo
The following demo shows a client (`fortio-client`) sending TCP traffic to the `fortio` TCP echo service. The `fortio` service echoes TCP messages back to the client. We will see the impact of applying global TCP rate limiting policies targeting the `fortio` service to control the throughput of traffic destined to the service backend using an external Rate Limit Service (RLS).
- For simplicity, enable permissive traffic policy mode so that explicit SMI traffic access policies are not required for application connectivity within the mesh.

  ```bash
  export osm_namespace=osm-system # Replace osm-system with the namespace where OSM is installed
  kubectl patch meshconfig osm-mesh-config -n "$osm_namespace" -p '{"spec":{"traffic":{"enablePermissiveTrafficPolicyMode":true}}}' --type=merge
  ```
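  If you want to verify the change took effect, an optional check (a minimal sketch; the field path below mirrors the patch above) is to read the setting back from the MeshConfig:

  ```bash
  # Read back the permissive traffic policy setting from the MeshConfig.
  # Expect "true" as the output.
  kubectl get meshconfig osm-mesh-config -n "$osm_namespace" -o jsonpath='{.spec.traffic.enablePermissiveTrafficPolicyMode}'
  ```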
- Deploy the `fortio` TCP echo service in the `demo` namespace after enrolling its namespace to the mesh. The `fortio` TCP echo service runs on port `8078`.

  ```bash
  # Create the demo namespace
  kubectl create namespace demo

  # Add the namespace to the mesh
  osm namespace add demo

  # Deploy fortio TCP echo in the demo namespace
  kubectl apply -f https://raw.githubusercontent.com/openservicemesh/osm-docs/main/manifests/samples/fortio/fortio.yaml -n demo
  ```

  Confirm the `fortio` service pod is up and running.

  ```console
  $ kubectl get pods -n demo
  NAME                     READY   STATUS    RESTARTS   AGE
  fortio-c4bd7857f-7mm6w   2/2     Running   0          22m
  ```
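  If the pod does not come up as expected, it can help to confirm the namespace is actually enrolled in the mesh. A minimal check, assuming the `osm` CLI's namespace listing subcommand is available in your version:

  ```bash
  # List the namespaces that are part of the mesh; "demo" should appear in the output.
  osm namespace list
  ```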
- Deploy the `fortio-client` app in the `demo` namespace. We will use this client to send TCP traffic to the `fortio` TCP echo service deployed previously.

  ```bash
  kubectl apply -f https://raw.githubusercontent.com/openservicemesh/osm-docs/main/manifests/samples/fortio/fortio-client.yaml -n demo
  ```

  Confirm the `fortio-client` pod is up and running.

  ```console
  NAME                            READY   STATUS    RESTARTS   AGE
  fortio-client-b9b7bbfb8-prq7r   2/2     Running   0          7s
  ```
- Deploy the external RLS. Any RLS that implements Envoy's Rate Limit Service proto can be used as a global rate limiter with Envoy. For this demo, we use the Envoy Rate Limit Service as our RLS. Per the Envoy RLS overview:

  > The rate limit service is a Go/gRPC service designed to enable generic rate limit scenarios from different types of applications. Applications request a rate limit decision based on a domain and a set of descriptors. The service reads the configuration from disk via runtime, composes a cache key, and talks to the Redis cache. A decision is then returned to the caller.
  Create a namespace to deploy the RLS into.

  ```bash
  kubectl create namespace rls
  ```

  Create a ConfigMap that contains the rate limit configuration. In this demo, we will rate limit traffic that generates the descriptor key-value pair `my_key: my_value` to 1 request per minute.

  ```bash
  kubectl apply -f - <<EOF
  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: ratelimit-config
    namespace: rls
  data:
    ratelimit-config.yaml: |
      domain: test
      descriptors:
        # requests with a descriptor of ["my_key": "my_value"]
        # are limited to one per minute.
        - key: my_key
          value: my_value
          rate_limit:
            unit: minute
            requests_per_unit: 1
  EOF
  ```

  Deploy the global RLS Service and Deployment in the `rls` namespace. The `ratelimiter` Deployment mounts the `ratelimit-config` ConfigMap as a volume, which allows the RLS service to load the rate limit configuration specified in the ConfigMap. We will later reference the `ratelimiter` service to configure global rate limiting within the mesh.

  ```bash
  kubectl apply -f - <<EOF
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    labels:
      app: ratelimiter
    name: ratelimiter
    namespace: rls
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: ratelimiter
    template:
      metadata:
        labels:
          app: ratelimiter
      spec:
        containers:
          - name: redis
            image: redis:alpine
            env:
              - name: REDIS_SOCKET_TYPE
                value: tcp
              - name: REDIS_URL
                value: redis:6379
          - name: ratelimiter
            image: docker.io/envoyproxy/ratelimit:1f4ea68e
            ports:
              - containerPort: 8080
                name: http
                protocol: TCP
              - containerPort: 8081
                name: grpc
                protocol: TCP
            volumeMounts:
              - name: ratelimit-config
                mountPath: /data/ratelimit/config
                readOnly: true
            env:
              - name: USE_STATSD
                value: "false"
              - name: LOG_LEVEL
                value: debug
              - name: REDIS_SOCKET_TYPE
                value: tcp
              - name: REDIS_URL
                value: localhost:6379
              - name: RUNTIME_ROOT
                value: /data
              - name: RUNTIME_SUBDIRECTORY
                value: ratelimit
              - name: RUNTIME_WATCH_ROOT
                value: "false"
              # need to set RUNTIME_IGNOREDOTFILES to true to avoid issues with
              # how Kubernetes mounts configmaps into pods.
              - name: RUNTIME_IGNOREDOTFILES
                value: "true"
            command: ["/bin/ratelimit"]
            livenessProbe:
              httpGet:
                path: /healthcheck
                port: 8080
              initialDelaySeconds: 5
              periodSeconds: 5
        volumes:
          - name: ratelimit-config
            configMap:
              name: ratelimit-config
  ---
  apiVersion: v1
  kind: Service
  metadata:
    name: ratelimiter
    namespace: rls
  spec:
    ports:
      - port: 8081
        name: grpc
        protocol: TCP
    selector:
      app: ratelimiter
    type: ClusterIP
  EOF
  ```

  Confirm the RLS pod is up and running.

  ```console
  $ kubectl get pods -n rls
  NAME                          READY   STATUS    RESTARTS   AGE
  ratelimiter-bb7665d55-6qtvv   2/2     Running   0          15s
  ```
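  Optionally, sanity-check that the RLS itself is serving before wiring it into the mesh. A minimal sketch, assuming a standard `kubectl port-forward` workflow against the `/healthcheck` endpoint already used by the liveness probe above:

  ```bash
  # Forward the RLS HTTP port locally and hit the same /healthcheck endpoint
  # used by the Deployment's liveness probe. A healthy RLS returns HTTP 200.
  kubectl port-forward -n rls deploy/ratelimiter 8080:8080 &
  sleep 2
  curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/healthcheck
  kill %1
  ```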
- Confirm the `fortio-client` app is able to successfully make TCP connections and send data to the `fortio` TCP echo service on port `8078`. We call the `fortio` service with `3` concurrent connections (`-c 3`) and send `10` calls (`-n 10`).

  ```console
  $ fortio_client="$(kubectl get pod -n demo -l app=fortio-client -o jsonpath='{.items[0].metadata.name}')"

  $ kubectl exec "$fortio_client" -n demo -c fortio-client -- fortio load -qps -1 -c 3 -n 10 tcp://fortio.demo.svc.cluster.local:8078
  Fortio 1.32.3 running at -1 queries per second, 8->8 procs, for 10 calls: tcp://fortio.demo.svc.cluster.local:8078
  20:41:47 I tcprunner.go:238> Starting tcp test for tcp://fortio.demo.svc.cluster.local:8078 with 3 threads at -1.0 qps
  Starting at max qps with 3 thread(s) [gomax 8] for exactly 10 calls (3 per thread + 1)
  20:41:47 I periodic.go:723> T001 ended after 34.0563ms : 3 calls. qps=88.0894283876992
  20:41:47 I periodic.go:723> T000 ended after 35.3117ms : 4 calls. qps=113.2769025563765
  20:41:47 I periodic.go:723> T002 ended after 44.0273ms : 3 calls. qps=68.13954069406937
  Ended after 44.2097ms : 10 calls. qps=226.19
  Aggregated Function Time : count 10 avg 0.01096615 +/- 0.01386 min 0.001588 max 0.0386716 sum 0.1096615
  # range, mid point, percentile, count
  >= 0.001588 <= 0.002 , 0.001794 , 40.00, 4
  > 0.002 <= 0.003 , 0.0025 , 60.00, 2
  > 0.003 <= 0.004 , 0.0035 , 70.00, 1
  > 0.025 <= 0.03 , 0.0275 , 90.00, 2
  > 0.035 <= 0.0386716 , 0.0368358 , 100.00, 1
  # target 50% 0.0025
  # target 75% 0.02625
  # target 90% 0.03
  # target 99% 0.0383044
  # target 99.9% 0.0386349
  Error cases : no data
  Sockets used: 3 (for perfect no error run, would be 3)
  Total Bytes sent: 240, received: 240
  tcp OK : 10 (100.0 %)
  All done 10 calls (plus 0 warmup) 10.966 ms avg, 226.2 qps
  ```

  As seen above, all the TCP connections from the `fortio-client` pod succeeded.

  ```console
  Total Bytes sent: 240, received: 240
  tcp OK : 10 (100.0 %)
  All done 10 calls (plus 0 warmup) 10.966 ms avg, 226.2 qps
  ```
- Next, apply a global rate limiting policy to rate limit L4 TCP connections to the `fortio.demo.svc.cluster.local` service that generate the descriptor entry `my_key: my_value` for the `test` domain via the global `ratelimiter.rls.svc.cluster.local` RLS service listening on port `8081`. The descriptor entries and domain correspond to the rate limit configuration specified in the `ratelimit-config` ConfigMap previously. This has the ultimate effect of rate limiting TCP connections inbound to the `fortio` service to 1 per minute, by generating the descriptor entry that matches the one specified in the `ratelimit-config` ConfigMap.

  ```bash
  kubectl apply -f - <<EOF
  apiVersion: policy.openservicemesh.io/v1alpha1
  kind: UpstreamTrafficSetting
  metadata:
    name: fortio-tcp
    namespace: demo
  spec:
    host: fortio.demo.svc.cluster.local
    rateLimit:
      global:
        tcp:
          rateLimitService:
            host: ratelimiter.rls.svc.cluster.local
            port: 8081
          domain: test
          failOpen: false
          timeout: 10s
          descriptors:
            - entries:
                - key: my_key
                  value: my_value
  EOF
  ```

  The above configuration will result in connections to the `fortio` backend service being forwarded to the RLS service for making a rate limit decision based on the descriptor entry `my_key: my_value`.

  Confirm no traffic has been rate limited yet by examining the stats on the `fortio` backend pod.

  ```console
  $ fortio_server="$(kubectl get pod -n demo -l app=fortio -o jsonpath='{.items[0].metadata.name}')"

  $ osm proxy get stats "$fortio_server" -n demo | grep ratelimit.*fortio.*8078
  ratelimit.inbound_demo/fortio_8078_tcp.active: 0
  ratelimit.inbound_demo/fortio_8078_tcp.cx_closed: 0
  ratelimit.inbound_demo/fortio_8078_tcp.error: 0
  ratelimit.inbound_demo/fortio_8078_tcp.failure_mode_allowed: 0
  ratelimit.inbound_demo/fortio_8078_tcp.ok: 0
  ratelimit.inbound_demo/fortio_8078_tcp.over_limit: 0
  ratelimit.inbound_demo/fortio_8078_tcp.total: 0
  ```
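  If needed, the applied policy can also be read back from the cluster. A minimal sketch, assuming the `UpstreamTrafficSetting` CRD registers the usual lowercase plural resource name:

  ```bash
  # Inspect the rate limiting policy applied above; "upstreamtrafficsetting"
  # is assumed to be the resource name registered by the CRD.
  kubectl get upstreamtrafficsetting fortio-tcp -n demo -o yaml
  ```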
- Confirm TCP connections are rate limited.

  ```console
  $ kubectl exec "$fortio_client" -n demo -c fortio-client -- fortio load -qps -1 -c 3 -n 10 tcp://fortio.demo.svc.cluster.local:8078
  Fortio 1.34.1 running at -1 queries per second, 8->8 procs, for 10 calls: tcp://fortio.demo.svc.cluster.local:8078
  20:07:50 I tcprunner.go:238> Starting tcp test for tcp://fortio.demo.svc.cluster.local:8078 with 3 threads at -1.0 qps
  Starting at max qps with 3 thread(s) [gomax 8] for exactly 10 calls (3 per thread + 1)
  20:07:50 E tcprunner.go:203> [2] Unable to read: EOF
  20:07:50 E tcprunner.go:203> [1] Unable to read: EOF
  20:07:50 E tcprunner.go:203> [2] Unable to read: EOF
  20:07:51 I periodic.go:721> T000 ended after 152.8903ms : 4 calls. qps=26.162549226471526
  20:07:51 E tcprunner.go:203> [1] Unable to read: EOF
  20:07:51 E tcprunner.go:203> [2] Unable to read: EOF
  20:07:51 I periodic.go:721> T002 ended after 155.9383ms : 3 calls. qps=19.23837825601536
  20:07:51 E tcprunner.go:203> [1] Unable to read: EOF
  20:07:51 I periodic.go:721> T001 ended after 162.9256ms : 3 calls. qps=18.41331257948413
  Ended after 162.9618ms : 10 calls. qps=61.364
  Aggregated Function Time : count 10 avg 0.04707677 +/- 0.0595 min 0.0008829 max 0.1400948 sum 0.4707677
  # range, mid point, percentile, count
  >= 0.0008829 <= 0.001 , 0.00094145 , 10.00, 1
  > 0.001 <= 0.002 , 0.0015 , 20.00, 1
  > 0.009 <= 0.01 , 0.0095 , 40.00, 2
  > 0.01 <= 0.011 , 0.0105 , 50.00, 1
  > 0.012 <= 0.014 , 0.013 , 70.00, 2
  > 0.12 <= 0.14 , 0.13 , 90.00, 2
  > 0.14 <= 0.140095 , 0.140047 , 100.00, 1
  # target 50% 0.011
  # target 75% 0.125
  # target 90% 0.14
  # target 99% 0.140085
  # target 99.9% 0.140094
  Error cases : count 6 avg 0.053029183 +/- 0.05918 min 0.0095806 max 0.1400948 sum 0.3181751
  # range, mid point, percentile, count
  >= 0.0095806 <= 0.01 , 0.0097903 , 16.67, 1
  > 0.01 <= 0.011 , 0.0105 , 33.33, 1
  > 0.012 <= 0.014 , 0.013 , 66.67, 2
  > 0.12 <= 0.14 , 0.13 , 83.33, 1
  > 0.14 <= 0.140095 , 0.140047 , 100.00, 1
  # target 50% 0.013
  # target 75% 0.13
  # target 90% 0.140038
  # target 99% 0.140089
  # target 99.9% 0.140094
  Sockets used: 7 (for perfect no error run, would be 3)
  Total Bytes sent: 240, received: 96
  tcp OK : 4 (40.0 %)
  tcp short read : 6 (60.0 %)
  All done 10 calls (plus 0 warmup) 47.077 ms avg, 61.4 qps
  ```

  As seen above, only 40% of the 10 calls succeeded, while the remaining 60% were rate limited. This is because we applied a rate limiting policy of 1 connection per minute at the `fortio` backend service, and the `fortio-client` was able to use 1 connection to make 4/10 calls, resulting in a 40% success rate.

  Examine the sidecar stats to further confirm this.

  ```console
  $ osm proxy get stats "$fortio_server" -n demo | grep ratelimit.*fortio.*8078
  ratelimit.inbound_demo/fortio_8078_tcp.active: 0
  ratelimit.inbound_demo/fortio_8078_tcp.cx_closed: 6
  ratelimit.inbound_demo/fortio_8078_tcp.error: 0
  ratelimit.inbound_demo/fortio_8078_tcp.failure_mode_allowed: 0
  ratelimit.inbound_demo/fortio_8078_tcp.ok: 1
  ratelimit.inbound_demo/fortio_8078_tcp.over_limit: 6
  ratelimit.inbound_demo/fortio_8078_tcp.total: 7
  ```
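  Once done experimenting, the rate limit can be lifted by deleting the policy applied earlier. A minimal cleanup sketch, using the same assumed resource name as in the inspection example above:

  ```bash
  # Remove the global rate limiting policy so traffic to the fortio service
  # is no longer forwarded to the RLS for rate limit decisions.
  kubectl delete upstreamtrafficsetting fortio-tcp -n demo
  ```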