Autoscaling volumes for Kubernetes (with the help of Prometheus)

DevOps Nirvana

Last update: Dec 28, 2022


Overview

Kubernetes Volume Autoscaler (with Prometheus)

This repository contains a service that automatically increases the size of a Persistent Volume Claim in Kubernetes when it's nearing full. It was initially engineered on AWS EKS, but it should support any Kubernetes cluster or cloud provider which supports dynamically resizing storage volumes in Kubernetes.

Keeping your volumes at a minimal size can help reduce cost, but having to manually scale them up can be painful and a waste of time for a DevOps engineer or systems administrator.

Requirements

Kubernetes 1.17+ cluster
kubectl binary installed and set up with your cluster
The helm 3.0+ binary
Prometheus installed on your cluster (Example 1 / Example 2 (old))
A StorageClass with allowVolumeExpansion set to true
A volume provisioner which supports dynamic volume expansion
- The EKS default driver on 1.17+ does
- The AWS EBS CSI driver also does

Prerequisites

As mentioned above, you must have a StorageClass which supports volume expansion, and the provisioner you're using must also support it. Ideally it supports "hot" volume expansion, so your services never have to restart. The AWS EKS built-in provisioner kubernetes.io/aws-ebs supports this, and so does the ebs.csi.aws.com CSI driver. To check and enable this:

# First, check if your storage class supports volume expansion...
$ kubectl get storageclasses
NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
standard               kubernetes.io/aws-ebs   Delete          Immediate              false                  10d

# If ALLOWVOLUMEEXPANSION is not set to true, patch it to enable this
kubectl patch storageclass standard -p '{"allowVolumeExpansion": true}'
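
On a cluster with many storage classes, the same check can be scripted. The following is a minimal Python sketch, assuming you feed it the parsed output of `kubectl get storageclasses -o json`; the function name `classes_without_expansion` and the sample data are hypothetical and only illustrate the idea.

```python
import json


def classes_without_expansion(storageclass_list: dict) -> list:
    """Return the names of StorageClasses that do not allow volume expansion."""
    names = []
    for item in storageclass_list.get("items", []):
        # allowVolumeExpansion is optional; a missing field means expansion is off.
        if item.get("allowVolumeExpansion") is not True:
            names.append(item["metadata"]["name"])
    return names


# Sample shaped like `kubectl get storageclasses -o json` output (heavily trimmed).
sample = json.loads("""
{"items": [
  {"metadata": {"name": "standard"}, "provisioner": "kubernetes.io/aws-ebs"},
  {"metadata": {"name": "gp3"}, "provisioner": "ebs.csi.aws.com", "allowVolumeExpansion": true}
]}
""")

print(classes_without_expansion(sample))
```

Every name it reports is a candidate for the `kubectl patch` command shown above.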

NOTE: The above StorageClass comes with EKS; however, it only supports gp2, a largely deprecated volume type that is much slower than gp3. I HIGHLY recommend that, before using EKS, you install the AWS EBS CSI driver to gain gp3 support and more future-proof support of Amazon's various storage volumes and their lifecycles.

If you do this, you can (and should) completely remove gp2 support. After installing the above CSI driver, create a StorageClass with the new driver that has best practices built in by default, including:

Retaining the volume when its claim is deleted (to prevent accidental data loss)
Having all disks encrypted at rest by default, for compliance/security
Using gp3 by default for faster disk bandwidth and IO

# For this, simply delete your old default StorageClass
kubectl delete storageclass standard
# Then apply/create a new default gp3 using the AWS EBS CSI driver you installed
kubectl apply -f https://raw.githubusercontent.com/DevOps-Nirvana/Kubernetes-Volume-Autoscaler/master/examples/gp3-default-encrypt-retain-allowExpansion-storageclass.yaml
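
For readers who want to see what such a StorageClass might contain before applying the linked file, here is a minimal Python sketch that builds an equivalent manifest and prints it as JSON (which `kubectl apply -f -` also accepts). This is not the repository's example file: the function name `gp3_storage_class`, the class name `gp3`, and the `WaitForFirstConsumer` binding mode are assumptions chosen for illustration. The `type` and `encrypted` parameters are those of the AWS EBS CSI driver.

```python
import json


def gp3_storage_class(name: str = "gp3") -> dict:
    """Build a default, encrypted, retained, expandable gp3 StorageClass manifest."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {
            "name": name,  # hypothetical name, pick your own
            "annotations": {
                # Marks this class as the cluster default
                "storageclass.kubernetes.io/is-default-class": "true",
            },
        },
        "provisioner": "ebs.csi.aws.com",
        "parameters": {"type": "gp3", "encrypted": "true"},
        "reclaimPolicy": "Retain",            # keep the volume if the claim is deleted
        "allowVolumeExpansion": True,         # required by the Volume Autoscaler
        "volumeBindingMode": "WaitForFirstConsumer",  # assumption, not from this README
    }


print(json.dumps(gp3_storage_class(), indent=2))
```

Piping the printed JSON into `kubectl apply -f -` would create the class, but compare it with the linked example first, since the two may differ.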

Installation with Helm

Now that your cluster has a StorageClass which supports expansion, you can install the Volume Autoscaler:

# First, add this repo to your helm
helm repo add devops-nirvana https://devops-nirvana.s3.amazonaws.com/helm-charts/

# Example 1 - Using autodiscovery, must be in the same namespace as Prometheus
helm upgrade --install volume-autoscaler devops-nirvana/volume-autoscaler \
  --namespace REPLACEME_WITH_PROMETHEUS_NAMESPACE

# Example 2 - Manually setting where Prometheus is
helm upgrade --install volume-autoscaler devops-nirvana/volume-autoscaler \
  --namespace ANYWHERE_DOESNT_MATTER \
  --set "prometheus_url=http://prometheus-server.namespace.svc.cluster.local"

# Example 3 - Recommended usage, automatically detect Prometheus and use slack notifications
helm upgrade --install volume-autoscaler devops-nirvana/volume-autoscaler \
  --namespace REPLACEME_WITH_PROMETHEUS_NAMESPACE \
  --set "slack_webhook_url=https://hooks.slack.com/services/123123123/4564564564/789789789789789789" \
  --set "slack_channel=my-slack-channel-name"

Advanced helm usage...

# To update your local knowledge of remote repos, you may need to do this before upgrading...
helm repo update

# To preview the changes an upgrade would make, use the helm diff plugin (must be installed first) - https://github.com/databus23/helm-diff
helm diff upgrade volume-autoscaler --allow-unreleased devops-nirvana/volume-autoscaler \
  --namespace infrastructure \
  --set "slack_webhook_url=https://hooks.slack.com/services/123123123/4564564564/789789789789789789" \
  --set "slack_channel=my-slack-channel-name" \
  --set "prometheus_url=http://prometheus-server.infrastructure.svc.cluster.local"

# To remove the service, simply run...
helm uninstall volume-autoscaler

(Alternate) Installation with `kubectl`


# This simple installation will work as long as you put this in the same namespace as Prometheus.
# The namespace in this yaml is hardcoded to `infrastructure`. If you'd like to change
# the namespace, use the second set of commands below.

# IF YOU USE `infrastructure` AS THE NAMESPACE FOR PROMETHEUS SIMPLY...
kubectl --namespace infrastructure apply -f https://devops-nirvana.s3.amazonaws.com/volume-autoscaler/volume-autoscaler-1.0.1.yaml

# OR, IF YOU NEED TO CHANGE THE NAMESPACE...
# #1: Download the yaml...
wget https://devops-nirvana.s3.amazonaws.com/volume-autoscaler/volume-autoscaler-1.0.1.yaml
# #1: Or download with curl
curl https://devops-nirvana.s3.amazonaws.com/volume-autoscaler/volume-autoscaler-1.0.1.yaml -o volume-autoscaler-1.0.1.yaml
# #2: Then replace the "infrastructure" namespace in it with your Prometheus namespace
sed 's/"infrastructure"/"PROMETHEUS_NAMESPACE_HERE"/g' volume-autoscaler-1.0.1.yaml > ./to_be_applied.yaml
# #3: If you wish to have slack notifications, edit this to_be_applied.yaml and embed your webhook on the value: line for SLACK_WEBHOOK and set the SLACK_CHANNEL as well accordingly
# #4: Finally, apply it...
kubectl --namespace REPLACEME_WITH_PROMETHEUS_NAMESPACE apply -f ./to_be_applied.yaml

Validation

To confirm the volume autoscaler is working properly, this repo has an example you can apply to your Kubernetes cluster: a PVC and a pod which uses that PVC and constantly fills the disk. To do this:

# Simply run this on your terminal
kubectl apply -f https://raw.githubusercontent.com/DevOps-Nirvana/Kubernetes-Volume-Autoscaler/master/examples/simple-pod-with-pvc.yaml

Then, if you'd like to follow along, follow the logs of your volume autoscaler to watch it detect the full disk and scale up.

Per-Volume Configuration / Annotations

This controller also supports tweaking your volume-autoscaler configuration per-PVC with annotations. The annotations supported are...

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sample-volume-claim
  annotations:
    # This is when we want to scale up after the disk is this percentage (out of 100) full
    volume.autoscaler.kubernetes.io/scale-above-percent: "80"   # 80 is the default value
    # This is how many intervals must go by above the scale-above-percent before triggering an autoscale action
    volume.autoscaler.kubernetes.io/scale-after-intervals: "5"  # 5 is the default value
    # This is how much to scale a disk up by, as a percentage of the current size.
    #   Eg: If this is set to "10" and the disk is 100GB, it will scale to 110GB
    #   At larger disk sizes you may want to set this on your PVCs to something like "5" or "10"
    volume.autoscaler.kubernetes.io/scale-up-percent: "20"      # 20 (percent) is the default value since 1.0.4 (it was 50 before)
    # This is the smallest increment to scale up by.  This helps when the disks are very small, and helps hit the minimum increment value per-provider (this is 1GB on AWS)
    volume.autoscaler.kubernetes.io/scale-up-min-increment: "1000000000"  # 1GB by default (in bytes)
    # This is the largest disk size ever allowed for this tool to scale up to.  This is set to 16TB by default, because that's the limit of AWS EBS
    volume.autoscaler.kubernetes.io/scale-up-max-size: "16000000000000"  # 16TB by default (in bytes)
    # How long (in seconds) we must wait before scaling this volume again.  For AWS EBS, this is 6 hours which is 21600 seconds but for good measure we add an extra 10 minutes to this, so 22200
    volume.autoscaler.kubernetes.io/scale-cooldown-time: "22200"  
    # If you want the autoscaler to completely ignore/skip this PVC, set this to "true"
    volume.autoscaler.kubernetes.io/ignore: "false"  
    # Finally, do not set this yourself, and ignore it if you see it: this is how Volume Autoscaler keeps its "state"
    volume.autoscaler.kubernetes.io/last-resized-at: "123123123"  # This will be a Unix epoch timestamp
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard
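
To make the interplay of these annotations concrete, here is a minimal Python sketch of how the threshold, percentage, minimum increment, maximum size, and cooldown could combine. It is an illustration under stated assumptions, not the autoscaler's actual code: the function names (`should_scale`, `next_size_bytes`, `cooldown_elapsed`) and the exact comparison and rounding rules are hypothetical.

```python
from typing import Optional


def should_scale(used_percent: float, intervals_above: int,
                 scale_above_percent: int = 80,
                 scale_after_intervals: int = 5) -> bool:
    """True once usage has been above the threshold for enough intervals."""
    return used_percent > scale_above_percent and intervals_above >= scale_after_intervals


def next_size_bytes(current: int, scale_up_percent: int,
                    min_increment: int = 1_000_000_000,
                    max_size: int = 16_000_000_000_000) -> Optional[int]:
    """Return the new size in bytes, or None if the volume cannot grow."""
    # Grow by a percentage of the current size, but never by less than the minimum.
    increment = max(current * scale_up_percent // 100, min_increment)
    # Never exceed the configured maximum size.
    proposed = min(current + increment, max_size)
    # Kubernetes rejects a PVC request smaller than the previous value,
    # so refuse to "resize" to a size at or below the current one.
    if proposed <= current:
        return None
    return proposed


def cooldown_elapsed(last_resized_at: int, now: int, cooldown: int = 22_200) -> bool:
    """True if enough seconds have passed since the last resize (Unix timestamps)."""
    return now - last_resized_at >= cooldown


# The 100GB example from the annotation comments, with scale-up-percent "10"
print(next_size_bytes(100_000_000_000, 10))  # 110000000000
```

Returning `None` when the clamped size is not larger than the current size is a deliberate choice in this sketch: it avoids sending the API server a request it would reject because a claim's storage request cannot decrease.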

TODO

This todo list is mostly for the author(s), but any contributions are also welcome. Please submit an Issue for issues or requests, or a Pull Request if you added some code.

Make helm chart able to customize the prometheus label selector
Add scale up max increment
Make log have more full (simplified) data about disks (max size, usage, etc, for debugging purposes)
Add dry-run as top-level arg to easily adjust, add to examples on this README
Push to helm repo in a Github Action and push the static yaml as well
Add tests coverage to ensure the software works as intended moving forward
Do some load testing to see how well this software deals with scale (100+ PVs, 500+ PVs, etc)
Figure out what type of Memory/CPU is necessary for 500+ PVs, see above
Add verbosity levels for print statements, to be able to quiet things down in the logs
Generate kubernetes EVENTS (add to rbac) so everyone knows we are doing things, to be a good controller
Add badges to the README
Listen/watch to events of the PV/PVC to monitor and ensure the resizing happens, log and/or slack it accordingly
Test it and add working examples of using this on other cloud providers (Azure / Google Cloud)
Make per-PVC annotations to (re)direct Slack to different webhooks and/or different channel(s)
Discuss what the ideal "default" amount of time before scaling is. Currently it is 5 minutes (5, 60-second intervals)


Comments

Autoscaling size below current size and PVC size not human readable.

Sometimes, the autoscaler tries to resize a PVC with a size below current size, raising an error.

Volume infra.data-nfs-server-provisioner-1637948923-0 is 85% in-use of the 80Gi available
  BECAUSE it is above 80% used
  ALERT has been for 1306 period(s) which needs to at least 5 period(s) to scale
  AND we need to scale it immediately, it has never been scaled previously
  RESIZING disk from 86G to 20G
  Exception raised while trying to scale up PVC infra.data-nfs-server-provisioner-1637948923-0 to 20000000000 ...
(422)
Reason: Unprocessable Entity
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'e69b53c3-d332-4925-b9ea-afa7570297a9', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': 'b64e47c9-2a4e-48ae-83bc-355685b6c007', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'e5841496-62d0-426a-a987-4b26ec143a20', 'Date': 'Sat, 22 Oct 2022 16:58:07 GMT', 'Content-Length': '520'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"PersistentVolumeClaim \"data-nfs-server-provisioner-1637948923-0\" is invalid: spec.resources.requests.storage: Forbidden: field can not be less than previous value","reason":"Invalid","details":{"name":"data-nfs-server-provisioner-1637948923-0","kind":"PersistentVolumeClaim","causes":[{"reason":"FieldValueForbidden","message":"Forbidden: field can not be less than previous value","field":"spec.resources.requests.storage"}]},"code":422}


FAILED requesting to scale up `infra.data-nfs-server-provisioner-1637948923-0` by `10%` from `86G` to `20G`, it was using more than `80%` disk space over the last `78360 seconds`

I'm using the helm chart version 1.0.3 (same image tag)

Another issue: the autoscaler was able to resize another PVC from 13Gi to 14173392076, which is not human-readable as before. It's not a serious issue but it's still disturbing. The autoscaler also sent the alert to Slack twice for this PVC, several hours apart.

opened by GuillaumeOuint 9

Customer-reported issue: Is not detecting updated/resized max size

There appears to be a bug in Prometheus Server which causes the kubelet_volume_stats_capacity_bytes to not be updated properly in Prometheus after a resize. Note: May need to go file a bug against the metrics-server or Prometheus.

After further investigation, it appears the prometheus metrics of kube_persistentvolume_capacity_bytes which is tied to the "PV" and not the "PVC" is fully updated, and we could (in theory) instead look there for the updated value but I believe this to be a bug which should be fixed in Prometheus.

opened by AndrewFarley 3

Handling low max edge-case better, human-readable debug output

Features

Updating various debug output to be human-readable, since sizes in bytes are really long with a lot of zeroes and not reasonably human-parsable

Catching an edge case where a user sets a max disk size that is too small, so a disk can't scale up any more

Closes #3 ( thanks for finding & reporting this @GuillaumeOuint )
opened by AndrewFarley 0

Releases(1.0.5)

1.0.5(Oct 25, 2022)
What's Changed

Handling low max disk size edge-case better, see: #4

Human-readable debug output much improved, see: #4

Full Changelog: https://github.com/DevOps-Nirvana/Kubernetes-Volume-Autoscaler/compare/1.0.4...1.0.5
volume-autoscaler-helm-chart-1.0.5.tgz(11.23 KB)
volume-autoscaler-kubectl-infrastructure-namespace-1.0.5.yaml(12.08 KB)
1.0.4(Oct 24, 2022)
What's Changed

Upgrade python to 3.9.15 by @pblgomez in https://github.com/DevOps-Nirvana/Kubernetes-Volume-Autoscaler/pull/2

Generate informational Prometheus metrics (version number and settings)

Generate usage & health Prometheus metrics (number of pvcs, number of resizes, etc)

Updating upstream universal helm chart

Scaled down default resize percentage to 20% (down from 50%) based on feedback

New Contributors

@pblgomez made their first contribution in https://github.com/DevOps-Nirvana/Kubernetes-Volume-Autoscaler/pull/2

Full Changelog: https://github.com/DevOps-Nirvana/Kubernetes-Volume-Autoscaler/compare/1.0.3...1.0.4
volume-autoscaler-helm-chart-1.0.4.tgz(11.23 KB)
volume-autoscaler-kubectl-infrastructure-namespace-1.0.4.yaml(12.08 KB)
1.0.3(Jan 24, 2022)
Handle signal from Kubernetes to kill/restart properly/quickly

Add full env vars as documentation markdown table, inside notes for development section below

Adding better exception logs via traceback, and more readable/reasonable log output especially when VERBOSE is enabled

Generate Kubernetes events so everyone viewing the event stream knows when actions occur

AKA, be a responsible controller

volume-autoscaler-helm-chart-1.0.3.tgz(9.99 KB)
volume-autoscaler-kubectl-infrastructure-namespace-1.0.3.yaml(8.82 KB)
1.0.2(Jan 15, 2022)
Automatically detecting version of Prometheus and using newer functions to de-bounce invalid PVCs automatically

Adding max-increment annotation/variable support

Adding exception handling in our main loop to handle jitter nicely and not fail catastrophically if someone has bad PVC annotations

Making all variables settable by a value in the helm chart

Adding verbose support, which when enabled prints out full data from the objects detected, and prints out even non-alerting disks

Printing the number of PVCs found in the log, useful when not in verbose mode

volume-autoscaler-helm-chart-1.0.2.tgz(10.01 KB)
volume-autoscaler-kubectl-infrastructure-namespace-1.0.2.yaml(8.82 KB)
1.0.1(Jan 4, 2022)

Initial public release, helm chart and kubectl raw yaml published. Examples available on the README. Future todos listed on README as well.
volume-autoscaler-helm-chart-1.0.1.tgz(8.84 KB)
volume-autoscaler-kubectl-infrastructure-namespace-1.0.1.yaml(8.01 KB)

Owner

DevOps Nirvana

What happens when you set everything up perfectly? Nirvana happens

