How-to: Persist Scheduler Jobs

Configure Scheduler to persist its database to make it resilient to restarts

The Scheduler service is responsible for writing jobs to its embedded Etcd database and scheduling them for execution. On fresh Dapr v1.18+ installs, the Scheduler writes its Etcd data to a Persistent Volume Claim (PVC) of size 16Gi, using the cluster’s default storage class. Earlier versions defaulted to 1Gi, and clusters upgraded from those versions keep their original PVC size because spec.volumeClaimTemplates is immutable on an existing StatefulSet; the Helm chart detects the existing StatefulSet and pins storageSize to the value already in use. This means that no additional parameters are required to run the Scheduler service reliably on most Kubernetes deployments, although you will need additional configuration if a default StorageClass is not available or when running in a production environment.

Production Setup

Etcd Storage Disk Size

The default storage size for the Scheduler is 16Gi on fresh Dapr v1.18+ installs, and 1Gi on earlier versions (and clusters upgraded from them). The legacy 1Gi is likely not sufficient for most production deployments, and even the new 16Gi default may need to be raised for higher-throughput workloads. When the storage size is exceeded, the Scheduler will log an error similar to the following:

error running scheduler: etcdserver: mvcc: database space exceeded
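
To check whether a Scheduler replica has already hit this limit, one option is to search its logs for the error text. The sketch below wraps that in a small shell helper. The function name scheduler_space_exceeded is hypothetical, and the default pod name dapr-scheduler-server-0 is an assumption based on the StatefulSet naming in the dapr-system namespace.

```shell
# Hypothetical helper: returns success (0) if the given Scheduler pod's logs
# contain the Etcd "database space exceeded" error.
# Assumes the pod runs in the dapr-system namespace.
scheduler_space_exceeded() {
  pod="${1:-dapr-scheduler-server-0}"
  kubectl logs -n dapr-system "$pod" 2>/dev/null \
    | grep -q "database space exceeded"
}
```

For example, `scheduler_space_exceeded dapr-scheduler-server-1 && echo "storage quota exceeded"` reports the condition for the second replica.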

Determining a safe upper bound for your storage size is not an exact science; it depends heavily on the number, persistence, and data payload size of your application jobs. Usage of the Job API and Actor Reminders by your applications maps one-to-one to Scheduler jobs. Workflows create a large number of jobs as Actor Reminders; however, these jobs are short-lived, matching the lifecycle of each workflow execution. The data payload of jobs created by Workflows is typically empty or small.

The Scheduler uses Etcd as its storage backend database. By design, Etcd persists historical transactions and data in the form of Write-Ahead Logs (WAL) and snapshots. This means the actual disk usage of the Scheduler will be higher than the current observable database state, often several times higher.
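
As a rough illustration of how these factors combine, the following shell sketch multiplies an assumed job count, an assumed average on-disk size per job, and an assumed overhead multiplier for WAL and snapshot history, then rounds up to whole GiB. The function name and all three inputs are hypothetical placeholders, not values measured from Dapr or Etcd; substitute figures observed in your own cluster.

```shell
# Back-of-the-envelope sizing sketch. Every input is an assumption.
# Usage: estimate_scheduler_gib <job_count> <avg_bytes_per_job> <overhead_multiplier>
estimate_scheduler_gib() {
  jobs="$1"
  bytes_per_job="$2"
  multiplier="$3"
  gib=$(( 1024 * 1024 * 1024 ))
  total_bytes=$(( jobs * bytes_per_job * multiplier ))
  # Integer ceiling division: round up to the next whole GiB.
  echo $(( (total_bytes + gib - 1) / gib ))
}
```

With these made-up inputs, `estimate_scheduler_gib 1000000 2048 4` prints 8, meaning one million jobs at about 2 KiB each with a fourfold history overhead would need roughly 8Gi. Treat the result as a starting point and leave headroom above it.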

Setting the Storage Size on Installation

If you need to increase an existing Scheduler storage size, see the Increase existing Scheduler Storage Size section below. To set the storage size explicitly (in this example matching the 16Gi default) for a fresh Dapr installation, use either of the following commands (Dapr CLI or Helm):

dapr init -k --set dapr_scheduler.cluster.storageSize=16Gi --set dapr_scheduler.etcdSpaceQuota=16Gi

helm upgrade --install dapr dapr/dapr \
  --version=1.18 \
  --namespace dapr-system \
  --create-namespace \
  --set dapr_scheduler.cluster.storageSize=16Gi \
  --set dapr_scheduler.etcdSpaceQuota=16Gi \
  --wait

Increase existing Scheduler Storage Size

On clusters upgraded from a version earlier than Dapr v1.18, each Scheduler replica typically has a 1Gi PVC (inherited from the earlier default) provisioned from the default storage class, named standard in this example. The procedure below applies whenever you need to grow existing PVCs, regardless of their starting size. The PVCs and their backing Persistent Volumes will look similar to the following; in this example the Scheduler is running in HA mode with 3 replicas.

NAMESPACE     NAME                                              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
dapr-system   dapr-scheduler-data-dir-dapr-scheduler-server-0   Bound    pvc-9f699d2e-f347-43b0-aa98-57dcf38229c5   1Gi        RWO            standard       <unset>                 3m25s
dapr-system   dapr-scheduler-data-dir-dapr-scheduler-server-1   Bound    pvc-f4c8be7b-ffbe-407b-954e-7688f2482caa   1Gi        RWO            standard       <unset>                 3m25s
dapr-system   dapr-scheduler-data-dir-dapr-scheduler-server-2   Bound    pvc-eaad5fb1-98e9-42a5-bcc8-d45dba1c4b9f   1Gi        RWO            standard       <unset>                 3m25s
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                         STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
pvc-9f699d2e-f347-43b0-aa98-57dcf38229c5   1Gi        RWO            Delete           Bound    dapr-system/dapr-scheduler-data-dir-dapr-scheduler-server-0   standard       <unset>                          4m24s
pvc-eaad5fb1-98e9-42a5-bcc8-d45dba1c4b9f   1Gi        RWO            Delete           Bound    dapr-system/dapr-scheduler-data-dir-dapr-scheduler-server-2   standard       <unset>                          4m24s
pvc-f4c8be7b-ffbe-407b-954e-7688f2482caa   1Gi        RWO            Delete           Bound    dapr-system/dapr-scheduler-data-dir-dapr-scheduler-server-1   standard       <unset>                          4m24s

To expand the storage size of the Scheduler, follow these steps:

  1. First, ensure that the storage class supports volume expansion: the allowVolumeExpansion field must be set to true. Set it if it is not already.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: my.driver
allowVolumeExpansion: true
...
  2. Delete the Scheduler StatefulSet whilst preserving the Bound Persistent Volume Claims.
kubectl delete sts -n dapr-system dapr-scheduler-server --cascade=orphan
  3. Increase the size of the Persistent Volume Claims to the desired size by editing the spec.resources.requests.storage field. Again, in this case we are assuming that the Scheduler is running in HA mode with 3 replicas.
kubectl edit pvc -n dapr-system dapr-scheduler-data-dir-dapr-scheduler-server-0 dapr-scheduler-data-dir-dapr-scheduler-server-1 dapr-scheduler-data-dir-dapr-scheduler-server-2
  4. Recreate the Scheduler StatefulSet by installing Dapr with the desired storage size, as described in Setting the Storage Size on Installation above.
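
For scripted environments, the storage class check and the PVC resize above can also be done non-interactively. The following is a minimal sketch, not the official procedure: the function names are hypothetical, the PVC names assume the default naming shown in the listing above, and the resize is assumed to run after the StatefulSet has been deleted with --cascade=orphan.

```shell
# Hypothetical helper: succeeds if the named StorageClass has
# allowVolumeExpansion set to true.
storage_class_expandable() {
  [ "$(kubectl get storageclass "$1" -o jsonpath='{.allowVolumeExpansion}')" = "true" ]
}

# Hypothetical helper: patch each Scheduler PVC to the requested size.
# Usage: resize_scheduler_pvcs <new_size> [replica_count, default 3]
resize_scheduler_pvcs() {
  size="$1"
  replicas="${2:-3}"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # Merge-patch spec.resources.requests.storage instead of editing by hand.
    kubectl patch pvc -n dapr-system \
      "dapr-scheduler-data-dir-dapr-scheduler-server-$i" \
      --type merge \
      -p "{\"spec\":{\"resources\":{\"requests\":{\"storage\":\"$size\"}}}}" || return 1
    i=$(( i + 1 ))
  done
}
```

A typical invocation would be `storage_class_expandable standard && resize_scheduler_pvcs 16Gi 3`, followed by reinstalling Dapr with the matching storage size.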

Storage Class

If your Kubernetes deployment does not have a default storage class, or you are configuring a production cluster, you must define a storage class.
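
To find out whether a cluster already has a default storage class, one simple approach is to look for the "(default)" marker that kubectl prints next to the default class name. The helper below is a sketch; its name is hypothetical, and it relies on the human-readable output of kubectl get storageclass.

```shell
# Hypothetical helper: succeeds if any StorageClass is marked as the default.
# kubectl appends "(default)" to the name of the default class in its table output.
has_default_storage_class() {
  kubectl get storageclass 2>/dev/null | grep -q "(default)"
}
```

For example, `has_default_storage_class || echo "no default StorageClass found"` flags clusters where the storage class must be set explicitly.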

A persistent volume is backed by a real disk that is provided by the hosting cloud provider or Kubernetes infrastructure platform. Disk size is determined by how many jobs are expected to be persisted at once; however, 64Gi should be more than sufficient for most production scenarios.

For production, use a premium SSD-backed storage class to give Etcd the IOPS and latency profile it requires. On lower-tier storage classes the Scheduler’s embedded Etcd can log slow-disk heartbeat warnings (leader failed to send out heartbeat on time; took too long, leader is overloaded likely from slow disk).

Where supported, also prefer storage classes that support multi-zone failover (for example, zone-redundant or regional persistent disks) so Scheduler PVCs are not locked to a single availability zone. Zone-locked PVCs can block Scheduler recovery during cluster upgrades or zonal disruption until the original zone becomes available again.

Some Kubernetes providers recommend using a CSI driver to provision the underlying disks. Below is a list of useful links to the relevant documentation for creating a persistent disk for the major cloud providers:

Once the storage class is available, you can install Dapr using either of the following commands (Dapr CLI or Helm), with Scheduler configured to use the storage class (replace my-storage-class with the name of the storage class):

dapr init -k --set dapr_scheduler.cluster.storageClassName=my-storage-class

helm upgrade --install dapr dapr/dapr \
  --version=1.18 \
  --namespace dapr-system \
  --create-namespace \
  --set dapr_scheduler.cluster.storageClassName=my-storage-class \
  --wait

Ephemeral Storage

When running in non-HA mode, the Scheduler can optionally use ephemeral storage, which is in-memory storage that is not resilient to restarts: all job data is lost after a Scheduler restart. This is useful in non-production deployments or for testing, where persistent storage is not available or not required.

dapr init -k --set dapr_scheduler.cluster.inMemoryStorage=true

helm upgrade --install dapr dapr/dapr \
  --version=1.18 \
  --namespace dapr-system \
  --create-namespace \
  --set dapr_scheduler.cluster.inMemoryStorage=true \
  --wait