Autoscale Profile


An autoscale profile is a named preset that expands into a complete operatorBox.autoscale block at Katalog load time.

Profiles are relative — they use the CRD’s declared workers and queue.maxQueueDepth as a baseline, then compute worker overrides, queue overrides, trigger thresholds, and timing from that baseline.


Profiles

| Profile | Trigger | Workers override | Queue override | Interval | Cooldown |
|---|---|---|---|---|---|
| burst | queueDepth > 60% of override queueDepth | baseline × 4 | baseline × 10 | 5s | 30s |
| steady | queueDepth > 40% AND workersBusy > 70% | baseline × 2 | baseline × 3 | 30s | 2m |
| batch | cron window 23:00 → 02:00 | baseline × 3 | baseline × 8 | 60s | 5m |
| latency-sensitive | P95 reconcile > 200ms | ⌈baseline × 2.5⌉ | (none) | 15s | 1m |
| cost-optimized | workersIdle > 60% AND queueDepth > 80% | max(1, baseline × 0.5) | baseline × 0.5 | 30s | 10m |
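
The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration for the three profiles with whole-number multipliers, not the Katalog loader itself: `Preset`, `PRESETS`, `expand`, and the keys of the returned dictionary are hypothetical names chosen for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    workers_factor: int   # multiplier applied to the declared workers
    queue_factor: int     # multiplier applied to queue.maxQueueDepth
    interval: str
    cooldown: str


# Multipliers and timings copied from the profile table.
# latency-sensitive and cost-optimized are left out because they need rounding.
PRESETS = {
    "burst": Preset(4, 10, "5s", "30s"),
    "steady": Preset(2, 3, "30s", "2m"),
    "batch": Preset(3, 8, "60s", "5m"),
}


def expand(profile: str, workers: int, max_queue_depth: int) -> dict:
    """Compute the override values for a profile from the declared baseline."""
    preset = PRESETS[profile]
    return {
        "workers": workers * preset.workers_factor,
        "queueDepth": max_queue_depth * preset.queue_factor,
        "interval": preset.interval,
        "cooldown": preset.cooldown,
    }
```

With a baseline of 4 workers and a maxQueueDepth of 100, `expand("burst", 4, 100)` yields 16 workers and a queue depth of 1000.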

Usage

operatorBox:
  workers: 4
  queue:
    maxQueueDepth: 100
  autoscale:
    profile: steady

For steady with the above baseline (workers=4, maxQueueDepth=100):

  • override workers = 4 × 2 = 8
  • override queueDepth = 100 × 3 = 300
  • trigger threshold = 300 × 40% = 120
  • trigger condition: queueDepth > 120 AND workersBusyPercent > 70
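
The resulting trigger can be checked against sample metrics with a small sketch. The function name and its parameters are hypothetical; the only values taken from this page are the 40% threshold, the 70% busy limit, and the override queue depth of 300 from the example.

```python
def steady_triggered(queue_depth: int, workers_busy_percent: float,
                     override_queue_depth: int = 300) -> bool:
    """Return True when both steady conditions hold (strictly greater than)."""
    threshold = override_queue_depth * 40 / 100  # 300 -> 120.0
    return queue_depth > threshold and workers_busy_percent > 70


# Queue pressure alone is not enough: workers must also be saturated.
print(steady_triggered(150, 75))  # both conditions hold
print(steady_triggered(150, 60))  # queue is deep, workers are not busy enough
```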

Rules

Profile or explicit — not both.

# Valid: profile only
autoscale:
  profile: burst

# Valid: explicit block only
autoscale:
  interval: 10s
  cooldown: 1m
  conditions:
    when:
      - field: metrics.queueDepth
        greaterThan: "500"
  do:
    workers: 16

# Invalid — rejected at load time
autoscale:
  profile: burst
  interval: 10s

Unknown profiles fail fast. An unrecognized name is a Katalog load error.
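
Both rules can be expressed as a short validation sketch. It assumes the autoscale block has already been parsed into a dictionary; `KatalogLoadError` and `validate_autoscale` are hypothetical names, not part of any documented API.

```python
KNOWN_PROFILES = {"burst", "steady", "batch", "latency-sensitive", "cost-optimized"}


class KatalogLoadError(ValueError):
    """Hypothetical error type standing in for a Katalog load failure."""


def validate_autoscale(block: dict) -> None:
    """Reject a profile mixed with explicit fields, and unknown profile names."""
    profile = block.get("profile")
    if profile is None:
        return  # explicit block: nothing profile-related to check
    explicit_fields = sorted(key for key in block if key != "profile")
    if explicit_fields:
        raise KatalogLoadError(
            f"profile {profile!r} cannot be combined with {explicit_fields}"
        )
    if profile not in KNOWN_PROFILES:
        raise KatalogLoadError(f"unknown autoscale profile {profile!r}")
```

Calling `validate_autoscale({"profile": "burst", "interval": "10s"})` raises, which mirrors the invalid example above.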


Profile details

burst

React quickly to spikes. Aggressive scaling, short interval, short cooldown.

trigger:  queueDepth > 60% of override queueDepth
override: workers × 4, queueDepth × 10
timing:   interval 5s, cooldown 30s

Use when your operator handles unpredictable bursts and you need it to scale out before the queue fills.

steady

Smooth, predictable scaling. Requires both queue pressure and worker saturation before scaling.

trigger:  queueDepth > 40% of override queueDepth AND workersBusyPercent > 70
override: workers × 2, queueDepth × 3
timing:   interval 30s, cooldown 2m

Use for production operators with consistent load patterns.

batch

Scale for a nightly processing window. Time-triggered, not load-triggered.

trigger:  cron "0 23 * * *", duration 3h
override: workers × 3, queueDepth × 8
timing:   interval 60s, cooldown 5m

Use when you have a known batch window (nightly jobs, end-of-day processing).
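
To see which timestamps fall inside the batch window, the cron start of 23:00 and the 3h duration can be modelled directly. This sketch makes two assumptions that this page does not state: times are naive local times, and the window end (02:00) is exclusive. `in_batch_window` is a hypothetical helper.

```python
from datetime import datetime, time, timedelta

WINDOW_START = time(23, 0)           # cron "0 23 * * *"
WINDOW_LENGTH = timedelta(hours=3)   # duration 3h


def in_batch_window(now: datetime) -> bool:
    """Return True if `now` lies in a window opened today or yesterday at 23:00."""
    for days_back in (0, 1):
        start = datetime.combine(now.date() - timedelta(days=days_back), WINDOW_START)
        if start <= now < start + WINDOW_LENGTH:
            return True
    return False
```

The check looks at yesterday's window as well because the window crosses midnight: 01:30 belongs to the window that opened at 23:00 the previous day.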

latency-sensitive

Keep reconcile latency low. Triggered by P95 latency, not queue depth.

trigger:  reconcileDurationP95Ms > 200
override: workers × 2.5 (ceiling)
timing:   interval 15s, cooldown 1m

Use when users are sensitive to reconcile delay and you want to scale out before latency compounds.
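
Because the multiplier is 2.5, the ceiling matters for odd baselines. A minimal sketch of the worker calculation, with `latency_sensitive_workers` as a hypothetical name:

```python
import math


def latency_sensitive_workers(baseline: int) -> int:
    """Apply the x2.5 multiplier and round up to a whole worker."""
    return math.ceil(baseline * 2.5)


for baseline in (1, 2, 3, 4):
    print(baseline, "->", latency_sensitive_workers(baseline))
# 1 -> 3
# 2 -> 5
# 3 -> 8
# 4 -> 10
```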

cost-optimized

Minimize resource usage during low activity. Scale down aggressively when idle.

trigger:  workersIdlePercent > 60 AND queueDepth > 80% of override queueDepth
override: max(1, workers × 0.5), queueDepth × 0.5
timing:   interval 30s, cooldown 10m

Use for operators with long idle periods between bursts, where you want to give back resources between processing windows.
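
The scale-down arithmetic can be sketched as follows. This page does not say how a fractional result such as 3 × 0.5 is rounded, so the sketch assumes rounding down before the floor of 1 is applied; that choice, like the function name and the returned keys, is an assumption.

```python
def cost_optimized_overrides(workers: int, max_queue_depth: int) -> dict:
    """Halve workers (never below 1) and halve the queue depth.

    Assumption: fractional results are rounded down with integer division.
    """
    override_queue = max_queue_depth // 2
    return {
        "workers": max(1, workers // 2),
        "queueDepth": override_queue,
        # Trigger threshold: 80% of the override queue depth.
        "queueThreshold": override_queue * 80 / 100,
    }
```

For the baseline used earlier (4 workers, maxQueueDepth 100) this gives 2 workers, a queue depth of 50, and a queue threshold of 40. A baseline of 1 worker stays at 1 because of the `max(1, ...)` floor.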


Choosing a profile

| Situation | Profile |
|---|---|
| Unpredictable spikes, must not queue-block | burst |
| Consistent throughput, standard production | steady |
| Known batch window (nightly, weekly) | batch |
| Latency SLO, users feel slow reconciles | latency-sensitive |
| Long idle periods, cost matters | cost-optimized |

If none fit, write the autoscale block manually.