
Commit 5dbfaaf

arjunrn authored and josephburnett committed
Configurable Scaling for the HPA (kubernetes#18157)
* Add configurable scale behavior.
* Added documentation about configurable scaling in the HPA

Signed-off-by: Arjun Naik <[email protected]>
Co-authored-by: Joseph Burnett <[email protected]>
1 parent c192154 · commit 5dbfaaf

File tree: 1 file changed, +164 −9 lines changed

content/en/docs/tasks/run-application/horizontal-pod-autoscale.md

Lines changed: 164 additions & 9 deletions
@@ -3,6 +3,7 @@ reviewers:
- fgrzadkowski
- jszczepkowski
- directxman12
- josephburnett
title: Horizontal Pod Autoscaler
feature:
  title: Horizontal scaling
@@ -162,11 +163,15 @@ can be fetched, scaling is skipped. This means that the HPA is still capable
of scaling up if one or more metrics give a `desiredReplicas` greater than
the current value.

Finally, just before HPA scales the target, the scale recommendation is
recorded. The controller considers all recommendations within a configurable
window, choosing the highest recommendation from within that window. This value
can be configured using the
`--horizontal-pod-autoscaler-downscale-stabilization` flag or the HPA object's
`behavior.scaleDown.stabilizationWindowSeconds` field (see [Support for
configurable scaling behavior](#support-for-configurable-scaling-behavior)),
which defaults to 5 minutes. This means that scaledowns will occur gradually,
smoothing out the impact of rapidly fluctuating metric values.

## API Object

@@ -213,10 +218,7 @@ When managing the scale of a group of replicas using the Horizontal Pod Autoscaler,
it is possible that the number of replicas keeps fluctuating frequently due to the
dynamic nature of the metrics evaluated. This is sometimes referred to as *thrashing*.

Starting from v1.12, a new algorithmic update removes the need for an
upscale delay.

- `--horizontal-pod-autoscaler-downscale-stabilization`: The value for this option is a
@@ -232,6 +234,11 @@ the delay value is set too short, the scale of the replicas set may keep thrashing as
usual.
{{< /note >}}

Starting from v1.17, the downscale stabilization window can be set on a per-HPA
basis by setting the `behavior.scaleDown.stabilizationWindowSeconds` field in
the v2beta2 API. See [Support for configurable scaling
behavior](#support-for-configurable-scaling-behavior).

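For illustration, here is a minimal sketch of an HPA object that lengthens its own
downscale window; the object, target, and metric names are placeholders, and only
the `behavior` stanza is the point:

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa              # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment    # illustrative target
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600  # 10 minutes instead of the default 5
```
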
## Support for multiple metrics

Kubernetes 1.6 adds support for scaling based on multiple metrics. You can use the `autoscaling/v2beta2` API
@@ -282,6 +289,154 @@ and [external.metrics.k8s.io](https://github.com/kubernetes/community/blob/maste
For examples of how to use them see [the walkthrough for using custom metrics](/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/#autoscaling-on-multiple-metrics-and-custom-metrics)
and [the walkthrough for using external metrics](/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/#autoscaling-on-metrics-not-related-to-kubernetes-objects).

## Support for configurable scaling behavior

Starting from
[v1.17](https://github.com/kubernetes/enhancements/blob/master/keps/sig-autoscaling/20190307-configurable-scale-velocity-for-hpa.md)
the `v2beta2` API allows scaling behavior to be configured through the HPA
`behavior` field. Behaviors are specified separately for scaling up and down in
the `scaleUp` or `scaleDown` section under the `behavior` field. A stabilization
window can be specified for both directions, which prevents the flapping of the
number of replicas in the scaling target. Similarly, specifying scaling
policies controls the rate of change of replicas while scaling.

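Sketched schematically, the overall shape of the field is as follows; the values
here are placeholders, and the subsections below explain each part:

```yaml
behavior:
  scaleUp:
    stabilizationWindowSeconds: 0   # how far back to look before scaling up
    selectPolicy: Max               # which policy wins when several apply
    policies:
    - type: Pods                    # absolute number of pods per period
      value: 4
      periodSeconds: 15
  scaleDown:
    stabilizationWindowSeconds: 300
    selectPolicy: Min
    policies:
    - type: Percent                 # percentage of current replicas per period
      value: 10
      periodSeconds: 60
```
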
### Scaling Policies

One or more scaling policies can be specified in the `behavior` section of the spec.
When multiple policies are specified, the policy which allows the highest amount of
change is selected by default. The following example shows this behavior
while scaling down:

```yaml
behavior:
  scaleDown:
    policies:
    - type: Pods
      value: 4
      periodSeconds: 60
    - type: Percent
      value: 10
      periodSeconds: 60
```

When the number of pods is more than 40, the second policy will be used for scaling down.
For instance, if there are 80 replicas and the target has to be scaled down to 10 replicas,
then during the first step 8 replicas will be reduced. In the next iteration, when the number
of replicas is 72, 10% of the pods is 7.2, but the number is rounded up to 8. On each loop of
the autoscaler controller the number of pods to be changed is re-calculated based on the number
of current replicas. When the number of replicas falls below 40, the first policy _(Pods)_ is
applied and 4 replicas will be reduced at a time.

`periodSeconds` indicates the length of time in the past for which the policy must hold true.
The first policy allows at most 4 replicas to be scaled down in one minute. The second policy
allows at most 10% of the current replicas to be scaled down in one minute.

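Policies in the same direction may also use different periods. A sketch with
illustrative values; under the default `Max` selection, the policy allowing the
greater change still wins on each iteration:

```yaml
behavior:
  scaleDown:
    policies:
    - type: Pods
      value: 4
      periodSeconds: 60    # at most 4 pods over the trailing minute
    - type: Percent
      value: 10
      periodSeconds: 300   # at most 10% over the trailing five minutes
```
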
The policy selection can be changed by specifying the `selectPolicy` field for a scaling
direction. Setting the value to `Min` selects the policy which allows the
smallest change in the replica count. Setting the value to `Disabled` completely disables
scaling in that direction.

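For example, applying the most conservative of the two scale-down policies from the
earlier example is a matter of adding `selectPolicy: Min` (a sketch reusing those
illustrative values):

```yaml
behavior:
  scaleDown:
    selectPolicy: Min
    policies:
    - type: Pods
      value: 4
      periodSeconds: 60
    - type: Percent
      value: 10
      periodSeconds: 60
```
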
### Stabilization Window

The stabilization window is used to restrict the flapping of replicas when the metrics
used for scaling keep fluctuating. The autoscaling algorithm uses the desired states
computed over this window in the past to prevent premature scaling. In
the following example the stabilization window is specified for `scaleDown`.

```yaml
scaleDown:
  stabilizationWindowSeconds: 300
```

When the metrics indicate that the target should be scaled down, the algorithm looks
into previously computed desired states and uses the highest value from the specified
interval. In the above example, all desired states from the past 5 minutes will be considered.

### Default Behavior

To use custom scaling, not all fields have to be specified; only the values which need
to be customized have to be set. These custom values are merged with the default values,
which match the existing behavior in the HPA algorithm.

```yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
    - type: Pods
      value: 4
      periodSeconds: 15
    selectPolicy: Max
```
For scaling down the stabilization window is _300_ seconds (or the value of the
`--horizontal-pod-autoscaler-downscale-stabilization` flag if provided). There is only a single policy
for scaling down, which allows 100% of the currently running replicas to be removed, which
means the scaling target can be scaled down to the minimum allowed replicas.
For scaling up there is no stabilization window. When the metrics indicate that the target should be
scaled up, the target is scaled up immediately. There are 2 policies: 4 pods or 100% of the currently
running replicas will be added every 15 seconds, whichever is higher, till the HPA reaches its steady state.

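Because custom values are merged with the defaults, overriding one direction leaves
the other at its default. A sketch: specifying only a `scaleUp` policy (illustrative
value) keeps the default scale-down behavior shown above:

```yaml
behavior:
  scaleUp:
    policies:
    - type: Pods
      value: 2            # illustrative: add at most 2 pods every 60 seconds
      periodSeconds: 60
    # scaleDown is omitted, so the default scale-down behavior above applies
```
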
### Example: change downscale stabilization window

To provide a custom downscale stabilization window of 1 minute, the following
behavior would be added to the HPA:

```yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 60
```

### Example: limit scale down rate

To limit the rate at which pods are removed by the HPA to 10% per minute, the
following behavior would be added to the HPA:

```yaml
behavior:
  scaleDown:
    policies:
    - type: Percent
      value: 10
      periodSeconds: 60
```

To ensure that no more than 5 pods are removed per minute, a second policy with a
fixed size of 5 can be added, with the selection strategy set to minimum:

```yaml
behavior:
  scaleDown:
    policies:
    - type: Percent
      value: 10
      periodSeconds: 60
    - type: Pods
      value: 5
      periodSeconds: 60
    selectPolicy: Min
```

### Example: disable scale down

The `selectPolicy` value of `Disabled` turns off scaling in the given direction.
So to prevent downscaling the following policy would be used:

```yaml
behavior:
  scaleDown:
    selectPolicy: Disabled
```

{{% /capture %}}

{{% capture whatsnext %}}
