Resolve confusing use of TooManyRequests error for eviction #133097
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Hi @kei01234kei. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/sig apps
Let me also assign you as reviewers because I saw you in the issue discussion.
/ok-to-test
err := errors.NewTooManyRequests("Cannot evict pod as it would violate the pod's disruption budget.", 0)
err.ErrStatus.Details.Causes = append(err.ErrStatus.Details.Causes, metav1.StatusCause{Type: policyv1.DisruptionBudgetCause, Message: fmt.Sprintf("The disruption budget %s needs %d healthy pods and has %d currently", pdb.Name, pdb.Status.DesiredHealthy, pdb.Status.CurrentHealthy)})
condition := meta.FindStatusCondition(pdb.Status.Conditions, policyv1.DisruptionAllowedCondition)
if condition.Status == metav1.ConditionFalse {
condition.Status will panic if condition is nil, which FindStatusCondition will return if the condition is not present.
I'd suggest customizing how we construct the message based on CurrentHealthy / DesiredHealthy / presence of a SyncFailedReason or other False condition, with a sensible generic fallback to avoid being confusing. I'd also suggest keeping the existing message as-is if CurrentHealthy <= DesiredHealthy since that is not confusing.
condition := meta.FindStatusCondition(pdb.Status.Conditions, policyv1.DisruptionAllowedCondition)
var msg string
switch {
case pdb.Status.CurrentHealthy <= pdb.Status.DesiredHealthy:
	msg = fmt.Sprintf("The disruption budget %s needs %d healthy pods and has %d currently", pdb.Name, pdb.Status.DesiredHealthy, pdb.Status.CurrentHealthy)
case condition != nil && condition.Status == metav1.ConditionFalse && len(condition.Message) > 0 && condition.Reason == policy.SyncFailedReason:
	msg = fmt.Sprintf("The disruption budget %s does not allow evicting pods currently because it failed sync: %v", pdb.Name, condition.Message)
case condition != nil && condition.Status == metav1.ConditionFalse && len(condition.Message) > 0:
	msg = fmt.Sprintf("The disruption budget %s does not allow evicting pods currently: %v", pdb.Name, condition.Message)
default:
	msg = fmt.Sprintf("The disruption budget %s does not allow evicting pods currently", pdb.Name)
}
err.ErrStatus.Details.Causes = append(err.ErrStatus.Details.Causes, metav1.StatusCause{Type: policyv1.DisruptionBudgetCause, Message: msg})
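To make the nil-safety point concrete, here is a minimal, self-contained sketch of the same switch. It does not import the Kubernetes packages: condition, pdbStatus, findCondition, and budgetMessage are hypothetical stand-ins, and the string values "DisruptionAllowed", "False", and "SyncFailed" are assumed to match the corresponding policyv1/metav1 constants. It is an illustration of the suggested flow, not the code in the PR.

```go
package main

import "fmt"

// condition is a stand-in for metav1.Condition (only the fields used here).
type condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// pdbStatus is a stand-in for the PodDisruptionBudget fields the switch reads.
type pdbStatus struct {
	Name           string
	DesiredHealthy int32
	CurrentHealthy int32
	Conditions     []condition
}

// findCondition mimics meta.FindStatusCondition: it returns nil when no
// condition of the given type is present, so callers must nil-check.
func findCondition(conds []condition, condType string) *condition {
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i]
		}
	}
	return nil
}

// budgetMessage mirrors the switch in the review suggestion above.
func budgetMessage(pdb pdbStatus) string {
	cond := findCondition(pdb.Conditions, "DisruptionAllowed") // assumed constant value
	// Every access to cond is guarded by the nil check, so a missing
	// condition falls through to the default case instead of panicking.
	blocked := cond != nil && cond.Status == "False" && len(cond.Message) > 0
	switch {
	case pdb.CurrentHealthy <= pdb.DesiredHealthy:
		return fmt.Sprintf("The disruption budget %s needs %d healthy pods and has %d currently", pdb.Name, pdb.DesiredHealthy, pdb.CurrentHealthy)
	case blocked && cond.Reason == "SyncFailed": // assumed constant value
		return fmt.Sprintf("The disruption budget %s does not allow evicting pods currently because it failed sync: %v", pdb.Name, cond.Message)
	case blocked:
		return fmt.Sprintf("The disruption budget %s does not allow evicting pods currently: %v", pdb.Name, cond.Message)
	default:
		return fmt.Sprintf("The disruption budget %s does not allow evicting pods currently", pdb.Name)
	}
}

func main() {
	// Too few healthy pods: the existing message is kept.
	fmt.Println(budgetMessage(pdbStatus{Name: "web", DesiredHealthy: 2, CurrentHealthy: 1}))
	// Enough healthy pods, but the budget failed to sync.
	fmt.Println(budgetMessage(pdbStatus{
		Name: "web", DesiredHealthy: 2, CurrentHealthy: 3,
		Conditions: []condition{{Type: "DisruptionAllowed", Status: "False", Reason: "SyncFailed", Message: "selector error"}},
	}))
	// No conditions at all: findCondition returns nil and the generic fallback is used.
	fmt.Println(budgetMessage(pdbStatus{Name: "web", DesiredHealthy: 2, CurrentHealthy: 3}))
}
```

The third call is the case that would have panicked with the unguarded condition.Status access.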
+1 for the above flow
Perhaps this part could also output conditions that have no condition.Message, and also print the condition.Reason.
switch {
...
case condition != nil && condition.Status == metav1.ConditionFalse && len(condition.Message) > 0:
	msg = fmt.Sprintf("The disruption budget %s does not allow evicting pods currently: %v", pdb.Name, condition.Message)
...
@liggitt Thank you for the good advice. I modified the code following your advice.
@atiratree I changed the code to output the condition.Reason too, in the condition != nil && condition.Status == metav1.ConditionFalse && len(condition.Message) > 0 case.
@kei01234kei It would be great if we could test the new errors. I think we can add new cases to this unit test:
func TestEvictionIgnorePDB(t *testing.T) {
Btw, the name TestEvictionIgnorePDB no longer describes all of its test cases well, because not all of the cases ignore PDBs. The easiest way to fix this, IMO, is as follows:
s/TestEviction/TestEvictionWithETCD
s/TestEvictionIgnorePDB/TestEviction
Force-pushed from ef05923 to ef94aed.
The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass. This bot retests PRs for certain kubernetes repos according to the following rules:
You can:
/retest
/hold
/label tide/merge-method-squash
Force-pushed from ac72fae to aff1940.
Yes. I modified the test.
Force-pushed from e9085b3 to aff1940.
Thanks, please squash to a single commit.
modify test "the error includes the reason when the condition.Status is False"
Force-pushed from aff1940 to d014398.
Done.
/retest
/lgtm
LGTM label has been added. Git tree hash: 7c5e5d13cb5cebc0e93e7dba9e956eaec5e9b5a9
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kei01234kei, liggitt The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest
What type of PR is this?
/kind bug
What this PR does / why we need it:
To resolve the issue "Confusing use of TooManyRequests error for eviction."
Which issue(s) this PR is related to:
Fixes #106286
Special notes for your reviewer:
#106286 (comment)
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: