Skip to content

fix: handle corner cases in the async preemption #133213

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

sanposhiho
Copy link
Member

What type of PR is this?

/kind bug

What this PR does / why we need it:

See #133167, which was reverted due to a failing test. This PR contains additional fix to a test as well.

Which issue(s) this PR is related to:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 25, 2025
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Jul 25, 2025
@k8s-ci-robot k8s-ci-robot requested review from dom4ha and kerthcet July 25, 2025 10:04
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sanposhiho

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 25, 2025
@sanposhiho
Copy link
Member Author

/hold

Need approver's approval + I'd like to make sure if the test pass several times.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 25, 2025
@sanposhiho sanposhiho force-pushed the second-trial-conor branch from 1f29ab1 to f6cc41c Compare July 25, 2025 10:19
Comment on lines 478 to 502
errCh.SendErrorWithCancel(err, cancel)
errCh.SendError(err)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't cancel context if the error is not found. So, I changed here to SendError and then cancel() at the receiver side

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Canceling the context has another purpose as well - it terminates the Parallelizer().Until, i.e. if there is an error, we stop preempting the rest of the pods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think we need to check apierrors.IsNotFound(err) here and handle it appropriately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so calling cancel() at the receiver side is equivalent, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking the not found error here would be a bit tricky and I prefer the current impl, if you don't have the objection there

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so calling cancel() at the receiver side is equivalent, right?

No, Parallelizer().Until() blocks until all goroutines are finished. Then, the error is checked. So, in the current implementation, you would cancel the context after all pods were tried to be preempted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the error channel should be called only once (as the length of the buffer is 1). Currently, if one preemptPod would return Not Found error and second will return another error, we would only handle the Not Found one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, the simplest would be just to set an allPodsAlreadyDeleted boolean atomic in preemptPod

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, Parallelizer().Until() blocks until all goroutines are finished.

It's a good point, fixed.

Comment on lines +1572 to +1575
st.MakePod().Name("medium").Priority(mediumPriority).Req(map[v1.ResourceName]string{v1.ResourceCPU: "3"}).Obj(),
},
{
st.MakePod().Name("high").Priority(highPriority).Req(map[v1.ResourceName]string{v1.ResourceCPU: "3"}).Obj(),
st.MakePod().Name("high").Priority(highPriority).Req(map[v1.ResourceName]string{v1.ResourceCPU: "4"}).Obj(),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't matter. it's just that the test name didn't match what it tests

cancel()
result = metrics.GoroutineResultError
default:
allPodsAlreadyDeleted = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't "disable activate on finish" here, because we don't rely on notifications sent from pods that were removed asynchronously. We explicitly call removing the last pod synchronously to have a gurantee we don't miss the notification.

For that reason I'd rename allPodsAlreadyDeleted to something like noGuaranteedNotification, which would be false at the beginning and will turn to true after we send the last removal synchronously.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We explicitly call removing the last pod synchronously to have a gurantee we don't miss the notification.

Deleting the last pod synchronously is for completely another scenario: if all pods are deleted async and all pod/delete events arrive at the queue immediately, before the preemptor pod is un-gated, the preemptor pod could miss all pod/delete events.

What we need to do is "when all pods are already deleted after all, activate the preemptor pod". Because, if all pods are already deleted, the preemptor pod might miss pod/delete while it's being gated.
So, I don't think your proposed change would cover the corner case that we're solving here.

@sanposhiho sanposhiho force-pushed the second-trial-conor branch from 1d3ee06 to 6d11c36 Compare July 28, 2025 09:59
@sanposhiho sanposhiho requested a review from macsko July 28, 2025 10:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/bug Categorizes issue or PR as related to a bug. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants