CI First step towards moving Azure CI to GHA #31832

Merged: 10 commits merged into scikit-learn:main on Aug 7, 2025

Conversation

lesteve
Member

@lesteve lesteve commented Jul 24, 2025

  • this changes the arm GHA workflow to have a matrix, so that it is easier in the future to move one build at a time
  • this reuses the Azure scripts install.sh and test_script.sh

Plenty of stuff left for the future:

  • ccache cache setup
  • codecov upload
  • changes of behaviour based on commit markers e.g. [float32]
  • other things that I did not anticipate

github-actions bot commented Jul 24, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: e2702cf. Link to the linter CI: here

@lesteve lesteve marked this pull request as ready for review July 25, 2025 06:58
Member Author

lesteve commented Jul 25, 2025

So the first thing is that the required check for "Run unit tests" is pending because this check no longer exists in this PR. I propose that we remove this required check to be able to merge this PR; the setting is here.

Required checks are set at the job level. In the medium term, I think we will need a job that aggregates the results of all matrix entries, so that the required check can be set on this single aggregating job. See https://github.com/orgs/community/discussions/4324#discussioncomment-13335158 or the GitHub action https://github.com/re-actors/alls-green, which seems to handle this use case.
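
To make the aggregation idea concrete, here is a minimal Python sketch of the check such an aggregating job could run. It assumes the job receives the `needs` context serialized with `${{ toJSON(needs) }}`, where each entry carries a `result` field; the function name `failing_jobs` and the job ids in the usage note are hypothetical, and this is not how alls-green itself is implemented.

```python
import json


def failing_jobs(needs_json: str) -> dict[str, str]:
    """Return {job_id: result} for every needed job that did not succeed.

    Assumption: `needs_json` is the text produced by `${{ toJSON(needs) }}`,
    i.e. a mapping from job id to an object with a "result" key whose value
    is one of "success", "failure", "cancelled" or "skipped".
    """
    needs = json.loads(needs_json)
    return {
        job_id: info.get("result", "unknown")
        for job_id, info in needs.items()
        if info.get("result") != "success"
    }
```

An aggregating job would probably need `if: always()` so that it still runs when one of the needed jobs fails, and it would exit with a non-zero status whenever `failing_jobs(...)` is non-empty. For example, `failing_jobs('{"unit_tests": {"result": "failure"}}')` returns `{"unit_tests": "failure"}`.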

Member

ogrisel commented Aug 5, 2025

Why is the global "Run unit tests" step stuck with "Expected — Waiting for status to be reported" while the "Lint" and "pymin_conda_forge_arm (pull_request)" are reported as successful?

Member

ogrisel commented Aug 5, 2025

Nice to see that the ARM runner is so fast even without ccache enabled.

Member

ogrisel commented Aug 5, 2025

We should rename the workflows with a convention that makes the operating system visible in the report UI:

image

So far we only run the arm tests on linux, but this could change in the future.

Member Author

lesteve commented Aug 5, 2025

Why is the global "Run unit tests" step stuck with "Expected — Waiting for status to be reported" while the "Lint" and "pymin_conda_forge_arm (pull_request)" are reported as successful?

The required check is based on a job status named "Run unit tests". With this PR, the equivalent Linux arm job is named differently (e.g. "Linux pymin_conda_forge_arm" in the last commit), so the required check is waiting for a job status named "Run unit tests" that will never come.

See my earlier comment #31832 (comment) where I tried to explain in more detail.
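
As a toy illustration of the name matching described above (a simplified model, not GitHub's actual implementation), a required check stays pending for as long as no status with exactly that name is reported. The function name `pending_required_checks` is hypothetical.

```python
def pending_required_checks(required: set[str], reported: dict[str, str]) -> set[str]:
    """Return the required check names for which no status was reported.

    Simplified model: `required` holds the names configured as required
    checks, `reported` maps each reported job name to its conclusion.
    Matching is done on the exact name only.
    """
    return required - set(reported)
```

With `reported = {"Lint": "success", "Linux pymin_conda_forge_arm": "success"}`, calling `pending_required_checks({"Run unit tests"}, reported)` returns `{"Run unit tests"}`: the renamed job never satisfies the old name.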

Member

ogrisel commented Aug 5, 2025

Sorry for not having read your previous comment... Let me finalize my review.

Member

@ogrisel ogrisel left a comment

I am fine with merging this as is, without waiting for all the TODO items to be implemented, and with improving the workflow definition iteratively in small follow-up PRs.

activate-environment: ""

- name: Build scikit-learn
  run: bash -l build_tools/azure/install.sh
Member

I suppose we can rename the "azure" folder once the migration is complete but it's fine to keep it unchanged during the transition.

Member Author

Yes that was my plan indeed to move the script once the full migration to GHA is finished.

if build_source_version_message is None:
    # We are not on Azure: behaviour based on commit-message is not
    # supported for now.
    # TODO: this should be implemented at one point for GHA.
Member

@ogrisel ogrisel Aug 5, 2025

I asked perplexity about this and it suggested this:

https://www.perplexity.ai/search/how-to-condition-the-execution-4hfi.cTjT9.uI60Jb2iJtg

name: Conditional Job on PR Commit Message

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  extract_commit_message:
    runs-on: ubuntu-latest
    outputs:
      matched: ${{ steps.match.outputs.matched }}
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 2 # fetch enough history

      - name: Get latest commit message
        id: commit_msg
        run: |
          msg="$(git log -1 --pretty=format:'%s')"
          echo "message=$msg" >> $GITHUB_OUTPUT

      - name: Evaluate message contents
        id: match
        run: |
          if [[ "${{ steps.commit_msg.outputs.message }}" == *"[trigger-job]"* ]]; then
            echo "matched=true" >> $GITHUB_OUTPUT
          else
            echo "matched=false" >> $GITHUB_OUTPUT
          fi
  conditional_job:
    needs: extract_commit_message
    if: needs.extract_commit_message.outputs.matched == 'true'
    runs-on: ubuntu-latest
    steps:
      - run: echo "Job is executing because the commit message contains [trigger-job]"

Note that the above does not explicitly sanitize the contents of the commit message before injecting it into $GITHUB_OUTPUT. I don't know whether any sanitization is needed in this case, but it seems that GITHUB_OUTPUT is not meant to store multi-line payloads.

Maybe a better way to do it would be to write a single Python or bash script that both looks up the git commit message and matches it against a pattern, printing only "true" or "false". This would avoid having to store the unsanitized commit message in $GITHUB_OUTPUT as an intermediate step.
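
A minimal Python sketch of that single-script idea could look like the following. The helper names `latest_commit_message` and `marker_flag` are hypothetical, and the marker is matched as a plain substring. The commit message never leaves the script; only the "true"/"false" flag would be written to $GITHUB_OUTPUT.

```python
import subprocess


def latest_commit_message(ref: str = "HEAD") -> str:
    """Return the full commit message of `ref` by calling `git log`.

    Assumption: the script runs inside a git checkout that contains `ref`.
    """
    result = subprocess.run(
        ["git", "log", "-1", "--pretty=format:%B", ref],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def marker_flag(message: str, marker: str) -> str:
    """Return "true" if the literal `marker` occurs in `message`, else "false"."""
    return "true" if marker in message else "false"
```

A workflow step could then run something like `print(marker_flag(latest_commit_message(), "[trigger-job]"))` and append `matched=<value>` to $GITHUB_OUTPUT. One caveat to check: on `pull_request` events the default checkout is typically a merge commit created by GitHub, so the ref to inspect may need to be the PR head (for example `HEAD^2`) rather than `HEAD`.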

Member Author

I tried at one point to have something that worked for both Azure and GHA and realized this was more complicated than I hoped. I thought it was better to first have the basic structure in place and tackle each problem separately in further PRs.

Note that we already have a commit-based trigger in GHA for [cd build] in wheels.yml, which seems somewhat similar to your suggestion.

@ogrisel ogrisel merged commit 1a6e34c into scikit-learn:main Aug 7, 2025
36 checks passed
Member

ogrisel commented Aug 7, 2025

@lesteve merged!
