-
Notifications
You must be signed in to change notification settings - Fork 823
Improve Status performance with many ignored files #1379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@silkeh thanks for proposing this PR. Although this may improve the use case of multiple ignored files, it seems to cause a regression when that is not the case. I did a quick benchmark testing a call to
I am struggling for time to pinpoint optimisation areas in the PR, would you be able to optimise it for the standard use case as well? |
@pjbgf: Thanks for taking a look! I'll take a look later this week and see what I can improve. |
@onee-only: That was the plan, but I've been busy with other things. I'll try to make some time next week. |
Fix loading of `.gitignore` files in nested ignored directories. These were not matched in the current implementation, as that only checks the ignores of the current directory. As a side-effect this change also leads to significantly improved performance: in a repository with 3M ignored files `Status` now takes 160 s instead of 491 s.
25c3930
to
269fa79
Compare
Skip ignored files when walking through the worktree. This signigifantly improves the performance of `Status()`: In a repository with 3M ignored files `Status` now takes 5 s instead of 160 s.
269fa79
to
dcfb50d
Compare
I finally had some time to take a look and rebase the PR. The difference compared to
v5.16.2 is a lot faster, but this PR in its current form doesn't hurt performance too much. If you test with a repo on disk it actually seems to be a bit faster. Benchmark I used: func BenchmarkStatus(b *testing.B) {
r, err := Clone(memory.NewStorage(), memfs.New(), &CloneOptions{
URL: "https://github.com/go-git/go-git",
})
require.NoError(b, err)
workTree, err := r.Worktree()
require.NoError(b, err)
_, err = workTree.Status()
require.NoError(b, err)
b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
workTree.Status()
}
} If I read the profiling correctly most of the performance degradation is introduced in ![]() I did a tiny optimization to reduce the allocations a bit, but the rest of the performance improvements would probably have to be done in the |
This PR contains two changes that significantly improve the performance of
WorkTree.Status()
(from 491 s to 5 s):This signigifantly improves the performance of
Status()
:In a repository with 3M ignored files
Status
now takes 5 s instead of 160 s..gitignore
files in nested ignored directories.These were not matched in the current implementation, as that only checks the ignores of the current directory.
As a side-effect this change also leads to significantly improved performance: in a repository with 3M ignored files
Status
now takes 160 s instead of 491 s.I have a backport to
master
ready if desired.