-
Notifications
You must be signed in to change notification settings - Fork 41.8k
Optimize 'ContextPairs' nested name splitting #48999
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ContextPairs currently uses String.split("\\.") to expand nested names.
String.split is regex-based and allocates per call. Replace it with a
manual dot-delimited scan to reduce allocations and CPU overhead while
preserving existing behavior.
Signed-off-by: Ahmad Al-Sanie [email protected]
Signed-off-by: aalsanie <[email protected]>
|
Thanks for the PR. Unfortunately, I'm not sure that we should merge it as it adds a significant amount of complexity. To consider merging it, I think we'd need to see evidence that the change makes a notable difference to real-world performance. |
|
@wilkinsona Thanks for the feedback, I went and isolated both implementations into the same class and executed them under identical JVM conditions.
Running 10M iterations shows a consistent ~30% reduction in execution time,, the reduced allocation rate also lowers GC pressure.
|
|
Thanks. That's not quite what I meant by real-world and is more of a micro benchmark. To justify the extra complexity, I think we'd need to see an improvement in the performance of structured logging. |
|
Fair enough, I’ll put together a small boot app with structured logging end-to-end and compare baseline. |
|
Structured logging benchmark: https://github.com/aalsanie/logging-sample. baseline:round=1 elapsedMs=44628 Median baseline = 44,628 ms patched:round=1 elapsedMs=43137 Median patched = 43,115 ms summary3.39% faster 1.5s faster per 3M events total allocation samples drop (~4% fewer) run duration drops (~3–4%) Please see attached zip: baseline Vs. patched JFR Impl Notes:
patched snapshot used: |
|
I’ll be honest; I’m not entirely happy with the results. I expected the gain to be higher than this. That said, a 3-4% improvement in a hot structured logging path can benefit a lot of users. On the other hand, I agree with you that the current change adds too much complexity, and your hesitation is reasonable. @SuppressWarnings("unchecked")
void nested(T item, BiConsumer<String, Object> pairs) {
LinkedHashMap<String, Object> result = new LinkedHashMap<>();
this.addedPairs.forEach((addedPair) -> {
addedPair.accept(item, joining((name, value) -> {
Map<String, Object> destination = result;
int end = trimTrailingDelimiters(name);
if (end == 0) {
return;
}
int start = 0;
while (true) {
int dot = name.indexOf('.', start);
if (dot == -1 || dot >= end) {
break;
}
String part = name.substring(start, dot);
Object existing = destination.computeIfAbsent(part, (key) -> new LinkedHashMap<>());
if (!(existing instanceof Map)) {
String common = name.substring(0, dot);
throw new IllegalStateException(
"Duplicate nested pairs added under '%s'".formatted(common));
}
destination = (Map<String, Object>) existing;
start = dot + 1;
}
String leaf = name.substring(start, end);
Object previous = destination.put(leaf, value);
Assert.state(previous == null,
() -> "Duplicate nested pairs added under '%s'".formatted(name));
}));
});
result.forEach(pairs);
}
private int trimTrailingDelimiters(String name) {
int end = name.length();
while (end > 0 && name.charAt(end - 1) == '.') {
end--;
}
return end;
} |
|
What benchmarks do you get if you keep the original code but replace |
|
Unfortunately I couldn't do that, the I had to normalize the input to match what |
ContextPairs nested name splitting|
@aalsanie Would you be willing to force push an update to your PR with the code from your comment above? |


ContextPairsbuilds nested pairs by splitting each name on ".". The current implementation usesString.split("\\."), which is regex-based and allocates an array and list per name.Here I replaced the regex split with a manual dot delimited scan that reduces allocations and avoids regex overhead while preserving existing behavior.
A test has been added.