-
Notifications
You must be signed in to change notification settings - Fork 5.5k
[Bug #20001] Fix GC issues with ASAN fake stacks (v2) #9505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug #20001] Fix GC issues with ASAN fake stacks (v2) #9505
Conversation
@eightbitraptor to pick you more or less at random amongst people who know about the GC ... :) Would you mind giving a quick once-over of this PR? I would like a second opinion on the changes to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick once over as requested and some feedback inline 👍
I'm not at all familiar with how asan works, so it might also be useful to get @tenderlove or @peterzhu2118 to take a look at this.
eval.c
Outdated
return 0; | ||
|
||
ruby_init_stack((void *)&state); | ||
ruby_init_stack(&state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're using the ruby_tag_type
enum here right? and then treating it like an address, and writing to it inside ruby_init_stack
. Why is that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rationale here for the main thread is the same as for non-main threads:
I agree the type and variable name is a little strange. I renamed all of this punching through as local_in_parent_frame
for the non-main thread case, perhaps I should do the same thing here as well (and duplicate the contents of that comment).
eval.c
Outdated
prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0); | ||
#endif | ||
Init_BareVM(); | ||
Init_BareVM(&state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar here. Do we have to wind the address all the way down to native_thread_init_stack
?
gc.c
Outdated
{ | ||
gc_mark_maybe(objspace, obj); | ||
#ifdef RUBY_ASAN_ENABLED | ||
rb_execution_context_t *ec = ctx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to each_location
and the callback signature make sense to me. However, It seems a shame to introduce quite a lot of extra machinery in order to support drilling down a pointer that will only ever be used when RUBY_ASAN_ENABLED
.
Have you looked at whether this has any impact on marking performance? Is there a way we could make this zero overhead if asan is not enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be safe to just stash a pointer to the execution context we're currently marking on objspace
. That would be zero-cost if ASAN is not enabled, and avoid a lot of the messy diff here. I'll do that instead.
gc.c
Outdated
void (*cb)(rb_objspace_t *, void *, VALUE)); | ||
|
||
static void | ||
gc_mark_machine_stack_location_maybe(rb_objspace_t *objspace, void *ctx, VALUE obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be helpful for me if the intentional uses for ctx
were documented in the sites where they were used. It's not immediately obvious what the intention of it is given that in the majority of callbacks it's unused.
ruby_init_stack(volatile void *addr) | ||
{ | ||
native_main_thread.id = pthread_self(); | ||
addr = asan_get_real_stack_addr((void *)addr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious to me why we're using functions directly referencing asan in places where we don't know whether asan is available or enabled or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the pattern for most of the existing ASAN macros is for them to be no-ops if ASAN is disabled, and to elide the checks where they're used. I guess the motivation for this is to avoid spraying a lot of conditional compilation checks all throughout the codebase, like:
#if RUBY_ASAN_ENABLED
void *poisoned = asan_unpoison_object_temporary(val)
#endif
/* use val */
#if RUBY_ASAN_ENABLED
if (poisoned) { asan_poison_object(val); }
#endif
For these more specialised macros only used in a couple of places though, perhaps it's better to be explicit and put the check here.
stack beginning from the pthread API in native_thread_init_stack, because | ||
glibc stores some of its own data on the stack before calling into user code | ||
on a new thread, and replacing that data on fiber-switch would break it (see | ||
bug #13887) */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice explanation! Thanks 👍
Thanks for the review 🙏 I'll make some adjustments. |
09c83f4
to
99bbb74
Compare
if (GET_VM()) | ||
return 0; | ||
|
||
ruby_init_stack((void *)&state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these calls to ruby_init_stack
were doing absolutely nothing, because the initial call to INIT_STACK
in main.c
already called ruby_init_stack
with a higher address.
This commit changes how stack extents are calculated for both the main thread and other threads. Ruby uses the address of a local variable as part of the calculation for machine stack extents: * pthreads uses it as a lower-bound on the start of the stack, because glibc (and maybe other libcs) can store its own data on the stack before calling into user code on thread creation. * win32 uses it as an argument to VirtualQuery, which gets the extent of the memory mapping which contains the variable However, the local being used for this is actually too low (too close to the leaf function call) in both the main thread case and the new thread case. In the main thread case, we have the `INIT_STACK` macro, which is used for pthreads to set the `native_main_thread->stack_start` value. This value is correctly captured at the very top level of the program (in main.c). However, this is _not_ what's used to set the execution context machine stack (`th->ec->machine_stack.stack_start`); that gets set as part of a call to `ruby_thread_init_stack` in `Init_BareVM`, using the address of a local variable allocated _inside_ `Init_BareVM`. This is too low; we need to use a local allocated closer to the top of the program. In the new thread case, the lolcal is allocated inside `native_thread_init_stack`, which is, again, too low. In both cases, this means that we might have VALUEs lying outside the bounds of `th->ec->machine.stack_{start,end}`, which won't be marked correctly by the GC machinery. To fix this, * In the main thread case: We already have `INIT_STACK` at the right level, so just pass that local var to `ruby_thread_init_stack`. * In the new thread case: Allocate the local one level above the call to `native_thread_init_stack` in `call_thread_start_func2`. [Bug #20001] fix
8d46003
to
c21a0c1
Compare
@tenderlove or @peterzhu2118 - would either of you mind having a look at this as well as matt suggested? Thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes sense 👍
thread_pthread.c
Outdated
native_thread_init_main_thread_stack(void *addr) | ||
{ | ||
native_main_thread.id = pthread_self(); | ||
#ifdef RUBYASAN_ENABLED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo:
#ifdef RUBYASAN_ENABLED | |
#ifdef RUBY_ASAN_ENABLED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, thank you.
Awesome, thanks so much for your reviews. |
Where a local variable is used as part of the stack bounds detection, it has to actually be on the stack. ASAN can put local variable on "fake stacks", however, with addresses in different memory mappings. This completely destroys the stack bounds calculation, and can lead to e.g. things not getting GC marked on the machine stack or stackoverflow checks that always fail. The __asan_addr_is_in_fake_stack helper can be used to get the _real_ stack address of such variables, and thus perform the stack size calculation properly [Bug #20001]
__has_feature is a clang-ism, and GCC has a different way to tell if sanitizers are enabled. For this reason, I don't want to spray __has_feature all over the codebase for other places where conditional compilation based on sanitizers is required. [Bug #20001]
ASAN leaves a pointer to the fake frame on the stack; we can use the __asan_addr_is_in_fake_stack API to work out the extent of the fake stack and thus mark any VALUEs contained therein. [Bug #20001]
c21a0c1
to
8cae62a
Compare
(this is #8903 again - I just accidently merged it too soon before)
This fixes some of the issues with ASAN I outlined in https://bugs.ruby-lang.org/issues/20001
With this merged,
make
now works on my machine with the following:Without these fixes, I get crashes in various steps which use the built miniruby (e.g. parsing rdoc or runing extconf files) because
VALUE
's on the machine stack are not getting properly marked (because they are instead living in ASAN fake stacks).This seems to build fine with ASAN both enabled and disabled on linux/mac, and builds without asan on windows (i wasn't brave enough to try turning asan on with msvc heh).
[Bug #20001]