SEP 2 - Add worker_cache #3
Conversation
Code examples are here: https://github.com/saltstack/salt/pull/51687/files. I can update with an example showing how to use this inside an ext_pillar. Cache invalidation is handled by the user-supplied getter function.
If we're going to head down this route, I would propose that instead of building this internal to each worker, we instead explore designing a fully pluggable cache system that workers reach out to. (One of these systems could even be a sidecar to the MWorkers, perhaps.) I'd like, for example, to be able to put memcache next to a master and let it handle caching for me instead of doing it inside Salt. This lets people re-use whatever caching layers they may already have in their infrastructures, and it avoids the overhead of putting this layer directly in the workers.
Consider the extreme case where you need to deserialize 10 MB of data from an external cache in an on-demand external pillar: a cache hit is still expensive for the worker because it pays the deserialization penalty. If you know you're going to pay that penalty every time the pillar is rendered, why not cache in memory?

That said, using a shared cache as a second layer makes sense and is complementary.

I think the ext_pillar author should use whatever cache is configured for the environment; if I want to cache in memory, why shouldn't I be able to? The PR I sent might address some of your concerns.
@johnskopis I guess I'm not clear on the argument for an in-memory cache in the worker itself versus a caching layer external to the process that could also be memory-based. Writing a robust caching layer can be a lot of effort, and I am just trying to understand the case for doing that internal to Salt when many good options could be taken off the shelf and used alongside it.
Rather than going around in circles here, I think it might make sense to quantify the (de)serialization overhead I'm concerned about: how long does it take to read 10 MB of JSON (or YAML) from Redis versus reading a 10 MB Python object from memory?
@johnskopis That's a good question! Have you done the comparison? I'm certainly interested in seeing/trying out a test case.
Probably a contrived example, but it's meant to emphasize the penalty. As an aside, I do a similar exercise every few years and am always a little surprised by the results.
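The benchmark itself isn't reproduced above, but a minimal sketch of the kind of comparison being discussed might look like the following (assuming a local Redis instance and the `redis` Python package; the payload size and loop count are made up for illustration):

```python
# Illustrative micro-benchmark: JSON from Redis vs. an already-deserialized
# Python object held in process memory. Numbers will vary by environment.
import json
import time

import redis

payload = {"minions": [{"id": f"minion-{i}", "grains": {"os": "Linux"}} for i in range(50000)]}
blob = json.dumps(payload)  # several MB of JSON

r = redis.Redis()
r.set("pillar:big", blob)

start = time.perf_counter()
for _ in range(10):
    data = json.loads(r.get("pillar:big"))  # network round-trip + parse on every hit
redis_time = time.perf_counter() - start

in_memory = {"pillar:big": payload}  # cached object, no copy or parse needed
start = time.perf_counter()
for _ in range(10):
    data = in_memory["pillar:big"]
memory_time = time.perf_counter() - start

print(f"redis+json: {redis_time:.3f}s  in-memory: {memory_time:.6f}s")
```

The in-memory lookup is typically several orders of magnitude cheaper, which is the penalty being referred to.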
I was thinking about this some more, and I think it's more than just performance. As a Salt user, if I want to use a memory cache I should be able to. Why should I have to set up cache infrastructure if I just want to use memory as my backend? I can definitely appreciate not wanting a cache embedded inside Salt, because there are certainly things out there that do a much better job. I don't think there's any real maintenance cost to using the cachetools implementation. What is the reason not to be flexible and allow users to choose whichever cache backend they want?
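For context, the kind of in-memory backend being referred to could be as small as this sketch using `cachetools` (`fetch_pillar_data` is a hypothetical expensive lookup, not anything from the PR):

```python
# Sketch only: cache the result of an expensive per-minion lookup in process
# memory for five minutes using cachetools.
from cachetools import TTLCache, cached

pillar_cache = TTLCache(maxsize=128, ttl=300)  # up to 128 entries, 5-minute TTL

@cached(pillar_cache)
def fetch_pillar_data(minion_id):
    # ...expensive call to an external service would go here...
    return {"role": "webserver", "minion": minion_id}
```

As noted later in the thread, doing this directly inside an ext_pillar module doesn't work because the module gets reloaded, which is part of the motivation for hanging the cache off the worker instead.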
@johnskopis There seems to be some confusion about what I'm suggesting. Any in-memory cache that's shipped with Salt is just fine. My point is that I don't think that cache should exist internal to the worker. For example, if each worker is maintaining its own cache, how do you ensure cache consistency across the fleet of MWorkers? Why spend memory for each worker to cache duplicate data? If there's a SaltCache worker that workers can query, that would satisfy both concerns. It's not a binary choice between an external system and an in-memory system that works with Salt; rather, these can be quite complementary.
@cachedout I opened the PR because it's not possible to cache an on-demand pillar using an in-memory cache; also, the cache API does not support IMS (If-Modified-Since) revalidation. If you have a better option, I'm all ears.

To your point, memory is cheap, and if I want to burn it I should be able to. What are the other options? IMS caching solves the consistency issues... what do you suggest we do instead?
@johnskopis Caching in the way that you are proposing (internal to the master worker processes) does not solve consistency issues -- it creates them. If you have twenty worker processes, each caching data independently in memory, you have twenty different caches operating at twenty different levels of freshness. What happens when the source pillar data changes? To ensure that some caches don't expire before others, you have to be able to ensure that they all have the same copy of the cached data, lest you have a cache coherence problem, which in the case of a pillar would be Very Bad News (tm).

I have proposed multiple ways of solving this problem that will work, that cache data in memory, and that avoid the cache coherence problem I have just described. Please re-read my comments if they aren't clear.
The thinking is that the worker cache stores some metadata about the cached object, namely the ETag, in the cache. The getter function takes a previously cached item as an argument, which it uses to perform a conditional GET of the underlying resource.

Caching immutable objects indefinitely, or revalidating (a conditional GET) on every fetch, is the *only* way to ensure consistency. Using a single cache to serve inconsistent, i.e. stale, data doesn't address consistency; it just makes it less of a concern.

I want to be able to revalidate cached objects, and I want to do it inside the worker process because that's where the `ext_pillar` runs. I can't cache inside my ext_pillar because that module gets reloaded. The functionality I described in this RFC is meant to address this exact problem. The only thing you have suggested is to not do this inside the worker process.

Following the same pattern, the user's ext_pillar should not exist in the worker process at all; instead, the ext_pillar should be dumbed down to a GET request to an external service, with the expectation that users deploy an ext_pillar server sidecar next to their Salt master. I briefly consider this in the proposal under the "use an external solution" section... is this accurate?
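For readers following along, a minimal sketch of the revalidation pattern being described might look like the following. The cache-API name (`worker_cache.fetch`) and the endpoint are assumptions for illustration, not the actual interface from the PR:

```python
# Getter that revalidates a previously cached item with a conditional GET.
# A cache hit whose ETag still matches costs a 304 response, not a full body.
import requests


def getter(cached_item):
    headers = {}
    if cached_item is not None:
        headers["If-None-Match"] = cached_item["etag"]  # ask only for changes

    resp = requests.get("https://pillar.example.com/data", headers=headers)
    if resp.status_code == 304:        # not modified: keep serving the cached copy
        return cached_item
    resp.raise_for_status()
    return {"etag": resp.headers.get("ETag"), "data": resp.json()}


# The worker-side cache would then hand its stored item back to the getter,
# e.g. something along the lines of: value = worker_cache.fetch("my_pillar", getter)
```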
Ah, I see. This sounds, though, like it's operating under the assumption that all external pillars call an HTTP endpoint that *can* support conditional fetching. While HTTP supports conditional GETs, there is no guarantee that the external pillar uses HTTP at all. How do you avoid cache coherence problems across multiple workers without the guarantee that you have the ability to conditionally fetch the object to begin with?
The best option is to store immutable objects in the cache and have them evict when the cache is full.

There's nothing wrong with using a shared TTL-based cache either; in that scenario the worker cache would be configured to use the shared cache only. With a tiered L1 (memory) and L2 (shared) cache you should be able to just use the L2 cache. I didn't spend a lot of time here, but the tiered cache should "just work" (TM): `cache.l2.cache(...)`.

I'm trying to be flexible and build something that works for everyone's subjectively "best option", because sometimes the objectively best option isn't practical.
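A rough sketch of the tiered lookup being described, where `l1` is a per-process dict and `l2` stands in for a client to a shared backend (these names and the getter signature are illustrative, not the PR's API):

```python
# Tiered cache lookup: check process memory first, then the shared cache,
# and only fall back to the real source on a miss in both.
def tiered_get(key, l1, l2, getter):
    if key in l1:                  # L1: per-worker memory, cheapest
        return l1[key]
    value = l2.get(key)            # L2: shared cache next to the master
    if value is None:
        value = getter(None)       # miss everywhere: fetch from the source
        l2.set(key, value)
    l1[key] = value                # promote for later calls in this worker
    return value
```

A deployment that only wants the shared layer could simply leave the L1 dict unused, which matches the "just use the L2 cache" case above.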
I fear this is moving toward an unhelpful and pedantic argument, so let me just make this my last post and attempt to better explain my position here. Beyond this, I'll leave it to @saltstack/team-core and others to respond and give guidance about the best way to proceed here.
To achieve coherence across multiple in-memory caches in the manner that you are proposing, there are a few ways one could go, none of them easy. You could implement a variation on write-back or write-through strategies. You've mentioned L1 caches, alluding to processor caches, and many processor architectures handle the problem in this manner. In short, you'd have to design and implement a system by which each worker is updated about cached objects on other workers. Alternately, a request could force all workers to query the upstream system as a means of staying in sync, but that is a scary way to implement coherence and, beyond that, it totally breaks Salt's model of MWorker autonomy, which is almost certainly a non-starter with Salt's core team.

The other option is the one I have advocated for during this conversation -- a type of shared-memory approach. In my proposal, instead of having a literal segment of shared memory, you would have a single process providing caching services, acting as the single point of truth for cached data, which workers could query. It would be fine, as you rightly point out, to have a TTL and/or an IMS strategy in this layer. The key point is that all workers would need to query this layer, which would provide consistent answers to workers on each request. This provides the added benefit of allowing a pluggable system by which hardened and performant backends such as memcache or Redis could be used.

Another option would be to put caches on workers but use cache files on disk. This is not in-memory, as you describe as your goal above, but it would solve the problem and likely still be extremely performant compared to a remote endpoint. File locking would still need special attention, but Salt has facilities for handling that gracefully.

The tl;dr here is that I am all for a shared, in-memory cache that ships with Salt and can be used by workers, but in-memory pillar caching should not be placed inside the worker itself. Hopefully that makes sense. I'm disengaging from this thread now and will leave it to others to follow up. Cheers. :)
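A sketch of what that shared-cache-beside-the-master idea could look like from a worker's point of view (assuming the `pymemcache` package and a memcached instance on the master host; the function and key names are illustrative):

```python
# Every MWorker queries the same memcached instance, so all workers see one
# copy of the cached pillar data rather than many independent per-worker caches.
import json

from pymemcache.client.base import Client

shared_cache = Client(("127.0.0.1", 11211))


def get_pillar(minion_id, fetch_from_source):
    key = f"pillar:{minion_id}"
    cached = shared_cache.get(key)
    if cached is not None:
        return json.loads(cached)             # hit: no call to the real source
    data = fetch_from_source(minion_id)       # miss: fetch once, then share it
    shared_cache.set(key, json.dumps(data), expire=300)
    return data
```

Because there is only one copy of each entry, the coherence question collapses to how that single cache is invalidated, which is a much smaller problem than keeping many per-worker caches in sync.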
We're re-starting numbering with the SEP process, so this SEP will be number 2 and the file should be 0002-worker-cache.md.
@@ -0,0 +1,210 @@
- Feature Name: `worker_cache` (MWorker Cache / If-Modified-Since Cache, IMSCache)
- Start Date: 2019-02-13
- RFC PR: (leave this empty)
Suggested change:
- RFC PR: (leave this empty)
+ RFC PR: https://github.com/saltstack/salt-enhancement-proposals/pull/3
So I'm not intimately familiar with all of the ins-and-outs of caching. I've experienced enough to know that here be dragons though 🙂
That sounds sensible to me. I like the idea of having a cache, and it does sound like a bad idea to place it within the worker. Are you interested in incorporating that feedback and updating your proposal?
Hey @johnskopis, are you interested in working on this, or should we go ahead and mark it as abandoned for someone else to pick up in the future?
Feel free to close as abandoned. Thanks