Add schemas to stores #12931

giograno · 2025-07-30T10:45:14Z

Motivation

This PR introduces a heuristic to extract a schema representation for a given store class.

Context

Service providers implemented in LocalStack keep their data in stores, unless they are moto-based or rely on third-party services (see #6444 for more details).

LocalStack serializes these classes to implement persistence. Therefore, we should treat them as public APIs and avoid breaking them. As stores evolve within the LocalStack codebase, a few issues can occur when deserializing a previous state:

A class gets renamed, and we are not able to load that module and class anymore;
An attribute is removed from the store, so the information stored there is lost.

Unfortunately, we don't have much visibility on how the structure of the stores changes over time.
For this reason, this PR introduces a heuristic to extract a schema representation from a store definition.
It allows us to reason about store changes. For instance:

If a type can't be loaded anymore, we can compare an old and a new schema version to understand which new type should be used instead.
By comparing the schema across two versions, we can detect if an attribute has been removed and come up with a migration path.
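The comparison described above can be sketched as a plain dictionary diff. This is only an illustration: the flat `attribute name -> type string` shape, the `diff_attribute_schemas` helper, and the example type names are assumptions, not part of this PR.

```python
# Hypothetical schema shape: attribute name -> serialized type hint (a plain string here).
AttributeSchema = dict[str, str]


def diff_attribute_schemas(old: AttributeSchema, new: AttributeSchema) -> dict[str, list[str]]:
    """Report attributes that were removed, added, or changed type between two versions."""
    removed = sorted(old.keys() - new.keys())
    added = sorted(new.keys() - old.keys())
    changed = sorted(name for name in old.keys() & new.keys() if old[name] != new[name])
    return {"removed": removed, "added": added, "changed": changed}


# Made-up schemas for two versions of the same store.
old_schema = {"queues": "dict[str, example.OldQueue]", "deleted": "dict[str, float]"}
new_schema = {"queues": "dict[str, example.NewQueue]", "tags": "dict[str, str]"}

report = diff_attribute_schemas(old_schema, new_schema)
print(report)
```

A `removed` entry points at data that would be lost on load, and a `changed` entry is a candidate for a type rename that needs a migration path.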

Changes

Introducing a new localstack.state.schema module that returns a schema definition from a BaseStore subclass. To achieve this goal, we heavily rely on the type hints of the store attributes. The docstring of the StoreSchemaBuilder class reports a few examples.
A few unit tests.

Closes PNX-46.

github-actions · 2025-07-30T10:55:31Z

Test Results - Preflight, Unit

22 067 tests +4 20 333 ✅ +4 6m 18s ⏱️ -2s
1 suites ±0 1 734 💤 ±0
1 files ±0 0 ❌ ±0

Results for commit 607c522. ± Comparison against base commit 2d08a27.

♻️ This comment has been updated with latest results.

github-actions · 2025-07-30T11:05:16Z

Test Results (amd64) - Acceptance

7 tests ±0 5 ✅ ±0 3m 23s ⏱️ +18s
1 suites ±0 2 💤 ±0
1 files ±0 0 ❌ ±0

Results for commit 607c522. ± Comparison against base commit 2d08a27.

♻️ This comment has been updated with latest results.

github-actions · 2025-07-30T11:36:40Z

Test Results (amd64) - Integration, Bootstrap

5 files 5 suites 2h 21m 20s ⏱️
4 980 tests 4 393 ✅ 587 💤 0 ❌
4 986 runs 4 393 ✅ 593 💤 0 ❌

Results for commit 607c522.

♻️ This comment has been updated with latest results.

coveralls · 2025-07-30T11:41:29Z

coverage: 83.13% (+31.7%) from 51.39%
when pulling a4de739 on json/schema
into 8e93d8a on main.

github-actions · 2025-07-30T11:48:06Z

LocalStack Community integration with Pro

2 files ±0 2 suites ±0 1h 42m 55s ⏱️ - 1m 27s
4 621 tests ±0 4 186 ✅ ±0 435 💤 ±0 0 ❌ ±0
4 623 runs ±0 4 186 ✅ ±0 437 💤 ±0 0 ❌ ±0

Results for commit 607c522. ± Comparison against base commit 2d08a27.

♻️ This comment has been updated with latest results.

viren-nadkarni

This is a good foundation to start with. This approach will build the schema out of how stores are declared, but unfortunately not what goes in them. Arguably that's a quality of dynamically typed languages. One could say an attribute is of a certain type, but keep a different type in there, and this cannot be detected without running an exhaustive static type check.

Overall this tackles one aspect of schema comparison -- the store side. The other aspect is state side. I imagine each persisted state will have its own derivable schema which describes the actual data in it. This could then be checked against the store schema for load compatibility.
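The gap between declared and actual types can be shown with a small runtime check. This is a rough sketch of the "state side" idea, not something implemented in this PR: `ExampleStore` and `find_mismatches` are hypothetical, and only the outer type is checked (for `dict[str, str]` it verifies that the value is a `dict`, not what the dict contains).

```python
import typing


class ExampleStore:
    counter: int
    tags: dict[str, str]


def find_mismatches(store) -> list[str]:
    """Return the names of attributes whose runtime value does not match the declared hint."""
    mismatches = []
    for name, hint in typing.get_type_hints(type(store)).items():
        # For subscripted generics, isinstance() needs the bare container type.
        expected = typing.get_origin(hint) or hint
        if hasattr(store, name) and not isinstance(getattr(store, name), expected):
            mismatches.append(name)
    return mismatches


store = ExampleStore()
store.counter = "seven"  # declared as int, but nothing stops this assignment at runtime
store.tags = {"env": "dev"}

print(find_mismatches(store))
```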

I'm curious to hear @thrau's thoughts.

viren-nadkarni · 2025-08-01T13:29:59Z

localstack-core/localstack/state/schema.py

+        module = getattr(obj, "__module__", None)
+        qualname = getattr(obj, "__qualname__", None)
+        if module and qualname:
+            return f"{module}.{qualname}"


You could use a separator other than "." in case there is a need to reverse the operation.

Using :: now (see 607c522).
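The reason a distinct separator helps: both `__module__` and `__qualname__` can contain dots, so a dot-joined name cannot be split back unambiguously. The sketch below assumes `::` sits between module and qualified name; `to_fqn` and `from_fqn` are hypothetical helpers, not the functions in `schema.py`, and a method is used only because it is a convenient standard-library object with a dotted `__qualname__`.

```python
import importlib
import json

SEPARATOR = "::"


def to_fqn(obj) -> str:
    """Join module and qualified name with a separator that appears in neither part."""
    return f"{obj.__module__}{SEPARATOR}{obj.__qualname__}"


def from_fqn(fqn: str):
    """Reverse to_fqn: import the module, then walk the dotted qualified name."""
    module_name, qualname = fqn.split(SEPARATOR, 1)
    obj = importlib.import_module(module_name)
    for part in qualname.split("."):
        obj = getattr(obj, part)
    return obj


# The module is "json.decoder" and the qualified name is "JSONDecoder.decode":
# with "." as the only separator, the boundary between the two would be lost.
fqn = to_fqn(json.JSONDecoder.decode)
print(fqn)
```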

localstack-core/localstack/state/schema.py

tests/unit/state/test_schema.py

viren-nadkarni · 2025-08-01T13:59:09Z

localstack-core/localstack/state/schema.py

+                if args:
+                    _hint[TAG_ARGS] = [self._serialize_hint(_arg) for _arg in args]
+                return _hint
+            case _:


Any thoughts on how to handle stores that have member functions, e.g. SqsStore.expire_deleted()? How could such callables be represented in the schema, given that changes within them can also affect store compatibility?

Nice observation. My idea was to not include these functions in the schema definition, as they won't be serialized by a JSON schema backend. Our goal with the project is indeed to serialize only data and not code anymore.
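For context, an extraction based on class annotations leaves methods out without any extra filtering, because methods are not annotated attributes. A minimal sketch with a hypothetical `ExampleStore` (the PR's own builder may collect attributes differently):

```python
import typing


class ExampleStore:
    queues: dict[str, str]
    deleted: dict[str, float]

    def expire_deleted(self) -> None:
        """Behaviour, not data: it never appears among the class annotations."""
        self.deleted = {}


hints = typing.get_type_hints(ExampleStore)
print(sorted(hints))
```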

viren-nadkarni · 2025-08-01T14:08:43Z

localstack-core/localstack/state/schema.py

+TypeHint = types.GenericAlias | type
+
+INTERNAL_MODULE_PREFIXES = ["localstack", "moto"]
+"""Modules that starts with this prefix are considered internal classes and are evaluated"""
+
+
+AttributeName = str
+FQN = str
+SerializedHint = str | dict[str, typing.Any]
+
+AttributeSchema = dict[AttributeName, SerializedHint]
+"""Maps an attribute name its serialized hints"""
+
+AdditionalClasses = dict[FQN, AttributeSchema]
+"""Maps the a FQN of a class to its Attribute Schema"""
+
+TAG_TYPE = "LS/TYPE"
+TAG_ARGS = "LS/ARGS"
+"""Tags for subscribed types and their args. See ``StoreSchemaBuilder`` for examples."""


I think this can be simplified. Can TAG_TYPE and TAG_ARGS be statically defined instead of constants? This will also allow defining SerializedHint recursively.

I am not sure I fully understood the suggestion. In 7e0ba59 I made the SerializedHint recursive and switched to a StrEnum for the tags.

thrau

As discussed in today's meeting with Gio and Viren, we agreed that generating a schema from stores is the right direction, but that using an existing standard (like Avro, Protobuf, ...) for the schema definition would be better, so we can leverage the ecosystem around those tools.

giograno · 2025-08-07T15:02:58Z

Good learning here. Closing for further investigation with the other standards mentioned above.

giograno added this to the Playground milestone Jul 30, 2025

giograno self-assigned this Jul 30, 2025

giograno added the "area: persistence" (Retain state between LocalStack runs) and "semver: patch" (Non-breaking changes which can be included in patch releases) labels Jul 30, 2025

giograno force-pushed the json/schema branch 3 times, most recently from 62ff9b1 to 5838002 Compare July 31, 2025 11:32

giograno requested review from viren-nadkarni and thrau July 31, 2025 13:12

giograno marked this pull request as ready for review July 31, 2025 13:17

viren-nadkarni reviewed Aug 1, 2025

giograno and others added 8 commits August 7, 2025 10:03

store schema

f055893

(cherry picked from commit aa92061)

wip

22cc6a1

tags and tests

a2674bf

missing type hint

6d5a656

Apply suggestions from code review

88271e1

Co-authored-by: Viren Nadkarni <[email protected]>

fetch skip attributes from the store annotations

e2ad388

Minor PR comments

ac35980

noqa for tuple

86bf0dc

giograno force-pushed the json/schema branch from 21a884c to 86bf0dc Compare August 7, 2025 08:06

giograno added 2 commits August 7, 2025 10:19

recurive annotation and using StrEnum

7e0ba59

use :: as a separator

607c522

giograno requested a review from viren-nadkarni August 7, 2025 09:26

thrau reviewed Aug 7, 2025

giograno closed this Aug 7, 2025
