-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
S3: implement new checksum algorithm CRC64NVME add ChecksumType for PutObject #12182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
S3 Image Test Results (AMD64 / ARM64) 2 files 2 suites 5m 28s ⏱️ Results for commit 13a19bce. ♻️ This comment has been updated with latest results. |
LocalStack Community integration with Pro 2 files ±0 2 suites ±0 1h 51m 41s ⏱️ + 1m 11s Results for commit 13a19bce. ± Comparison against base commit 1790318. This pull request removes 15 and adds 17 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results. |
@@ -261,7 +261,7 @@ def test_sfn_start_execution_implicit_json_serialisation( | |||
) | |||
# it seems the SFn internal client does not return the checksum values from the object yet, maybe it hasn't | |||
# been updated to parse those fields? | |||
@markers.snapshot.skip_snapshot_verify(paths=["$..ChecksumCrc32"]) | |||
@markers.snapshot.skip_snapshot_verify(paths=["$..ChecksumCrc32", "$..ChecksumType"]) | |||
def test_s3_get_object( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see this added to LocalStack already 🚀!
I’m leaving this comment to figure out whether this is something we should fix in the SFN provider, though I’m otherwise fine with skipping these response fields in the tests.
My understanding is that, contrary to the previous comment, LocalStack’s S3 client returns two additional checksum fields in its getObject
response compared to AWS. Because this response is dispatched via the AWS SDK integration, which primarily normalizes response keys, I can see the extra fields even before normalization in localstack.services.stepfunctions.asl.component.state.state_execution.state_task.service.state_task_service_aws_sdk.StateTaskServiceAwsSdk._eval_service_task
. The same applies for the test case below aws.services.stepfunctions.v2.services.test_aws_sdk_task_service.TestTaskServiceAwsSdk.test_s3_put_object
.
In these scenarios, should s3 exclude the checksum values in its responses, or should we look into modifying the aws-sdk Step Functions integration to remove these fields from the responses instead? Perhaps this S3 change hasn’t yet been reflected in Step Functions through the aws-sdk, so we might consider keeping an eye on the state of this test case going forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the difference is coming from how the internal SDK handles the request. I believe this is due to how the default values for Boto changed:
- when creating an object, boto will automatically adds the checksum algorithm, thus the object being created (which will show up in the response). This will be for
PutObject
. I'll update the snapshot right now to snapshot the fullGetObject
call to see how the object is being created. Update: It seems I've updated the snapshot again and now they are setting the Checksum 😂 the only issue we now have is a casing issue, but it seems SFN is now creating the object with checksums ✅ - when getting an object, boto will automatically set the
ChecksumMode
field set toENABLED
, but I'm not sure we can disable that. That way, it returns the 2 fields, which I don't think the current SFN SDK does. ForGetObject
, we can be sure the object is created with checksums, as the call is done from boto, but it seems the SDK does not return those, probably because it does not setChecksumMode
. We could manually remove those fields from the response for now.
TLDR;
For PutObject
, after refreshing the snapshots again, it seems like the only issue remaining is casing
For GetObject
, we may need to remove the fields for now
Motivation
This PR implements the new checksum algorithm for S3,
CRC64NVME
, and add theChecksumType
field for object created withPutObject
. TheChecksumType
is not too relevant forPutObject
, as this is a feature mostly related to Multipart Uploads.More information about the new S3 data integrity push AWS has been doing can be read about here: https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html
This PR also fix a bit of parity reported in #12187, where we would report the wrong exception if the checksum was wrong, as the previous tests were sending values that would not correspond to a valid checksum.
Changes
CRC64NVME
ChecksumType
field forPutObject
and operations retrieving objectsfixes #12187