Description
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
When I run a Step Functions state machine that uses distributed Map states on LocalStack Pro, the execution fails at the first Map state, before any of its item tasks run. The LocalStack Docker container logs an error that points to a pickling problem:
2023-12-05T16:45:58.482 WARN --- [d-159 (eval)] l.s.s.a.c.s.s.execute_stat : State Task encountered an unhandled exception that lead to a State.Runtime error.
2023-12-05T16:45:58.483 ERROR --- [d-159 (eval)] l.s.s.a.c.eval_component : Exception=FailureEventException, Error=States.Runtime, Details={"taskFailedEventDetails": {"error": "States.Runtime", "cause": "cannot pickle '_thread.lock' object"}} at '(StateMap| {'comment': (Comment| {'comment': 'Map over each document url and launch the document processing pipeline to extract clauses'}, 'input_path': (InputPath| {'input_path_src': '$'}, 'output_path': (OutputPath| {'output_path': '$'}, 'state_entered_event_type': 'MapStateEntered', 'state_exited_event_type': 'MapStateExited', 'result_path': (ResultPath| {'result_path_src': None}, 'result_selector': None, 'retry': None, 'catch': None, 'timeout': (TimeoutSeconds| {'timeout_seconds': 99999999, 'is_default': None}, 'heartbeat': None, 'name': 'map-document-urls', 'state_type': <StateType.Map: 22>, 'continue_with': <localstack.services.stepfunctions.asl.component.state.state_continue_with.ContinueWithNext object at 0xfffea1b42d50>, 'items_path': (ItemsPath| {'items_path_src': '$'}, 'item_reader': (ItemReader| {'parameters': (Parameters| {'payload_tmpl': (PayloadTmpl| {'payload_bindings': [(PayloadBindingValue| {'field': 'Bucket', 'value': (PayloadValueStr| {'val': 'clause-library-distributed-map-state'}}, (PayloadBindingPath| {'field': 'Key', 'path': '$.documentUrlsKey'}]}}, 'reader_config': (ReaderConfig| {'input_type': (InputType| {'input_type_value': <InputTypeValue.JSON: 'JSON'>}, 'max_items': (MaxItems| {'max_items': 100000000}, 'csv_header_location': None, 'csv_headers': None}, 'resource_output_transformer': (ResourceOutputTransformerJson| {}, 'resource': (ServiceResource| {'_region': '', '_account': '', 'resource_arn': 'arn:aws:states:::s3:getObject', 'partition': 'aws', 'service_name': 's3', 'api_name': 's3', 'api_action': 'getObject', 'condition': None}}), 'item_selector': None, 'parameters': None, 'max_concurrency': (MaxConcurrency| {'num': 10}, 'iteration_component': (DistributedItemProcessor| {'_start_at': 
(StartAt| {'start_at_name': 'load-document-data-task'}, '_states': (States| {'states': {'load-document-data-task': (StateTaskLambda| {'comment': None, 'input_path': (InputPath| {'input_path_src': '$'}, 'output_path': (OutputPath| {'output_path': '$'}, 'state_entered_event_type': 'TaskStateEntered', 'state_exited_event_type': 'TaskStateExited', 'result_path': None, 'result_selector': None, 'retry': (RetryDecl| {'retriers': [(RetrierDecl| {'error_equals': (ErrorEqualsDecl| {'error_names': [(CustomErrorName| {'error_name': 'Lambda.ClientExecutionTimeoutException'}, (CustomErrorName| {'error_name': 'Lambda.ServiceException'}, (CustomErrorName| {'error_name': 'Lambda.AWSLambdaException'}, (CustomErrorName| {'error_name': 'Lambda.SdkClientException'}]}, 'interval_seconds': (IntervalSecondsDecl| {'seconds': 2}, 'max_attempts': (MaxAttemptsDecl| {'attempts': 6}, 'backoff_rate': (BackoffRateDecl| {'rate': 2.0}, '_attempts_counter': 0, '_next_interval_seconds': 2}]}, 'catch': None, 'timeout': (TimeoutSeconds| {'timeout_seconds': 300, 'is_default': None}, 'heartbeat': None, 'parameters': None, 'name': 'load-document-data-task', 'state_type': <StateType.Task: 15>, 'continue_with': <localstack.services.stepfunctions.asl.component.state.state_continue_with.ContinueWithNext object at 0xfffea1b47b50>, 'resource': (LambdaResource| {'_region': 'us-east-1', '_account': '000000000000', 'resource_arn': 'arn:aws:lambda:us-east-1:000000000000:function:load-document-data', 'partition': 'aws', 'function_name': 'load-document-data'}}, 'is-old-document-deleted': (StateChoice| {'comment': None, 'input_path': (InputPath| {'input_path_src': '$'}, 'output_path': (OutputPath| {'output_path': '$'}, 'state_entered_event_type': 'ChoiceStateEntered', 'state_exited_event_type': 'ChoiceStateExited', 'default_state': (DefaultDecl| {'state_name': 'has-document-changed'}, '_next_state_name': None, 'name': 'is-old-document-deleted', 'state_type': <StateType.Choice: 16>, 'continue_with': 
<localstack.services.stepfunctions.asl.component.state.state_continue_with.ContinueWithNext object at 0xfffea1b45350>, 'choices_decl': (ChoicesDecl| {'rules': [(ChoiceRule| {'comparison': (ComparisonVariable| {'variable': (Variable| {'value': '$.documentMetadata.deleted'}, 'comparison_function': (ComparisonFunc| {'operator_type': <ComparisonOperatorType.BooleanEquals: 28>, 'value': True}}, 'next_stmt': (Next| {'name': 'update-document-store-sync-task'}}]}}, 'has-document-changed': (StateChoice| {'comment': None, 'input_path': (InputPath| {'input_path_src': '$'}, 'output_path': (OutputPath| {'output_path': '$'}, 'state_entered_event_type': 'ChoiceStateEntered', 'state_exited_event_type': 'ChoiceStateExited', 'default_state': (DefaultDecl| {'state_name': 'update-document-store-sync-task'}, '_next_state_name': None, 'name': 'has-document-changed', 'state_type': <StateType.Choice: 16>, 'continue_with': <localstack.services.stepfunctions.asl.component.state.state_continue_with.ContinueWithNext object at 0xfffea1b45510>, 'choices_decl': (ChoicesDecl| {'rules': [(ChoiceRule| {'comparison': (ComparisonVariable| {'variable': (Variable| {'value': '$.documentMetadata.changed'}, 'comparison_function': (ComparisonFunc| {'operator_type': <ComparisonOperatorType.BooleanEquals: 28>, 'value': True}}, 'next_stmt': (Next| {'name': 'store-document-task'}}]}}, 'update-document-store-sync-task': (StateTaskLambda| {'comment': None, 'input_path': (InputPath| {'input_path_src': '$'}, 'output_path': (OutputPath| {'output_path': '$'}, 'state_entered_event_type': 'TaskStateEntered', 'state_exited_event_type': 'TaskStateExited', 'result_path': None, 'result_selector': None, 'retry': (RetryDecl| {'retriers': [(RetrierDecl| {'error_equals': (ErrorEqualsDecl| {'error_names': [(CustomErrorName| {'error_name': 'Lambda.ClientExecutionTimeoutException'}, (CustomErrorName| {'error_name': 'Lambda.ServiceException'}, (CustomErrorName| {'error_name': 'Lambda.AWSLambdaException'}, (CustomErrorName| 
{'error_name': 'Lambda.SdkClientException'}]}, 'interval_seconds': (IntervalSecondsDecl| {'seconds': 2}, 'max_attempts': (MaxAttemptsDecl| {'attempts': 6}, 'backoff_rate': (BackoffRateDecl| {'rate': 2.0}, '_attempts_counter': 0, '_next_interval_seconds': 2}]}, 'catch': None, 'timeout': (TimeoutSeconds| {'timeout_seconds': 300, 'is_default': None}, 'heartbeat': None, 'parameters': None, 'name': 'update-document-store-sync-task', 'state_type': <StateType.Task: 15>, 'continue_with': <localstack.services.stepfunctions.asl.component.state.state_continue_with.ContinueWithEnd object at 0xfffea1b45ad0>, 'resource': (LambdaResource| {'_region': 'us-east-1', '_account': '000000000000', 'resource_arn': 'arn:aws:lambda:us-east-1:000000000000:function:update-document-store-sync', 'partition': 'aws', 'function_name': 'update-document-store-sync'}}, 'map-clauses': (StateMap| {'comment': (Comment| {'comment': 'Map over each extracted clause, embed them and store them in the intermediate table'}, 'input_path': (InputPath| {'input_path_src': '$'}, 'output_path': (OutputPath| {'output_path': '$'}, 'state_entered_event_type': 'MapStateEntered', 'state_exited_event_type': 'MapStateExited', 'result_path': (ResultPath| {'result_path_src': None}, 'result_selector': None, 'retry': None, 'catch': None, 'timeout': (TimeoutSeconds| {'timeout_seconds': 99999999, 'is_default': None}, 'heartbeat': None, 'name': 'map-clauses', 'state_type': <StateType.Map: 22>, 'continue_with': <localstack.services.stepfunctions.asl.component.state.state_continue_with.ContinueWithNext object at 0xfffea1b461d0>, 'items_path': (ItemsPath| {'items_path_src': '$'}, 'item_reader': (ItemReader| {'parameters': (Parameters| {'payload_tmpl': (PayloadTmpl| {'payload_bindings': [(PayloadBindingValue| {'field': 'Bucket', 'value': (PayloadValueStr| {'val': 'clause-library-distributed-map-state'}}, (PayloadBindingPath| {'field': 'Key', 'path': '$.clausesKey'}]}}, 'reader_config': (ReaderConfig| {'input_type': (InputType| 
{'input_type_value': <InputTypeValue.JSON: 'JSON'>}, 'max_items': (MaxItems| {'max_items': 100000000}, 'csv_header_location': None, 'csv_headers': None}, 'resource_output_transformer': (ResourceOutputTransformerJson| {}, 'resource': (ServiceResource| {'_region': '', '_account': '', 'resource_arn': 'arn:aws:states:::s3:getObject', 'partition': 'aws', 'service_name': 's3', 'api_name': 's3', 'api_action': 'getObject', 'condition': None}}), 'item_selector': None, 'parameters': None, 'max_concurrency': (MaxConcurrency| {'num': 10}, 'iteration_component': (DistributedItemProcessor| {'_start_at': (StartAt| {'start_at_name': 'store-intermediate-clause-task'}, '_states': (States| {'states': {'store-intermediate-clause-task': (StateTaskLambda| {'comment': None, 'input_path': (InputPath| {'input_path_src': '$'}, 'output_path': (OutputPath| {'output_path': '$'}, 'state_entered_event_type': 'TaskStateEntered', 'state_exited_event_type': 'TaskStateExited', 'result_path': None, 'result_selector': None, 'retry': (RetryDecl| {'retriers': [(RetrierDecl| {'error_equals': (ErrorEqualsDecl| {'error_names': [(CustomErrorName| {'error_name': 'Lambda.ClientExecutionTimeoutException'}, (CustomErrorName| {'error_name': 'Lambda.ServiceException'}, (CustomErrorName| {'error_name': 'Lambda.AWSLambdaException'}, (CustomErrorName| {'error_name': 'Lambda.SdkClientException'}]}, 'interval_seconds': (IntervalSecondsDecl| {'seconds': 2}, 'max_attempts': (MaxAttemptsDecl| {'attempts': 6}, 'backoff_rate': (BackoffRateDecl| {'rate': 2.0}, '_attempts_counter': 0, '_next_interval_seconds': 2}]}, 'catch': None, 'timeout': (TimeoutSeconds| {'timeout_seconds': 300, 'is_default': None}, 'heartbeat': None, 'parameters': None, 'name': 'store-intermediate-clause-task', 'state_type': <StateType.Task: 15>, 'continue_with': <localstack.services.stepfunctions.asl.component.state.state_continue_with.ContinueWithEnd object at 0xfffea1b44410>, 'resource': (LambdaResource| {'_region': 'us-east-1', '_account': 
'000000000000', 'resource_arn': 'arn:aws:lambda:us-east-1:000000000000:function:store-intermediate-clause', 'partition': 'aws', 'function_name': 'store-intermediate-clause'}}}}, '_comment': None, '_eval_input': None, '_job_pool': None, '_mutex': <unlocked _thread.lock object at 0xfffea1b45f00>, '_map_run_record': None, '_workers': [], '_processor_config': (ProcessorConfig| {'mode': <Mode.Distributed: 80>, 'execution_type': <ExecutionType.Standard: 82>}}}, 'extract-clauses-task': (StateTaskLambda| {'comment': None, 'input_path': (InputPath| {'input_path_src': '$'}, 'output_path': (OutputPath| {'output_path': '$'}, 'state_entered_event_type': 'TaskStateEntered', 'state_exited_event_type': 'TaskStateExited', 'result_path': None, 'result_selector': None, 'retry': (RetryDecl| {'retriers': [(RetrierDecl| {'error_equals': (ErrorEqualsDecl| {'error_names': [(CustomErrorName| {'error_name': 'Lambda.ClientExecutionTimeoutException'}, (CustomErrorName| {'error_name': 'Lambda.ServiceException'}, (CustomErrorName| {'error_name': 'Lambda.AWSLambdaException'}, (CustomErrorName| {'error_name': 'Lambda.SdkClientException'}]}, 'interval_seconds': (IntervalSecondsDecl| {'seconds': 2}, 'max_attempts': (MaxAttemptsDecl| {'attempts': 6}, 'backoff_rate': (BackoffRateDecl| {'rate': 2.0}, '_attempts_counter': 0, '_next_interval_seconds': 2}]}, 'catch': None, 'timeout': (TimeoutSeconds| {'timeout_seconds': 900, 'is_default': None}, 'heartbeat': None, 'parameters': None, 'name': 'extract-clauses-task', 'state_type': <StateType.Task: 15>, 'continue_with': <localstack.services.stepfunctions.asl.component.state.state_continue_with.ContinueWithNext object at 0xfffea1b41110>, 'resource': (LambdaResource| {'_region': 'us-east-1', '_account': '000000000000', 'resource_arn': 'arn:aws:lambda:us-east-1:000000000000:function:extract-clauses', 'partition': 'aws', 'function_name': 'extract-clauses'}}, 'store-document-task': (StateTaskLambda| {'comment': None, 'input_path': (InputPath| {'input_path_src': 
'$'}, 'output_path': (OutputPath| {'output_path': '$'}, 'state_entered_event_type': 'TaskStateEntered', 'state_exited_event_type': 'TaskStateExited', 'result_path': None, 'result_selector': None, 'retry': (RetryDecl| {'retriers': [(RetrierDecl| {'error_equals': (ErrorEqualsDecl| {'error_names': [(CustomErrorName| {'error_name': 'Lambda.ClientExecutionTimeoutException'}, (CustomErrorName| {'error_name': 'Lambda.ServiceException'}, (CustomErrorName| {'error_name': 'Lambda.AWSLambdaException'}, (CustomErrorName| {'error_name': 'Lambda.SdkClientException'}]}, 'interval_seconds': (IntervalSecondsDecl| {'seconds': 2}, 'max_attempts': (MaxAttemptsDecl| {'attempts': 6}, 'backoff_rate': (BackoffRateDecl| {'rate': 2.0}, '_attempts_counter': 0, '_next_interval_seconds': 2}]}, 'catch': None, 'timeout': (TimeoutSeconds| {'timeout_seconds': 300, 'is_default': None}, 'heartbeat': None, 'parameters': None, 'name': 'store-document-task', 'state_type': <StateType.Task: 15>, 'continue_with': <localstack.services.stepfunctions.asl.component.state.state_continue_with.ContinueWithNext object at 0xfffea1b43210>, 'resource': (LambdaResource| {'_region': 'us-east-1', '_account': '000000000000', 'resource_arn': 'arn:aws:lambda:us-east-1:000000000000:function:store-document', 'partition': 'aws', 'function_name': 'store-document'}}}}, '_comment': None, '_eval_input': <localstack.services.stepfunctions.asl.component.state.state_execution.state_map.iteration.itemprocessor.distributed_item_processor.DistributedItemProcessorEvalInput object at 0xfffed5fa2590>, '_job_pool': None, '_mutex': <unlocked _thread.lock object at 0xfffea1b43c40>, '_map_run_record': <localstack.services.stepfunctions.asl.component.state.state_execution.state_map.iteration.itemprocessor.map_run_record.MapRunRecord object at 0xfffea1d19c10>, '_workers': [], '_processor_config': (ProcessorConfig| {'mode': <Mode.Distributed: 80>, 'execution_type': <ExecutionType.Standard: 82>}}}'
In the LocalStack web view, the execution shows the following series of events:
ID | Type | Step | Started After | Date and Time |
---|---|---|---|---|
1 | ExecutionStarted | | 00:00:00.000 | Dec 5, 2023, 11:55:58.713 |
2 | TaskStateEntered | fetch-document-store-sync-task | 00:00:00.000 | |
3 | LambdaFunctionScheduled | fetch-document-store-sync-task | 00:00:00.001 | |
4 | LambdaFunctionStarted | fetch-document-store-sync-task | 00:00:00.001 | |
5 | LambdaFunctionSucceeded | fetch-document-store-sync-task | 00:00:04.489 | |
6 | TaskStateExited | fetch-document-store-sync-task | 00:00:04.489 | |
7 | TaskStateEntered | find-document-urls-task | 00:00:04.489 | |
8 | LambdaFunctionScheduled | find-document-urls-task | 00:00:04.489 | |
9 | LambdaFunctionStarted | find-document-urls-task | 00:00:04.489 | |
10 | LambdaFunctionSucceeded | find-document-urls-task | 00:00:09.257 | |
11 | TaskStateExited | find-document-urls-task | 00:00:09.258 | |
12 | MapStateEntered | map-document-urls | 00:00:09.258 | |
13 | MapStateStarted | map-document-urls | 00:00:09.258 | |
14 | MapRunStarted | map-document-urls | 00:00:09.258 | |
15 | MapRunFailed | map-document-urls | 00:00:09.277 | |
16 | MapStateFailed | map-document-urls | 00:00:09.279 | |
17 | ExecutionFailed | | 00:00:09.280 | Dec 5, 2023, 11:56:7.993 |
The event details for step 12 look correct for this input. I checked the S3 bucket, and the file referenced by documentUrlsKey does exist:
{
  "name": "map-document-urls",
  "input": {
    "documentStoreId": "656f55fc5014ac1a06cd837a",
    "documentStoreSyncId": "656f561b5014ac1a06cd83a3",
    "connectionParams": {
      "type": "INTERNAL"
    },
    "documentUrlsKey": "656f55fc5014ac1a06cd837a/656f561b5014ac1a06cd83a3/documentUrls.json"
  },
  "inputDetails": {
    "truncated": false
  }
}
The event details for step 13, however, are clearly wrong: they report a length of 0, although the JSON file actually contains an array of 3 items.
{
"length": 0
}
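To double-check the item count independently of the Map state, the object can be read straight from LocalStack and counted. This is a minimal sketch, not part of the original report: the bucket and key are taken from the log and event details above, while the endpoint URL (derived from the compose port mapping), the region, the dummy credentials, and the function names `count_items` and `fetch_body` are assumptions for illustration.

```python
import json

# Values taken from the log output and the step 12 event details.
BUCKET = "clause-library-distributed-map-state"
KEY = "656f55fc5014ac1a06cd837a/656f561b5014ac1a06cd83a3/documentUrls.json"
# Assumption: the LocalStack gateway is reachable on the port mapped in the compose file.
ENDPOINT_URL = "http://localhost:4566"


def count_items(body: bytes) -> int:
    """Return the length of the top-level JSON array in an S3 object body."""
    data = json.loads(body)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    return len(data)


def fetch_body(bucket: str, key: str) -> bytes:
    """Download the object from LocalStack (requires boto3)."""
    import boto3  # imported lazily so count_items works without boto3 installed

    s3 = boto3.client(
        "s3",
        endpoint_url=ENDPOINT_URL,
        region_name="us-east-1",        # assumption: region seen in the Lambda ARNs
        aws_access_key_id="test",       # assumption: dummy local credentials
        aws_secret_access_key="test",
    )
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()


if __name__ == "__main__":
    print(count_items(fetch_body(BUCKET, KEY)))
```

If this prints 3 while MapStateStarted still reports a length of 0, the file itself is fine and the problem lies in how the Map state reads or processes it.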
Steps 14, 15, and 16 all have empty event details ({}).
Step 17 shows the error:
{
  "error": "States.Runtime",
  "cause": "cannot pickle '_thread.lock' object"
}
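For context on the error message itself: Python raises this kind of `TypeError` whenever an object that holds a `threading.Lock` is pickled or deep-copied, and the log dump shows a `_mutex` lock on the item processor. The sketch below reproduces that failure in isolation and shows one common way to make such an object copyable. The class names and `try_copy` are hypothetical; this is not LocalStack code, and whether LocalStack copies the processor this way is an assumption.

```python
import copy
import threading


class ProcessorSketch:
    """Hypothetical stand-in for an object that owns a lock, similar to the
    `_mutex` field visible in the log dump. Not LocalStack code."""

    def __init__(self, name):
        self.name = name
        self._mutex = threading.Lock()


class CopyableProcessorSketch(ProcessorSketch):
    """Same object, but the lock is left out of the copied state and
    recreated on the copy."""

    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop("_mutex", None)  # locks cannot be pickled or deep-copied
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._mutex = threading.Lock()  # fresh, unlocked lock for the copy


def try_copy(obj):
    """Return (copy, None) on success or (None, error message) on failure."""
    try:
        return copy.deepcopy(obj), None
    except TypeError as exc:
        return None, str(exc)


if __name__ == "__main__":
    # Fails: deepcopy reaches the lock and raises a TypeError about pickling it.
    print(try_copy(ProcessorSketch("map-document-urls"))[1])
    # Works: the lock is excluded from the state and rebuilt afterwards.
    clone, err = try_copy(CopyableProcessorSketch("map-document-urls"))
    print(clone.name, err)
```

`copy.deepcopy` falls back to the same `__reduce_ex__` protocol that `pickle` uses, which is why a deep copy can fail with a message about pickling.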
Expected Behavior
The JSON file in the S3 bucket should be read correctly, and the individual items of its array should be distributed among the Lambda function(s) running the next step of the state machine.
How are you starting LocalStack?
With a docker-compose file
Steps To Reproduce
How are you starting localstack (e.g., bin/localstack command, arguments, or docker-compose.yml)
docker compose
localstack:
  container_name: localstack
  image: localstack/localstack-pro:latest
  ports:
    - "127.0.0.1:4566:4566" # LocalStack Gateway
    - "127.0.0.1:4510-4559:4510-4559" # external services port range
    # - "127.0.0.1:53:53" # DNS config (required for Pro)
    # - "127.0.0.1:53:53/udp" # DNS config (required for Pro)
    # - "127.0.0.1:443:443" # LocalStack HTTPS Gateway (required for Pro)
  environment:
    - DEBUG=${DEBUG-}
    - PERSISTENCE=${PERSISTENCE-}
    - LOCALSTACK_AUTH_TOKEN=${LOCALSTACK_AUTH_TOKEN-} # only required for Pro
    - ACTIVATE_PRO=${LOCALSTACK_AUTH_TOKEN-0} # only required for Pro
    - DOCKER_HOST=unix:///var/run/docker.sock
    - LAMBDA_RUNTIME_ENVIRONMENT_TIMEOUT=30
  volumes:
    - "${LOCALSTACK_VOLUME_DIR:-./volume}:/var/lib/localstack"
    - "/var/run/docker.sock:/var/run/docker.sock"
    - "./clause-pipeline/localstackready:/etc/localstack/init/ready.d" # Waits for RDS to install postgres, then installs pg-vector
  labels:
    - logging=promtail
Client commands (e.g., AWS SDK code snippet, or sequence of "awslocal" commands)
CDK code
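The CDK code itself is not included here. As a reproduction aid, the following is a rough sketch of what the outer Map state might look like in Amazon States Language, reconstructed from the parsed state shown in the log above. It is heavily simplified and an assumption rather than the actual definition: the item processor is reduced to a single Lambda task, retries, timeouts, and the nested `map-clauses` state are omitted, the Map state is made terminal, and `build_definition` is a hypothetical helper name.

```python
import json


def build_definition() -> dict:
    """Build a simplified ASL definition with one distributed Map state."""
    return {
        "StartAt": "map-document-urls",
        "States": {
            "map-document-urls": {
                "Type": "Map",
                "MaxConcurrency": 10,
                # Read the items from a JSON array stored in S3.
                "ItemReader": {
                    "Resource": "arn:aws:states:::s3:getObject",
                    "ReaderConfig": {"InputType": "JSON"},
                    "Parameters": {
                        "Bucket": "clause-library-distributed-map-state",
                        "Key.$": "$.documentUrlsKey",
                    },
                },
                "ItemProcessor": {
                    "ProcessorConfig": {
                        "Mode": "DISTRIBUTED",
                        "ExecutionType": "STANDARD",
                    },
                    "StartAt": "load-document-data-task",
                    "States": {
                        # Simplification: the real workflow continues after this task.
                        "load-document-data-task": {
                            "Type": "Task",
                            "Resource": "arn:aws:lambda:us-east-1:000000000000:function:load-document-data",
                            "End": True,
                        }
                    },
                },
                # Simplification: the real Map state has a Next state.
                "End": True,
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_definition(), indent=2))
```

The printed JSON can be passed as the definition when creating a state machine, with an execution input that contains `documentUrlsKey` as in the step 12 event details.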
Environment
- OS: macOS Ventura 13.5 (MacBook Pro)
- LocalStack: localstack/localstack-pro:3.0.2 , localstack/localstack-pro:latest
Anything else?
No response