Communication between services
Latest revision as of 14:28, 17 April 2021
Overview
How to pass tasks between services
Communication between backend services
In and Out topics
- Each AWS account is limited to 100,000 SNS topics and 200 subscription filter policies
- reference: https://www.jeremydaly.com/how-to-use-sns-and-sqs-to-distribute-and-throttle-events/
One SNS topic per action
Because the filter policy limit in AWS is so small, we should avoid using filter policies. That leaves two options: shared topics that carry many message types, where every message triggers all subscribers and each subscriber must filter out the messages it does not process, or a separate SNS topic for each message type.
We will try the separate-topic design. It will result in a large number of SNS topics but will reduce SQS messages and Lambda invocations.
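The trade-off above can be illustrated with a rough count of Lambda invocations under each design. This is only a sketch in Python: the service names, message types and function names are hypothetical, and it assumes one Lambda invocation per delivered message (no batching).

```python
from collections import defaultdict


def shared_topic_invocations(messages, subscribers):
    """Shared topic: every message triggers every subscriber,
    and each subscriber filters in its own code."""
    return len(messages) * len(subscribers)


def separate_topic_invocations(messages, subscribers):
    """One topic per message type: a message only triggers the
    subscribers that subscribed to that type's topic."""
    subscribers_per_type = defaultdict(int)
    for wanted_types in subscribers.values():
        for message_type in wanted_types:
            subscribers_per_type[message_type] += 1
    return sum(subscribers_per_type[message_type] for message_type in messages)


# Hypothetical example data: which message types each service cares about.
subscribers = {
    "billing": {"OrderCreated"},
    "email": {"OrderCreated", "UserRegistered"},
    "audit": {"UserRegistered"},
}
messages = ["OrderCreated", "UserRegistered", "OrderCreated"]

shared = shared_topic_invocations(messages, subscribers)
separate = separate_topic_invocations(messages, subscribers)
```

With this sample data the shared design triggers every subscriber for every message, while the separate design only triggers the subscribers that actually want each message type.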
Naming the topics
Name each topic either "In{description}" or "Out{description}".
When to use In or Out
- Sometimes services have a parent-child relationship: the parent service is deployed first, and child services that depend on it are deployed later. In this situation the child services might subscribe to the parent's Out topic to receive messages from the parent, and publish to the parent's In topic to send messages to it
- Use In when we expect a large number of services to send requests to one Lambda, e.g. ComplexFilter'ing
- Use Out when a large number of services will be waiting for the response of one Lambda (often filtered so that each only receives responses to requests it originated)
- Often they are paired in a flow: the request goes to an In topic in one service, and the output goes to an Out topic in the same service
- Also consider which service should more logically know the endpoint. If it is the sending service, it sends to the receiving service's In topic. If it is the receiving service, it subscribes to the sending service's Out topic
- Another way to decide is to consider whether the message is a notification that some work has finished (place it in the service's own Out topic), or a task being sent to a specific place to continue processing (place it in the other service's In topic)
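The last rule of thumb can be written down as a small decision helper. This is a sketch only: the function name, the "notification"/"task" labels and the service names in the usage note are hypothetical and just encode the rule as stated above.

```python
def choose_topic(message_kind, own_service, target_service=None):
    """Return (service, direction) describing where a message should go.

    "notification": some work finished, publish to our own Out topic.
    "task": work sent to a specific place, publish to the target's In topic.
    """
    if message_kind == "notification":
        return (own_service, "Out")
    if message_kind == "task":
        if target_service is None:
            raise ValueError("a task needs a target service")
        return (target_service, "In")
    raise ValueError(f"unknown message kind: {message_kind!r}")
```

For example, `choose_topic("task", "Orders", "Billing")` points at the In topic of the hypothetical Billing service, while `choose_topic("notification", "Orders")` points at the Orders service's own Out topic.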
Deploying/changing queue subscriptions
- Subscribing a receiving service's local SQS queue to an external Out topic might be tricky; for now we can hardcode the subscription in the initial setup
- The Config table could hold logic that handles changing subscriptions
- When a service sends messages to another service's In topic, it can use the topic's name, built from a record in the sending service's Config table. Usually only the ServiceName is needed in the Config table, and the topic's descriptive name is appended to it to get the full topic name
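A minimal sketch of building a full topic name from a Config table record. The exact format is not specified on this page, so plain concatenation of ServiceName, In/Out and the description (with no separator) is an assumption, as are the function name and the sample record.

```python
def build_topic_name(config_record, direction, description):
    """Build a full topic name from a Config table record.

    Assumption: the full name is ServiceName + "In"/"Out" + description,
    concatenated without a separator.
    """
    if direction not in ("In", "Out"):
        raise ValueError('direction must be "In" or "Out"')
    return f"{config_record['ServiceName']}{direction}{description}"


# Hypothetical Config table record for the target service.
billing_config = {"ServiceName": "Billing"}
topic_name = build_topic_name(billing_config, "In", "CreateInvoice")
```

If the real naming scheme uses a separator or a stage prefix, only the format string needs to change.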
Serverless framework deploying queues
(we will be using very few filters now)
- The first deploy will fail on an SNS subscription if it has filters. Comment the filters out for the first deploy, then add them back in a second deploy
Direct Lambda invocations
- Not as common as the In/Out queue flow
- The queue flow ensures safer delivery of requests, can be paused (messages that come in accumulate), and lets us hook in additional services to monitor the messages
- A possibility for flows that need to be heavily optimized
- Sometimes useful for synchronous requests, e.g. an Authorizer that is called on each API Gateway request to check RBAC
Direct send to SQS queue that triggers Lambda
- If the communication is part of a single unit of work, we want to be more efficient, and we are sure we only ever want one Lambda to receive the request, we could skip the SNS topic and place the message directly on the Lambda's SQS queue
Return values from Lambda functions
Return values are managed in the handler Lambda function. The return value differs depending on the type of resource making the request:
API Handlers
- Use the response function that sets up StatusCode/Headers/Body
- I believe the entire response object is placed into a "body" parameter of the response sent to the client
- API Gateway adds its own status code. I think this is always 200 if the Lambda function was invoked (even if it throws an error), although it might map the status code from the response now
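A minimal sketch of such a response function, assuming API Gateway's Lambda proxy integration, which expects a `statusCode`, `headers` and a string `body`. The function name and the default header are hypothetical, not the project's actual helper.

```python
import json


def build_response(status_code, body, headers=None):
    """Build a response object in the shape the Lambda proxy
    integration expects; the body must be a string."""
    return {
        "statusCode": status_code,
        # Assumed default header; caller-supplied headers override it.
        "headers": {"Content-Type": "application/json", **(headers or {})},
        "body": json.dumps(body),
    }


ok_response = build_response(200, {"ok": True})
```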
Direct invoke
- If invoked async the response is discarded, but some handlers receive requests both sync and async, so it is OK to create response objects
- The response is placed into an object with a StatusCode property (200 if the Lambda was invoked successfully; the code is likely to differ if the Lambda fails before invoking, e.g. concurrency limits)
- The response itself is placed into the Payload property, stringified, so it needs JSON.parse
- Our middleware formats the response into (err, response) before returning
- If the Lambda throws an error (e.g. middleware validation) the response is placed into err
- Try to throw errors for anything unexpected, e.g. config/validation type errors. Other, expected checks should return in the response
- I believe we tested that returning an Error object from the Lambda is the same as throwing an error. A callback with an Error as its first argument is also treated as an error
- Tested that return (x,y) only returns y (the last variable); maybe our middleware is messing with this
- If the Lambda returns or throws an Error object, or calls back with an error as the first argument, Lambda adds a top-level FunctionError property (e.g. "Unhandled") to the response, and Payload is the error object
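The (err, response) convention described above could be sketched as follows. This is an illustration in Python, not our actual middleware: the function name is hypothetical, and it assumes the Payload has already been read into a JSON string (some SDKs return it as a stream or bytes that must be read first).

```python
import json


def unpack_invoke_result(result):
    """Turn a direct-invoke result into an (err, response) pair.

    `result` is assumed to be a dict with StatusCode, an optional
    FunctionError marker, and Payload as a JSON string.
    """
    raw_payload = result.get("Payload")
    payload = json.loads(raw_payload) if raw_payload else None
    if result.get("FunctionError"):
        # The function returned or threw an error: Payload is the error object.
        return (payload, None)
    return (None, payload)


success = unpack_invoke_result({"StatusCode": 200, "Payload": '{"total": 3}'})
failure = unpack_invoke_result({
    "StatusCode": 200,
    "FunctionError": "Unhandled",
    "Payload": '{"errorMessage": "validation failed"}',
})
```

Note that StatusCode is 200 in both cases, so callers should check the error marker rather than the status code.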
SQS triggers Lambda
When an SQS queue triggers a Lambda function, a batch of messages can be sent. If the Lambda throws an error or times out, all messages in the batch are retried; otherwise all are deleted.
Our method of managing this effectively is to return normally from the Lambda function and manually re-add any failed messages to the SQS queue, or send them to the DLQ after too many retries.
If re-adding a message to the SQS queue fails, we could throw an error from the handler function, meaning all messages get retried using the queue's retry configuration. This would ensure no processes are lost, but the function would need to be idempotent (which it always should be anyway).
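A minimal sketch of this batch-handling approach. The retry limit, the `retries` counter stored on each record, and the `process`, `requeue` and `send_to_dlq` callables are all hypothetical stand-ins for the real processing logic and SQS calls.

```python
MAX_RETRIES = 3  # hypothetical limit before a message goes to the DLQ


def handle_batch(records, process, requeue, send_to_dlq):
    """Process each record; failures are re-added or dead-lettered.

    Returns normally so the successfully processed messages are deleted.
    If requeue or send_to_dlq raises, the exception propagates and the
    whole batch is retried, so `process` must be idempotent.
    """
    for record in records:
        try:
            process(record)
        except Exception:
            retries = record.get("retries", 0)  # hypothetical counter field
            if retries >= MAX_RETRIES:
                send_to_dlq(record)
            else:
                requeue({**record, "retries": retries + 1})
```

In a real handler, `requeue` and `send_to_dlq` would wrap the SQS send calls, and the retry count would more likely travel in a message attribute than in the body.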