2021-05-03 Idempotence ideas


Record of requests

It seems the standard way of handling idempotence is to keep a record of a unique identifier for each request, eg the AWS request id. This protects against a function being run twice after it reached completion, but does not clean up a request that halted halfway through.
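A minimal sketch of this record-of-requests idea, using an in-memory dict in place of a real store (the decorator name and the event's requestId field are assumptions):

```python
# Sketch only: a real store would be DynamoDB or a cache, and the id would
# come from the invocation context or message metadata.
def idempotent(store):
    def wrap(handler):
        def wrapped(event, context):
            request_id = event["requestId"]   # assumed unique, retry-stable
            if request_id in store:
                return store[request_id]      # duplicate: skip re-processing
            result = handler(event, context)
            store[request_id] = result        # recorded only on completion,
            return result                     # so a halted run leaves no record
        return wrapped
    return wrap
```

Because the id is only recorded on completion, this has the limitation noted above: a request that halts halfway leaves no record and will simply run again in full.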

Re-process multiple requests

If we can code all sections of a Lambda so that they do not alter previously processed data/outputs, then we could re-process an entire Lambda, only actioning outputs that were not processed in a previous invocation.

DynamoDB CRUD

Try to design all queries that adjust data to not make any changes to existing data that might have been already processed.

Conditional expressions

PutItem can add a ConditionExpression with attribute_not_exists so if a record already exists nothing happens.
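A hedged sketch of the PutItem parameters (table and key names are made up); on a duplicate the write raises ConditionalCheckFailedException, which the caller can treat as already done:

```python
def put_if_absent_params(table, key_name, item):
    """Build PutItem parameters that do nothing if the record already exists."""
    return {
        "TableName": table,
        "Item": item,
        # Condition on the partition key: it always exists on a stored item,
        # so the put only succeeds for a brand new record.
        "ConditionExpression": f"attribute_not_exists({key_name})",
    }
```

The dict would be passed straight to a boto3 client, eg `client.put_item(**put_if_absent_params("Orders", "orderId", item))`.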

An Update query could have a ConditionExpression that tests a value, such as a timestamp, to make sure the data being updated is newer than the existing version. That timestamp would need to be sent with the request so that it does not change per invocation.
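A sketch of those UpdateItem parameters (attribute names are assumptions):

```python
def update_if_newer_params(table, key, new_value, request_ts):
    """Build UpdateItem parameters that only apply when the request's
    timestamp is newer than the stored one. request_ts is taken from the
    original request, so it is identical across retried invocations."""
    return {
        "TableName": table,
        "Key": key,
        "UpdateExpression": "SET #v = :v, #ts = :ts",
        # Also allow the first write, when no timestamp exists yet.
        "ConditionExpression": "attribute_not_exists(#ts) OR #ts < :ts",
        "ExpressionAttributeNames": {"#v": "payload", "#ts": "updatedAt"},
        "ExpressionAttributeValues": {":v": new_value, ":ts": request_ts},
    }
```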

Delete is probably idempotent already, although if a record can be deleted then re-added, a race condition could occur. A timestamp ConditionExpression could be used to protect against this too, checking that the time the Delete request was created is after the existing record's creation timestamp.
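The same pattern for DeleteItem (the createdAt attribute name is an assumption):

```python
def delete_if_older_params(table, key, request_ts):
    """Build DeleteItem parameters that only delete a record created before
    the delete request was issued, guarding the delete-then-re-add race."""
    return {
        "TableName": table,
        "Key": key,
        "ConditionExpression": "createdAt < :ts",
        "ExpressionAttributeValues": {":ts": request_ts},
    }
```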

S3

.. I understand we could also add an attribute here, eg a timestamp, as object metadata, and compare it before overwriting.
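One way this could look, sketched against a minimal wrapper interface rather than boto3 directly (the `head`/`put` method names are assumptions). Note S3 gives us no conditional write here, so a read-then-write race remains between the check and the put:

```python
def put_if_newer(store, bucket, key, body, request_ts):
    """Overwrite an object only if the incoming request timestamp is newer
    than the timestamp stored in the existing object's metadata.

    store wraps S3: head(bucket, key) returns the metadata dict or None,
    put(bucket, key, body, metadata) writes the object."""
    existing = store.head(bucket, key)
    if existing and int(existing.get("request-ts", 0)) >= request_ts:
        return False                          # stale retry: skip the write
    store.put(bucket, key, body, {"request-ts": str(request_ts)})
    return True
```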

neo4j

Most data will be immutable, we could use timestamps to ensure any changes only occur if newer than existing data.
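A hedged sketch of that timestamp guard as a Cypher query built in Python (label and property names are assumptions); it would be executed via the neo4j driver's `session.run(query, params)`:

```python
def update_if_newer_cypher(node_id, props, ts):
    """Build a Cypher query that only updates a node when the incoming
    timestamp is newer than the stored one; otherwise it matches nothing
    and the write is a no-op."""
    query = (
        "MATCH (n:Record {id: $id}) "
        "WHERE n.updatedAt < $ts "
        "SET n += $props, n.updatedAt = $ts"
    )
    return query, {"id": node_id, "props": props, "ts": ts}
```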

Other unused ideas

elasticache or other cache service

  • does it have to be in a VPC, if yes consider placing all Lambdas in the VPC
  • have middleware with params: useIdempCache (whether the function uses the cache) and timeToLive (how long before cache entries are cleaned up)
  • prob need a global object that uses the settings and has methods for checking and adding cache entries
  • middleware creates a hash of the request and correlation ids (probably specific ones, because retry messages might make changes) to identify each message
  • is there a message id we can use that persists over retries?
  • the AWS request id is not enough in case the same request sends multiple messages to this Lambda, and might even send multiple identical messages; unlikely, but try to find a unique message id to allow for this
  • use the hash to mark each cache entry, prefixed with the Lambda function name so we can share one cache across all Lambdas in one service
  • when about to do any activity (CRUD, sending a message), check the cache for that request and step; if an entry exists, skip it
  • if the logic is computationally intensive and there is a high chance of a doubled-up action, a check could also be done before the logic
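The hashing bullets above could be sketched like this (function and parameter names are assumptions):

```python
import hashlib
import json

def idemp_cache_key(function_name, request, correlation_ids):
    """Hash the request plus its retry-stable correlation ids, prefixed with
    the Lambda function name so one cache can serve a whole service."""
    payload = json.dumps(
        {"request": request, "ids": correlation_ids},
        sort_keys=True,  # stable key ordering so retries hash identically
    )
    digest = hashlib.sha256(payload.encode()).hexdigest()
    return f"{function_name}:{digest}"
```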

There is the chance of a race condition between checking the cache, performing the action, and saving the cache entry (the important window is between checking the cache and performing the action, because updating the cache would be redundant in a race condition). Not sure how to mitigate this, but maybe:

  • still have logic-level idempotent code, eg Conditional Expressions
  • perhaps have a start cache entry and an end cache entry, and add a unique id generated in this invocation so we can differentiate doubled-up invocations
  • maybe whenever checking the cache also check if another process has started, and be more careful there
  • maybe whenever checking the cache also check if another request has ended; if we find one we can exit processing and not retry
  • perhaps check if another request ended first: if yes exit, if no then check whether another request started; if yes stop this request and place it in a retry queue
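The start/end-entry idea above, sketched with an in-memory dict standing in for ElastiCache (a real cache would need an atomic set-if-absent, eg Redis SETNX; the class and method names are assumptions):

```python
import uuid

class IdempGuard:
    """Start/end cache entries per step, tagged with a per-invocation id so
    doubled-up invocations can be told apart. Sketch only."""

    def __init__(self, cache):
        self.cache = cache                    # stand-in for ElastiCache
        self.instance = str(uuid.uuid4())     # unique to this invocation

    def begin(self, step_key):
        if self.cache.get(step_key + ":end"):
            return "done"    # another request finished: exit, do not retry
        # setdefault mimics an atomic set-if-absent (SETNX in Redis)
        started = self.cache.setdefault(step_key + ":start", self.instance)
        if started != self.instance:
            return "retry"   # another request is in flight: retry later
        return "run"         # we own this step: perform the action

    def finish(self, step_key):
        self.cache[step_key + ":end"] = self.instance
```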

Split code into many Lambdas

The idea is that if each section of code is its own Lambda, then idempotence considerations only have to be contained within that small section of code, eg when re-processing a function that halted halfway, less code is re-processed.

Deciding against this because all code still needs to be designed to be idempotent, re-processing should be reasonably rare, and the additional resource use of spreading across many functions would be expensive for little to no gain.

It might also mean multiple requests being passed on to subsequent Lambdas, because when a Lambda fails we have no way of knowing if it already triggered the next task, so we need to send again (I believe this is equivalent to larger Lambdas, which would need to process everything again?)