2021-10-17 - Search Result flow


Service - Search Results has some tasks which must be performed by Service - Search Result (handlers), for example FindData.

RequestSearchResult

  • handler: Api, Inv
  • old code: endpoint/Main.requestSearchResult
  • can split into 2 functions if wanted, RequestSearchResult and RequestSearchResultSorted, splitting out code that has sortFields or not, if do this need to pull out shared code into library functions
  • receives request JSON, includes below params:
  1. searchType (product / sellOffer / variantProduct / ..)
  2. requestProperties (object containing browseQuantity / languageId/s / locationNodeId/s / ..)
  3. requiredData
  4. filter
  5. sortFields (optional)
  • returns to the client sortResultId, or searchResultId and searchDetailId (depending on request)
  • sortResultId: if request also includes sortFields then need to return sortResultId, final data will be at SortResult service (client polls SortResult to see if finished)
  • searchResultId and searchDetailId: no sortFields specified, final results will be pulled from SearchResultData (client polls SearchResults to see if finished)
  • also return status, in case data is ready for client immediately, or has error
  • if has error, return the error detail too
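
As a rough illustration of the return rules above, the sketch below picks which identifiers go back to the client depending on whether sortFields is present. The function name build_response, the ids argument, and the errorDetail key are hypothetical names chosen for this example; only the id and status fields come from the notes.

```python
from typing import Any, Optional


def build_response(request: dict[str, Any], ids: dict[str, str],
                   status: str, error: Optional[str] = None) -> dict[str, Any]:
    """Pick the identifiers to hand back, based on whether sortFields is set."""
    if request.get("sortFields"):
        # Sorted request: the client polls the SortResult service.
        response: dict[str, Any] = {"sortResultId": ids["sortResultId"]}
    else:
        # Unsorted request: the client polls SearchResults.
        response = {
            "searchResultId": ids["searchResultId"],
            "searchDetailId": ids["searchDetailId"],
        }
    response["status"] = status
    if status == "error":
        # "errorDetail" is an assumed key name, not one defined in these notes.
        response["errorDetail"] = error
    return response
```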

Find filter_id

  • before we can save/query SearchResultMain table we need to build searchResultId from searchType and filterMainId. filterMainId we want to match Complex Filter response, which might change from original JSON sent to Complex Filter (when we clean it), so need to do a request to ComplexFilter parse_only/requestFilterMainId to generate the correct filterMainId (ComplexFilterMain does not do any processing in that request)
  • (not yet) We could cache filter object and resulting filterMainId locally, eg in LambdaCache to make this step faster, if we expect the same filter object requests will be sent regularly
  • (not yet) We could break processing here and handle the filterMainId result from ComplexFilter as a serverless flow… a little messy because we do not have a searchResultId yet, so we would need some temporary state to match the initial request filter object. Also consider the additional cost of queues. Double charging Lambdas might be cheaper, and the code is easier to deal with
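
A minimal sketch of this step, assuming searchResultId is searchType and filterMainId joined with an underscore (the format used for the SearchResultMain partition key in ProcessComplexFilterComplete). The local cache reflects the "(not yet)" idea above and is optional; request_filter_main_id is a hypothetical stand-in for the call to ComplexFilter.

```python
import json
from typing import Callable

# Hypothetical in-process cache: serialized filter object -> filterMainId.
_filter_cache: dict[str, str] = {}


def find_filter_main_id(filter_obj: dict,
                        request_filter_main_id: Callable[[dict], str]) -> str:
    """Ask ComplexFilter for the filterMainId, reusing a cached answer if any."""
    # sort_keys makes the cache key independent of key order in the request.
    key = json.dumps(filter_obj, sort_keys=True)
    if key not in _filter_cache:
        _filter_cache[key] = request_filter_main_id(filter_obj)
    return _filter_cache[key]


def build_search_result_id(search_type: str, filter_main_id: str) -> str:
    """Join searchType and filterMainId (assumed format)."""
    return f"{search_type}_{filter_main_id}"
```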

Check have existing SortResult?

  • if request has sortFields send a request to SortResult service to see if have data completed there already and not expired?
  • If no results at SortResult:
    • SortResult should save a SortResultMain record with status = "waitingForSearchResults"
    • SortResult returns status "newSortResultCreated"
    • continue with flow
  • If has result at SortResult, status "waitingForSearchResults":
    • do not need to process further
    • return sortResultId
  • If has result at SortResult, status "error":
    • do not need to process further
    • return sortResultId and status and error details
  • If has result at SortResult, status "processing":
    • do not need to process further
    • return sortResultId
  • If has result at SortResult, status "complete", but expired:
    • delete all data at SortResult
    • SortResult returns status "newSortResultCreated"
    • continue with flow
  • If has result at SortResult, status "complete" and not expired:
    • do not need to process further
    • return sortResultId
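
The cases above reduce to one question: does RequestSearchResult continue, or return the sortResultId now? A sketch of that decision follows, where status None means SortResult had no record. The function name and the None convention are assumptions for illustration.

```python
from typing import Optional

# Statuses where SortResult already has a record we can point the client at.
STOP_STATUSES = {"waitingForSearchResults", "processing", "error"}


def should_continue_flow(status: Optional[str], expired: bool = False) -> bool:
    """True if RequestSearchResult should carry on after asking SortResult."""
    if status is None:
        # No record: SortResult creates one and reports "newSortResultCreated".
        return True
    if status == "complete":
        # Expired data is deleted and rebuilt; fresh data is returned as-is.
        return expired
    if status in STOP_STATUSES:
        return False
    raise ValueError(f"unexpected SortResult status: {status!r}")
```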

Returning a result to API client and further processing

  • If request has sortFields we can return the sortResultId to the client at this point, still need to process next stage (ProcessSearchResult) but can invoke async, not wait for response
  • If request does not have sortFields we want to prepare and return to client searchResultId and searchDetailId, which will need to wait for response from next stage (ProcessSearchResult)
  • So processing from this point is in a separate lambda (ProcessSearchResult), sortFields requests can invoke it async then quickly return sortResultId to client, stopping the execution of RequestSearchResult Lambda

ProcessSearchResult

  • handler: Inv (both sync and async)
  • old code: endpoint/Process.processSearchResult
  • receives request from RequestSearchResult lambda, pass RequestSearchResult’s full request object as argument, as well as filterMainId and sortResultId found in RequestSearchResult
  • if RequestSearchResult is a sortFields request will be invoked async and the return value is not important
  • if RequestSearchResult is not a sortFields request, will be invoked sync and will return result to RequestSearchResult
  • returns searchResultId and searchDetailId, and status, and error details if status = error
  • if request also includes sortFields then also return the sortResultId

Check request

  • for each childSearchResults
    • check if any of the childSearchResults.requiredData are found in the parent's requiredData request; if not, no need to process this childSearchResults (the parent complex filter will return dataIds that include any child complex filters, and the parent does not need any searchResult fields from the child) (what about searchResult types that have hooks, eg variantProduct? They might require the child searchResult to exist to work properly)
    • check all passOnRequestProperties are included in parent's request, if not throw error
    • pull out sub complex filter/s, throw error if not found; multiple sub complex filters may exist for this childSearchType. In future I think we can accept no filter, which will find all records of the child's type, maybe handled as a special case, perhaps skipping complex filter?
  • store in childSearchIds variable ready to save into SearchResultMain
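
The first two checks could look roughly like this, assuming requiredData and passOnRequestProperties are plain lists of field names and requestProperties is a dict. Those shapes, and both function names, are assumptions made for the example.

```python
def child_needs_processing(child_required_data: list[str],
                           parent_required_data: list[str]) -> bool:
    """A child is only processed if the parent asked for one of its fields."""
    return any(field in parent_required_data for field in child_required_data)


def check_pass_on_properties(pass_on: list[str],
                             parent_request_properties: dict) -> None:
    """Raise if the parent request lacks a property the child needs."""
    missing = [name for name in pass_on if name not in parent_request_properties]
    if missing:
        raise ValueError(f"missing passOnRequestProperties: {missing}")
```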

Check have existing SearchResultMain?

  • checks to see if have existing record in SearchResultMain table and not expired
  • If have record, not expired, status error, and sortFields set:
    • need to invoke function at SortResult service to update status to error, send reason for error too, save in SortResultMain table too, SortResult function returns sortResultId
    • return sortResultId, searchResultId and searchDetailId, status and error detail (and stop processing)
  • If have record, not expired, status error, sortFields not set:
    • return searchResultId and searchDetailId, status and error detail (and stop processing)
  • If have record, not expired, status complete, and sortFields set:
    • to get here means SortResult does not have completed data, but SearchResults does
    • basically want to invoke CompleteSearchResultMain function, but there is code there not needed here (eg finding parent records, because we are working from the only parent record that needs to be dealt with because sortFields is set), so maybe create a new function that is a step after this that triggers sending data to SortResult service
    • is already set to "complete" status
    • needs to pass to SortResults to do its processing
    • not need to add a complete message to message out topic for this SearchResultMain, because not completed at this time (was already complete)
    • return value is not that important because is processing sortFields so probably async
    • return searchResultId and searchDetailId (and stop processing)
  • If have record, not expired, status complete, and sortFields not set:
    • I think cannot invoke CompleteSearchResultMain function because if is a child and new parent is waiting to be processed we can't trigger parent now, parent has not yet set child id links, need to check that later after parent receives response from this function
    • not need to add a complete message to message out topic for this SearchResultMain, because not completed at this time (was already complete)
    • return searchResultId and searchDetailId (and stop processing)
  • If have record, not expired, status processing
    • further work will happen automatically when (already started) other stages finish
    • not important if sortFields set, if set then return value probably not used
    • return searchResultId and searchDetailId (and stop processing)
  • If has record, status complete, but expired:
    • delete existing dynamoDB data (all tables)
    • continue
  • If no record
    • continue
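
One way to lay out the branches above is a function that returns the list of actions to take. The action names are hypothetical labels, and the record is assumed to carry "status" and "expired" fields. Two points are assumptions beyond the notes: any status other than error or complete is treated as still processing, and an expired record that is not complete raises so the uncovered case is visible.

```python
from typing import Optional


def plan_for_existing_main(record: Optional[dict],
                           has_sort_fields: bool) -> list[str]:
    """Map an existing SearchResultMain record to the actions listed above."""
    if record is None:
        return ["continue"]
    status = record["status"]
    if record["expired"]:
        if status == "complete":
            return ["deleteExistingData", "continue"]
        # The notes only cover expired records that are complete.
        raise ValueError(f"expired record with status {status!r} is not covered")
    if status == "error":
        actions = ["updateSortResultToError"] if has_sort_fields else []
        return actions + ["returnIdsWithError"]
    if status == "complete":
        actions = ["sendToSortResult"] if has_sort_fields else []
        return actions + ["returnIds"]
    # Assumption: anything else is in progress; later stages finish the work.
    return ["returnIds"]
```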

Insert into SearchResultMain

  • do this as quickly as possible after checking has existing SearchResult, to protect against double processing (multiple same requests received in quick succession) - or use idempotence techniques to protect against double processing
  • add to requiredData any parentDataIdentifierFields not already included
  • insert record into SearchResultMain table
  • status = "processingFilter"
  • childSearchIds = childSearchIds found above
  • (idempotence) what happens if an existing record already exists? Any error returned by Dynamo? If yes then we could detect this and stop (should mean another invocation already started processing for this record?)
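
On the idempotence question: DynamoDB supports conditional writes, and a put with a condition such as attribute_not_exists on the key fails with ConditionalCheckFailedException when the item already exists. The sketch below imitates that behaviour with an in-memory stand-in so the control flow can be seen without AWS. InMemoryTable, put_if_absent and insert_search_result_main are hypothetical names, not part of the existing code.

```python
class ConditionalCheckFailed(Exception):
    """Stand-in for DynamoDB's ConditionalCheckFailedException."""


class InMemoryTable:
    """Tiny stand-in for the SearchResultMain table."""

    def __init__(self) -> None:
        self.items: dict[str, dict] = {}

    def put_if_absent(self, key: str, item: dict) -> None:
        # Mirrors a put guarded by attribute_not_exists on the key.
        if key in self.items:
            raise ConditionalCheckFailed(key)
        self.items[key] = item


def insert_search_result_main(table: InMemoryTable, search_result_id: str,
                              child_search_ids: list[str]) -> bool:
    """Return True if this invocation created the record, False if it lost the race."""
    item = {"status": "processingFilter", "childSearchIds": list(child_search_ids)}
    try:
        table.put_if_absent(search_result_id, item)
    except ConditionalCheckFailed:
        # Another invocation already started processing this record.
        return False
    return True
```

An invocation that gets False back can stop and return the existing ids, since the first writer owns the processing.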

send request to process child SearchResult

  • lib function createChildSearchResultRequest
  • for each childSearchIds, use Config > searchType Config to standardize creating child requests requiredData:
    • searchType comes from Config .. > childSearchResults > childSearchType
    • Config table childSearchResults .. > "requiredData" builds the child's requiredData
    • Config table childSearchResults .. > "passOnRequestProperties" builds the child's requestProperties
  • Direct invoke RequestSearchResult lambda, synchronous
  • might be at another service/deployment, so need to build from searchResultsServiceName
  • wait for the searchResultId and searchDetailId result
  • (not yet) We could break processing here and handle request as a serverless flow… consider additional costs of queues, double charging Lambdas might be cheaper and give cleaner code
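
A possible shape for createChildSearchResultRequest, assuming each childSearchResults Config entry is a dict with childSearchType, requiredData and passOnRequestProperties keys, and that the child's sub complex filter was already pulled out in the Check request step. The dict layout and the snake_case function name are assumptions.

```python
def create_child_search_result_request(child_config: dict, parent_request: dict,
                                       child_filter: dict) -> dict:
    """Build the request sent to the child's RequestSearchResult lambda."""
    parent_props = parent_request["requestProperties"]
    return {
        "searchType": child_config["childSearchType"],
        "requiredData": list(child_config["requiredData"]),
        # Only the properties named in Config are passed down to the child.
        "requestProperties": {
            name: parent_props[name]
            for name in child_config["passOnRequestProperties"]
        },
        "filter": child_filter,
    }
```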

Update SearchResultParents

  • for each childSearchIds add record to SearchResultParents as their sync function returns
  • want to do this as fast as possible so the record is already waiting when the child searchResult completes
  • if SearchResultParents record already exist not need to create/update
  • if already exists and status complete I think not an issue because when parent complex filter completes it will trigger a check
  • If child's RequestSearchResult returns error
    • set this SearchResultMain to status error and save reason (child SearchResult failed etc..)
    • If sortFields set also send message to SortResult service to update its Main record to status “error”, with reason

send SNS message to process complex filter

  • Send complex filter depending on searchType Config
  • send the message to the SNS topic for the complex filter service
  • send SNS message to complexFilterServiceName
  • do this after saving SearchResultMain and SearchResultParents so sure that data is saved before we get complex filter response

ProcessComplexFilterComplete

  • old code: ProcessComplexFilterComplete/Main.handler
  • subscribes to all ComplexFilter’s message out SNS topics
  • receives the filterMainId and filterType of complex filter from ComplexFilter message
  • use the {searchType=filterType}_{filterMainId} to query SearchResultMain for matching partition keys, might have multiple records
  • Only process SearchResultMain's that SearchResultMainStatus = "processingFilter"
  • Copy results from ComplexFilter to SearchResultData, set SearchResultDataStatus to "processingRequiredData" for all SearchResultData records (pagination?)
  • for all SearchResultMain.childSearchIds get record from SearchResultParents and check status
  • if any child not yet complete set this SearchResultMainStatus to "waitingChildSearchResult"
  • else If all searchResultParentStatus = "complete":
    • if Config has hookBeforeFindDataServiceName:
      • set this SearchResultMainStatus to = "waitingHookBeforeFindData"
      • message to hookBeforeFindData at that serviceName
    • else:
      • set this SearchResultMainStatus to = "processingData" (conditionalExpression: SearchResultMainStatus = "processingFilter")
      • if conditionalExpression not pass maybe just return/stop?
      • message to ProcessFindRequiredData
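
The branching at the end of this handler can be summarised as a function from the child statuses and the hook setting to a next status and a message target. This is only a sketch: the function name and the tuple return are assumptions, and the conditional write on the status update is left out.

```python
from typing import Optional


def next_step_after_filter(child_statuses: list[str],
                           hook_before_service_name: Optional[str] = None
                           ) -> tuple[str, Optional[str]]:
    """Return (next status, message target) once the complex filter completes."""
    if any(status != "complete" for status in child_statuses):
        # Wait; the child's completion will restart processing for the parent.
        return "waitingChildSearchResult", None
    if hook_before_service_name:
        return "waitingHookBeforeFindData", hook_before_service_name
    return "processingData", "ProcessFindRequiredData"
```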

HookBeforeFindDataComplete

  • subscribes to all external hookBeforeFindDataServiceName complete topics
  • uses SearchResultMain identifiers in message to find SearchResultMain record
  • set this SearchResultMainStatus to = "processingData"
  • message to ProcessFindRequiredData

ProcessFindRequiredData

  • old code: ProcessFindRequiredData/Main.handler and also SearchResultHandler..FindData functions
  • needs pagination
  • receives: SearchResultMain primary key fields
  • for each SearchResultData record iterate each requiredData field
  • save into RequiredData if not already have
  • save into RequiredDataSearchResultData
  • check if RequiredData already exists, status, and expiry time, handle appropriately (similar to existing SearchResultMain above)
  • at end of function need to send message to CheckSearchResultDataComplete in case all requiredData already complete and should continue

if need to initiate finding the data

  • might be some fields can be found immediately (standardized processing) but most will need to send async requests to other services
  • if process immediately, can set SearchResultData record SearchResultDataStatus to "invalid" if unable to complete this record. Invalid records do not get returned as results (not need to send to SortResult service)
  • fields send to other services async:
  • requiredDataStatus set to "waitingExternalService"
  • (I think this will be handled by external service now) some requiredData will have additional information, or shared details, eg the list of SellOffer price combinations waiting for price results. These can be saved in other fields in requiredData
  • send request to external service:
    • may need to include details from SearchResultMain/SearchResultData/RequiredData
    • Aggregate fields like product/variant pricing will need to send Config for this searchType, at least childSearchResults searchResultsServiceName so can query ParentDataId table and then RequiredData table to get all child prices

External Service FindData examples

SellOffer FindData

  • in SellOfferManager service

fields maxPrice / minPrice / convertedMaxPrice / convertedMinPrice

  • parse selloffer complex filter json to find if shippingServiceMainId or paymentMethodMainId set, if they are set then only check for these in next steps (if no filter then test all available ship service/payment methods)
  • want to pull all ship service and payment method options available for this selloffer
  • use locationNodeId and browseQuantity to get price for all ship service and payment method combinations, use these to generate requiredData. If multiple locationNodeId set for shipTo filter then request all combinations for each locationNodeId
  • Use external sellOfferPrices service to request calculated price, that service will cache calculated prices so subsequent requests are efficient
  • Only need to send combination requests to sellOfferPrices once, so when going through requiredData loop have a variable that checks if any of the price fields are included, if yes then send out sellOfferPrices after the loop (or one time in the loop and set a switch to not send again)
  • for each combination we save a record into ...(need to create new table for storing results, try to save in a shared cache method and check before sending request to sellOfferPrices)...
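
To make the combination step concrete, the sketch below enumerates every locationNodeId, ship service and payment method combination for one browseQuantity. The function name and the keys of the returned dicts are assumptions; the real request format for sellOfferPrices is not defined in these notes.

```python
from itertools import product


def price_combinations(location_node_ids: list[str],
                       shipping_service_ids: list[str],
                       payment_method_ids: list[str],
                       browse_quantity: int) -> list[dict]:
    """One price request per (location, ship service, payment method) triple."""
    return [
        {
            "locationNodeId": location,
            "shippingServiceMainId": shipping,
            "paymentMethodMainId": payment,
            "browseQuantity": browse_quantity,
        }
        for location, shipping, payment in product(
            location_node_ids, shipping_service_ids, payment_method_ids
        )
    ]
```

Building the whole list once, after the requiredData loop, matches the note about sending combination requests to sellOfferPrices only one time.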

ProcessSellOfferPriceResult

  • subscribes to SellOfferPrices and receives all complete price calculations
  • searches table in SellOfferManager that is managing FindData requests for prices
  • checks if all sellOfferPrices records are complete for any FindData requests, if complete process the field and send message to be received by ProcessFindDataResult

fields: quantityAvailable / productId

  • straight from existing SellOffer data

Product FindData

  • Only starts this process once child selloffer SearchResult Data records are all complete, and selloffer data linked to the product via ParentDataId
  • if no selloffer records are found set this SearchResultData.SearchResultDataStatus to "invalid"

fields: quantityAvailable

  • go through each SellOffer search result and add up all quantityAvailable's

fields: maxPrice / minPrice / convertedMaxPrice / convertedMinPrice

  • find the max and min from all SellOffer SearchResults
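
The quantityAvailable and price fields for a product can be derived from its child SellOffer results in one pass. A minimal sketch, assuming each child result is a dict that already holds quantityAvailable, minPrice and maxPrice; the function name is hypothetical.

```python
from typing import Optional


def aggregate_sell_offers(sell_offers: list[dict]) -> Optional[dict]:
    """Sum quantities and take the price range across child SellOffer results."""
    if not sell_offers:
        # No child records: the caller marks the SearchResultData "invalid".
        return None
    return {
        "quantityAvailable": sum(o["quantityAvailable"] for o in sell_offers),
        "minPrice": min(o["minPrice"] for o in sell_offers),
        "maxPrice": max(o["maxPrice"] for o in sell_offers),
    }
```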

field: main_image

  • from MediaLinks

field: variantIds

  • find all matching variant ids, save/send as array
  • every product FindData will find all variantIds to save into ParentDataId table

field: productAttribute_XXX

  • XXX is the productAttributeLabelId searching for
  • one productAttributeLabelId can find multiple productAttribute results, save result/s as an array
  • languageId for finding suggested translation comes from languageId set in initial request or defaults to English

variantProduct FindData

  • Only starts this process once child Product SearchResultData records are all complete
  • Set as invalid if no product child data found

field: quantityAvailable

  • go through each Product search result and add up all quantityAvailable's

fields: maxPrice / minPrice / convertedMaxPrice / convertedMinPrice

  • find the max and min from all Product SearchResults

field: main_image

  • from product mediaLink

field: productAttribute

  • find all matching product attribute values for this variant, save as array

ProcessFindDataResult

  • subscribe to message_out queues for external services
  • find matching RequiredData
  • update RequiredData.requiredDataValue and requiredDataStatus = "complete" or "error"
  • find all SearchResultData records waiting for this RequiredData using RequiredDataSearchResultData records and:
    • check if SearchResultDataStatus = "processingRequiredData", if not can skip
    • invoke CheckSearchResultDataComplete for each SearchResultData record

CheckSearchResultDataComplete

  • ? old code: app/CompleteFindRequiredData/Main.js > completeSearchResultMain
  • receives one SearchResultData identifier
  • check if SearchResultDataStatus = "processingRequiredData", if not can stop processing
  • on big result sets this method of checking would be inefficient; could have another table that records which RequiredData are still waiting, delete from that table each time a record is completed, then check whether any remain to test if digging is complete
  • check any RequiredData set to "waitingExternalService", if yes can stop processing
  • update SearchResultData SearchResultDataStatus based on the requiredDataStatus values: if all "complete" then SearchResultDataStatus is "complete", if any "invalid" or "error" then "error"
  • if finished this SearchResultData, send message to CheckSearchResultMainComplete
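
The status rule above can be written as a small pure function over the list of requiredDataStatus values for one SearchResultData record. Returning None for "not finished yet" is a convention assumed for this sketch, as is the function name.

```python
from typing import Optional


def search_result_data_status(required_data_statuses: list[str]) -> Optional[str]:
    """Derive SearchResultDataStatus, or None if the record is not finished."""
    if "waitingExternalService" in required_data_statuses:
        # Still waiting on another service: stop, a later result re-triggers us.
        return None
    if any(s in ("invalid", "error") for s in required_data_statuses):
        return "error"
    if all(s == "complete" for s in required_data_statuses):
        return "complete"
    # Some status this sketch does not know about: treat as unfinished.
    return None
```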

CheckSearchResultMainComplete

  • old code: CompleteFindRequiredData/Main.handler
  • invoked whenever all requiredData for one SearchResultData record complete
  • split into a separate lambda because might be resource intensive checking large Data sets
  • check if any Data records for this Main record remain to dig data
  • could scale this by doing multiple invocations, saving in SearchResultMain where the processing is up to, ending if find any SearchResultData not complete, could leave the lastEvalRow value for future invocations to overwrite or use (because checked complete up to there already)
  • if any SearchResultData not finished, stop processing
  • if Config has hookAfterFindDataServiceName:
    • set this SearchResultMainStatus to = "waitingHookAfterFindData"
    • message to hookAfterFindData at that serviceName
  • else:
    • message to CompleteSearchResultMain

HookAfterFindDataComplete

  • subscribes to all external hookAfterFindDataServiceName complete topics
  • uses SearchResultMain identifiers in message to find SearchResultMain record
  • message to CompleteSearchResultMain

CompleteSearchResultMain

  • ? old code: app/CompleteFindRequiredData/Main.js > completeSearchResultMain
  • find SearchResultMain record
  • Send message to SortResult service, which checks for any matching SortResultMain entries that are status = "waitingForSearchResults", if it finds any that triggers the SortResult process
  • update searchResultMainStatus to "complete" (conditionExpression: searchResultMainStatus != complete)
  • if conditional not pass stop/return
  • send message to SearchResultComplete topic

ChildSearchResultMainComplete

  • subscribes to all SearchResultComplete topics for all possible childSearchTypes
  • query SearchResultParents for all parent records that have this SearchResultMain record set as child and searchResultParentStatus = "processing"
  • For each parent:
    • get searchResultMain record
    • if SearchResultMainStatus = "processingFilter", do nothing
    • if SearchResultMainStatus = "waitingChildSearchResult", set to "processingData" and invoke ProcessFindRequiredData for parent Main
    • if SearchResultMainStatus = "complete" or "processingData", that should never happen, could perhaps log a debug/warning
  • update searchResultParentStatus = "complete"
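
The per-parent handling above might be sketched as follows. The function returns the new SearchResultMainStatus for the parent, or None when nothing should change. The function name and logger name are assumptions; invoking ProcessFindRequiredData and updating searchResultParentStatus are left to the caller.

```python
import logging
from typing import Optional

logger = logging.getLogger("searchResult")


def parent_status_after_child_complete(parent_status: str) -> Optional[str]:
    """Return the parent's new status, or None if it should be left alone."""
    if parent_status == "waitingChildSearchResult":
        # Caller also invokes ProcessFindRequiredData for the parent Main.
        return "processingData"
    if parent_status in ("complete", "processingData"):
        # Should never happen according to the notes; record it for debugging.
        logger.warning("child completed while parent was %s", parent_status)
    # "processingFilter": the complex filter handler will pick up the child later.
    return None
```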