Service - Search Results

From Izara Wiki

Latest revision as of 05:35, 3 November 2022

Overview

Service that handles search result requests, feeding work to Service - Search Result (handlers). It takes the results from the Complex Filter service, which are identifier ids only, and adds the additional fields of information needed for display to the end consumer, either directly if no sorting is required or via the Sort Result service.

It also functions as a cache of Search Result data: when a subsequent matching request comes in, it does not need to be passed on to Complex Filter and the required data does not need to be found again.
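
The cache behaviour can be sketched as a lookup against SearchResultMain before any work is sent on. This is only an illustration under assumptions: the function name lookupCachedResult, the in-memory Map standing in for the DynamoDB table and the "|" separator in the Map key are all hypothetical.

```javascript
// Hypothetical sketch: a Map stands in for the SearchResultMain table,
// keyed by "{searchResultId}|{searchDetailId}" (the "|" separator is an assumption).
function lookupCachedResult(searchResultMainTable, searchResultId, searchDetailId, now = Date.now()) {
  const record = searchResultMainTable.get(`${searchResultId}|${searchDetailId}`);
  // No record, or record past its expiryTime: the caller must pass the request on.
  if (!record || record.expiryTime <= now) {
    return null;
  }
  // A matching, unexpired record means Complex Filter and FindData can be skipped.
  return record;
}
```

A null return means the request is treated as new; anything else is served from the existing result set.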

Repository

https://bitbucket.org/izara-core-search/izara-core-search-search-results/src/master/

DynamoDB tables

Standard Config Table Per Service

Configuration tags

{
	configKey: "SearchType",
	configTag: "xx", // {eg: sellOffer/Product/VariantProduct etc..}
	configValue: {
		complexFilterServiceName: "xx", // {service name of complex filter that handles this type}
		complexFilterType: "xx", // optional, if complexFilterType is different to searchType
		dataHandlerServiceName: "xx", // {service name used to create function/queue names for finding data results, often a Manager service for the object}
		requireRequestProperties: {
			{requiredData fieldName}: [ // dependent on what requiredData is requested
				"xx",
				// .. {properties that must be received in the request for this searchType to process correctly}
			]
		},
		parentDataIdentifierFields: [
			"xx", // these fields are always found for any request for this SearchType, and are added to the ParentDataId table for each found data record
		],
		childSearchResults: {
			{childSearchType}: { // searchType of child searchResult, matches filterType for building child complex filter
				searchResultsServiceName: "xx", // allows for results to be saved by different deployed services
				requiredData: { // requiredData we request from the child searchType depending on what requiredData fields are set in the parent's request
					{parent requiredData fieldName}: [
						"xx", // list of child requiredData fields to add
						// ..
					],
					// ..
				},
				passOnRequestProperties: [ // parent requestProperties to pass on to the child searchResult request, if not received in parent request will error
					"xx", 
					// .. 
				]
			},
			// ..
		},
		hookBeforeFindDataServiceName: "xx", // if set will send async request to external service before finding data
		hookAfterFindDataServiceName: "xx", // if set will send async request to external service after finding data
	}
},
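
One way to read the childSearchResults settings is as a recipe for building the child request from the parent request. The sketch below assumes a request shape of { requiredData: { fieldName: true }, requestProperties: { .. } }; that shape and the name buildChildSearchRequest are hypothetical and not taken from the service code.

```javascript
// Hypothetical helper: derives the child searchResult request from one
// childSearchResults entry of the SearchType config and the parent request.
function buildChildSearchRequest(childConfig, parentRequest) {
  // Collect the child requiredData fields for every requiredData field the parent asked for.
  const childRequiredData = {};
  for (const parentField of Object.keys(parentRequest.requiredData)) {
    for (const childField of (childConfig.requiredData || {})[parentField] || []) {
      childRequiredData[childField] = true;
    }
  }

  // Pass on the listed parent requestProperties; a missing one is an error.
  const childRequestProperties = {};
  for (const name of childConfig.passOnRequestProperties || []) {
    if (!(name in parentRequest.requestProperties)) {
      throw new Error(`Missing requestProperty to pass on: ${name}`);
    }
    childRequestProperties[name] = parentRequest.requestProperties[name];
  }

  return {
    searchResultsServiceName: childConfig.searchResultsServiceName,
    requiredData: childRequiredData,
    requestProperties: childRequestProperties,
  };
}
```

Parent requiredData fields with no entry in the child config simply add nothing to the child request.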

SearchResultMain

{
	searchResultId: "xx", // main element for one set of search results
	searchDetailId: "xx", // hashes of other values that affect result data
	searchResultMainStatus: "processing",
	requiredData: {}, // same as request
	childSearchIds: {
		{childSearchType}: [{array of childSearchIds}], // array of childSearchIds for this parent searchResultMain, split out by childSearchType
		// ..
	},
	createTime: currentTime.getTime(),
	expiryTime: expiryTime.getTime(),
}
  • partition key: searchResultId
  • sort key: searchDetailId
  • searchResultId: for standard single filter searches is {searchType}_{filterMainId}
  • searchDetailId: for standard single filter searches is {requestPropertiesHash}_{requiredDataHash}
  • searchResultId: for combined search results is {searchType}_{searchParamsHash}
  • searchDetailId: for combined search results is {requiredDataHash}
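
The key formats above could be built as follows. The page does not say how the hashes are calculated, so the SHA-256 digest over a key-sorted serialisation is an assumption, as are the helper names stableStringify, hashOf, singleFilterKeys and combinedSearchKeys.

```javascript
const crypto = require("crypto");

// Assumption: serialise with sorted object keys so property order does not change the hash.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const parts = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`);
    return `{${parts.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Assumption: hashes are SHA-256 hex digests.
function hashOf(value) {
  return crypto.createHash("sha256").update(stableStringify(value)).digest("hex");
}

// Standard single filter search: {searchType}_{filterMainId} and {requestPropertiesHash}_{requiredDataHash}
function singleFilterKeys(searchType, filterMainId, requestProperties, requiredData) {
  return {
    searchResultId: `${searchType}_${filterMainId}`,
    searchDetailId: `${hashOf(requestProperties)}_${hashOf(requiredData)}`,
  };
}

// Combined search results: {searchType}_{searchParamsHash} and {requiredDataHash}
function combinedSearchKeys(searchType, searchParams, requiredData) {
  return {
    searchResultId: `${searchType}_${hashOf(searchParams)}`,
    searchDetailId: hashOf(requiredData),
  };
}
```

Sorting keys before hashing matters for the cache: two requests with the same requestProperties in a different order should land on the same SearchResultMain record.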

SearchResultParents

{
	childSearchId: "xx", // {searchResultId}_{searchDetailId}
	parentSearchId: "xx", // {searchResultId}_{searchDetailId}
	searchResultParentStatus: "processing",
}
  • partition key: childSearchId
  • sort key: parentSearchId
  • when a SearchResultMain is finished it checks this table to see if any parents need to be processed
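
The check made when a SearchResultMain finishes might look like the sketch below. An array stands in for the SearchResultParents table, and treating "processing" as "this parent link has not been handled yet" is an assumption; findWaitingParents is a hypothetical name.

```javascript
// Hypothetical sketch: returns the parentSearchIds still waiting on the child
// search that has just finished. `records` stands in for SearchResultParents items.
function findWaitingParents(records, finishedChildSearchId) {
  return records
    .filter(
      (record) =>
        record.childSearchId === finishedChildSearchId &&
        // Assumption: "processing" marks a parent link not yet handled.
        record.searchResultParentStatus === "processing"
    )
    .map((record) => record.parentSearchId);
}
```

Against the real table this would be a query on the childSearchId partition key rather than a filter over all records.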

SearchResultData

{
	searchId: "xx", // {searchResultId}_{searchDetailId}
	dataId: "xx", // id of element (sellOfferId/productId/variantProduct(type and id))
	// reconsider this field, does not scale: childData: {}, // only used when childId set in SearchResultMain, saves requiredData results for all matching child records
	searchResultDataStatus: "processing",
}
  • partition key: searchId
  • sort key: dataId

RequiredData

{
	requiredDataId: "xx", // {searchType}_{dataId}_{requestPropertiesHash} OR maybe hash of object of these values?
	fieldName: "xx",
	requiredDataValue: "xx", // value found for this reqdata field
	requiredDataStatus: "xx", // waitingExternalData | complete | invalid
	createTime: currentTime.getTime(),
	expiryTime: expiryTime.getTime(),
}
  • partition key: requiredDataId
  • sort key: fieldName
  • one record is one result we can calculate from external services, this can be shared by multiple SearchResultData records
  • requiredDataId includes identifiers that can affect the data's value
  • if the keys are too long for DynamoDB we could hash an object containing all required fields, and add the value of each field into the item separately if needed
  • the primary key is used when a result comes back from an external service, so that message must include all values needed to make requiredDataId
  • can probably add fields during processing to store extra data for processing this SearchResultData record
  • if we want to find all records for one SearchResultData we could query by just the partition key, but it might return additional fields used by other searches with different requiredData

RequiredDataSearchResultData

{
	requiredDataId_fieldName: "xx", // {requiredDataId}_{fieldName}
	searchId: "xx"
}
  • partition key: requiredDataId_fieldName
  • sort key: searchId
  • can extract dataId from requiredDataId_fieldName to find primary key for SearchResultData
  • used to create links between RequiredData results and all SearchResultData that need it
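
A sketch of building the link key and extracting dataId from it again. It assumes searchType, requestPropertiesHash and fieldName never contain an underscore, so only dataId may; the helper names are hypothetical. If that assumption does not hold, the delimiter-based format cannot be parsed unambiguously.

```javascript
// {searchType}_{dataId}_{requestPropertiesHash}
function makeRequiredDataId(searchType, dataId, requestPropertiesHash) {
  return `${searchType}_${dataId}_${requestPropertiesHash}`;
}

// {requiredDataId}_{fieldName}
function makeRequiredDataLinkKey(requiredDataId, fieldName) {
  return `${requiredDataId}_${fieldName}`;
}

// Assumption: the first part is searchType and the last two parts are
// requestPropertiesHash and fieldName; everything in between is dataId.
function extractDataId(requiredDataIdFieldName) {
  const parts = requiredDataIdFieldName.split("_");
  if (parts.length < 4) {
    throw new Error(`Unexpected key format: ${requiredDataIdFieldName}`);
  }
  return parts.slice(1, -2).join("_");
}
```

Together with searchId from the sort key, the extracted dataId gives the primary key of the matching SearchResultData record.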

ParentDataId

{
	parentId: "xx", // {parent searchType}_{parent dataId}_{child searchResultId (searchType and filterMainId)}_{requestPropertiesHash}
	childDataId: "xx" // {child dataId}
}
  • partition key: parentId, identifies one parent dataId matched to a specific child SearchResult. requiredData is excluded because it is unimportant in this context; only the child searchType/filterMainId/requestPropertiesHash are needed to identify one unique set of matching parent dataIds
  • sort key: childDataId
  • child filterMainId is required because we will have different ranges of childIds stored depending on what filter was used to find them

requestProperties

These are properties added to a Search Result request that affect the data found; they may be required or optional. Because these properties can affect the value of data, they must be added to the SearchResultMain identifiers to differentiate batches of results.

requireRequestProperties

Some requiredData fields will need additional properties to be received in the initial request.

  • example: variant/product/sellOffer pricing fields will require locationNodeId/s and browseQuantity
  • example: all translations require language/s
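
A request could be validated against the requireRequestProperties setting of the SearchType config roughly as below. The function name missingRequestProperties and the request shape (requiredData keyed by fieldName) are assumptions made for the sketch.

```javascript
// Hypothetical sketch: lists the request properties that the requested
// requiredData fields need but that were not received in the request.
function missingRequestProperties(requireRequestProperties, requiredData, requestProperties) {
  const missing = new Set();
  for (const fieldName of Object.keys(requiredData)) {
    // Fields without an entry in the config need no extra properties.
    for (const propertyName of requireRequestProperties[fieldName] || []) {
      if (!(propertyName in requestProperties)) {
        missing.add(propertyName);
      }
    }
  }
  return [...missing];
}
```

An empty array means the request carries everything the requested fields need; only fields that were actually requested are checked.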

Moving child SearchResult data to parent

  • eg: product searchType has sellOffer child searchType data
  • eg: variantProduct searchType has product child searchType data
  • the parent must wait for the child FindData to complete before it can process the parent's FindData
  • the child SearchResult might already have completed before the parent request was received (ie by a different request), need to account for this
  • (old design: can still be used if we have to, but it would be best to separate results so they can scale, eg if we want to show details about child products, that could pass as a separate Search/Sort set of results including pagination) aggregate multiple child data into the parent's data values, but also keep a record of all children in the parent's data record
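
The wait described above can be sketched as a readiness check run both when the parent request arrives and each time a child finishes, which also covers a child that was completed earlier by a different request. The "complete" status value for SearchResultMain and the getChildStatus callback are assumptions; the page only shows "processing".

```javascript
// Hypothetical sketch: true when every child search listed in the parent's
// childSearchIds has finished, so the parent's FindData can start.
function allChildrenComplete(parentSearchResultMain, getChildStatus) {
  // childSearchIds is split out by childSearchType, so flatten all arrays into one list.
  const childSearchIds = Object.values(parentSearchResultMain.childSearchIds || {}).flat();
  // Assumption: a finished SearchResultMain has status "complete".
  return childSearchIds.every((childSearchId) => getChildStatus(childSearchId) === "complete");
}
```

A parent with no child searches passes the check immediately.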

Multiple Search Results services

  • design to allow for different implementations, eg each catalog can choose which Search Results service it uses
  • could achieve this by having a setting in Catalog handlers or Catalog Manager that records which Search Result service the Catalog uses

browseQuantity for calculating prices

  • when browsing, a user can define the browseQuantity they want to buy (default 1); this is injected into the price calculation to obtain the available price for each sellOffer
  • this is different to filters, eg availableQuantity, which determine which sellOffers are returned (we might have a method/hook that validates or updates the complex filter so these settings don't conflict, but it is not needed)
  • added to requestProperties in the request
  • if we request prices for a sellOffer in a quantity it cannot handle, eg browseQuantity is 100 but the sellOffer only has 20 remaining, the pricing function should probably return an error. We could then remove such sellOffers, or mark them as not having enough quantity available, so they might still exist in search results and maybe sort results but show that the quantity is not available.
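
The "mark rather than remove" option could look like the sketch below. The pricing rule (unitPrice multiplied by quantity) and the field names availableQuantity, unitPrice, price and quantityAvailable are placeholders for illustration, not the real pricing function.

```javascript
// Hypothetical pricing function: errors when the sellOffer cannot supply the quantity.
function priceForQuantity(sellOffer, browseQuantity) {
  if (browseQuantity > sellOffer.availableQuantity) {
    throw new Error("not enough quantity available");
  }
  return sellOffer.unitPrice * browseQuantity; // placeholder pricing rule
}

// Keeps every sellOffer in the results, marking those that cannot meet browseQuantity.
function addPrices(sellOffers, browseQuantity = 1) {
  return sellOffers.map((sellOffer) => {
    try {
      return { ...sellOffer, price: priceForQuantity(sellOffer, browseQuantity), quantityAvailable: true };
    } catch (error) {
      return { ...sellOffer, price: null, quantityAvailable: false };
    }
  });
}
```

Marked sellOffers stay in the result set, so a later Sort Result step can decide whether to hide them or show them as unavailable.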

Notes

  • Consider having requests find all possibly needed data, even if the current request does not use it, so that subsequent requests can feed off the same processed data, ie standardize requiredData at request level to use the cache more often
  • there might be a special case where we find all results with an empty filter, eg for a child Search Result request that is filtered only by the parent results and does not filter child results in any way. That would create a record for every item in the child request and include links to all parent dataIds, and that data set could be used by any other request that matches searchType/requestPropertiesHash/requiredDataHash. Maybe work that in in the future as a separate check when processing parent requests so there is no need to send the child request, or intentionally maintain that result set. We might be able to find other shared efficient results as well, however having cached results in RequiredData will also reduce a lot of work and may mitigate the need for this
  • could maybe optimise querying data even further by copying the final data for each RequiredData record into the SearchResultData record, so there is no need to query the RequiredData table, but this may not be needed as most consumer queries will point to SortResult

Working documents

Working_documents - Search Results