Revision as of 13:22, 29 October 2022

Overview

Orchestrates the import of objects and data into the project.

Repository

https://bitbucket.org/izara-core-import-export-data/izara-core-import-data-import-data/src/master/

DynamoDB tables

Standard Config Table Per Service

Configuration tags

{
	configKey: "objectType",
	configTag: "xx", // {objectType, eg: sellOffer/Product/VariantProduct etc.}
	configValue: {
		createObjectServiceName: "xx", // {name of the service that handles this objectType}
		createLinkServiceNames: {
			"xx": "yy", // key is the name of the link to objectType, value is the serviceName
		},
	},
},
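
As a minimal sketch of how these configuration tags might be read, the lookup below finds the service names for an objectType. The function names, the sample item, and the service names "ProductManager" and "VariantManager" are assumptions made for illustration; they do not come from the repository.

```javascript
// Hypothetical config item following the shape above; service names are invented.
const productConfig = {
	configKey: "objectType",
	configTag: "product",
	configValue: {
		createObjectServiceName: "ProductManager",
		createLinkServiceNames: {
			variant: "VariantManager",
		},
	},
};

// Find the config item for one objectType, or throw if none is configured.
function findObjectTypeConfig(configItems, objectType) {
	const item = configItems.find(
		(c) => c.configKey === "objectType" && c.configTag === objectType
	);
	if (!item) {
		throw new Error(`No objectType config found for "${objectType}"`);
	}
	return item;
}

// Name of the service that creates objects of this type.
function getCreateObjectServiceName(configItems, objectType) {
	return findObjectTypeConfig(configItems, objectType).configValue
		.createObjectServiceName;
}

// Name of the service that creates the named link from this objectType.
function getCreateLinkServiceName(configItems, objectType, linkName) {
	const names = findObjectTypeConfig(configItems, objectType).configValue
		.createLinkServiceNames;
	if (!(linkName in names)) {
		throw new Error(`No link "${linkName}" configured for "${objectType}"`);
	}
	return names[linkName];
}
```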

ImportBatchMain

{
	importBatchId: "xx", // random uuid
	userId: "xx", // submitted by userId
	startTime: currentTime.getTime(),
	batchConfig: {}, // same as request
	status: "xx", // "processingRawRecords" (NOT YET) | "processingObjects" | "error" | "complete"
}

partition key: importBatchId
sort key: {none}
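
A minimal sketch of building an ImportBatchMain item is shown below. The function name `buildImportBatchMain` is hypothetical, and the initial status of "processingObjects" is an assumption based on "processingRawRecords" being marked NOT YET.

```javascript
const { randomUUID } = require("crypto");

// Build a new ImportBatchMain item for a submitted batch.
// Assumption: a batch starts in "processingObjects" while raw record handling is NOT YET implemented.
function buildImportBatchMain(userId, batchConfig, currentTime = new Date()) {
	return {
		importBatchId: randomUUID(), // random uuid
		userId, // submitted by userId
		startTime: currentTime.getTime(),
		batchConfig, // same as request
		status: "processingObjects",
	};
}
```

Passing `currentTime` as a parameter keeps the function easy to test with a fixed clock.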

ImportBatchErrors

{
	importBatchId: "xx",
	errorId: "xx", // random uuid
	error: "xx", 
}

partition key: importBatchId
sort key: {none}

RawRecord

NOT YET: may move into CSV processing, as each format might have its own way of handling per-line/record formats.

A raw copy, split out into fields, of one submitted record. One record may have multiple objects in its fields.

{
	importBatchId: "xx",
	rawRecordId: "xx", // random uuid
	fields: {}, // key is the name of the field
	recordNumber: ##, // eg the line number of the record
	status: "xx",
	errorsFound: {},
}

partition key: importBatchId
sort key: rawRecordId

RawRecordAwaitingProcess

NOT YET: may move into CSV processing, as each format might have its own way of handling per-line/record formats.

A list of raw records waiting to be saved into PendingObjectMain, so that they can be handled asynchronously and the next step of the process can be triggered when all are complete.

{
	importBatchId: "xx",
	rawRecordId: "xx",
}

partition key: importBatchId
sort key: rawRecordId

PendingObjectMain

One item per object that needs to be created.

{
	importBatchId: "xx",
	objectId: "xx", // importBatchId_random uuid
	objectType: "xx", // eg variant|product|sellOffer|sellOfferPrice|sellOfferPlan|...
	fields: {}, // key is the name of the field
	rawRecordId: "xx",
	status: "xx",
	errorsFound: {},
}

partition key: importBatchId
sort key: objectId

PendingObjectReference

Creates a link between a submitted referenceId and the saved object, so that the object can be found when other objects reference it.

{
	importBatchId: "xx",
	referenceId: "xx", // objectType_{feed supplied referenceId}
	objectId: "xx", // objectId of the PendingObjectMain item being referenced
}

partition key: importBatchId
sort key: referenceId
When creating, maybe throw an error if an item already exists with a different objectId.
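
The conflict check suggested above could be sketched as follows. An in-memory `Map` stands in for the PendingObjectReference table, and the function names are hypothetical; in DynamoDB the same check would more likely be done with a conditional write.

```javascript
// referenceId format from the item above: objectType_{feed supplied referenceId}
function buildReferenceId(objectType, feedReferenceId) {
	return `${objectType}_${feedReferenceId}`;
}

// Save a reference, throwing if the same referenceId already points at a different objectId.
// "store" is a Map used here as a stand-in for the PendingObjectReference table.
function registerReference(store, importBatchId, objectType, feedReferenceId, objectId) {
	const referenceId = buildReferenceId(objectType, feedReferenceId);
	const key = `${importBatchId}#${referenceId}`; // partition key + sort key
	const existing = store.get(key);
	if (existing && existing.objectId !== objectId) {
		throw new Error(
			`Reference ${referenceId} already points to objectId ${existing.objectId}`
		);
	}
	const item = { importBatchId, referenceId, objectId };
	store.set(key, item);
	return item;
}
```

Registering the same reference twice with the same objectId is treated as harmless here, which is an assumption rather than something the design states.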

PendingObjectAwaitingProcess

List of PendingObjectMain items waiting to be sent out to the external service's create function, so that the next step of the process can be triggered when all are complete.

{
	importBatchId: "xx",
	pendingObjectId: "xx", // {objectType}_{objectId}
}

partition key: importBatchId
sort key: pendingObjectId

PendingObjectProcessing

List of PendingObjects waiting to complete processing, used (with PendingLinkProcessing) to see whether ImportBatchMain is complete.

{
	importBatchId: "xx",
	pendingObjectId: "xx", // {objectType}_{objectId}
}

partition key: importBatchId
sort key: pendingObjectId

PendingLink

One item per link between objects.

{
	importBatchId: "xx",
	pendingLinkId: "xx", // {objectId}_{referenceId}
}

partition key: importBatchId
sort key: pendingLinkId
Currently intended for objects that can be created independently, with the link made afterwards, but it could perhaps also be used for cases where one object must be created before another can be.

PendingLinkAwaitingProcess

List of PendingLinks waiting to have their reference validated and their awaitingSteps stored.

{
	importBatchId: "xx",
	pendingLinkId: "xx", // {objectId}_{referenceId}
}

partition key: importBatchId
sort key: pendingLinkId

PendingLinkProcessing

List of PendingLinks waiting to complete processing, used (with PendingObjectProcessing) to see whether ImportBatchMain is complete.

{
	importBatchId: "xx",
	pendingLinkId: "xx", // {objectId}_{referenceId}
}

partition key: importBatchId
sort key: pendingLinkId

Process

1. NOT YET: Go through RawRecordAwaitingProcess and create items in PendingLink for each link between objects, in PendingObjectMain for each object found, and in PendingObjectReference for each reference found (also in PendingLinkAwaitingProcess and PendingObjectAwaitingProcess).
2. Go through PendingLinkAwaitingProcess to check that each reference is valid and create awaitingMultipleSteps for the object and the referenced object. Save into PendingLinkProcessing.
3. If any errors were found prior to this step, stop processing and mark the feed with status "error".
4. Send the PendingObjectAwaitingProcess items to the external services. Save into PendingObjectProcessing.
5. As objects are created, a Lambda is triggered that sets the status of the PendingObject, removes the PendingObjectProcessing item, and checks whether any awaitingMultipleSteps exist for that object; if so, it processes the links. It then checks whether ImportBatchMain is complete.
6. If awaitingMultipleSteps exist, send to the external service to create the link.
7. A Lambda subscribes to the create link flow, removes the PendingLinkProcessing item, and checks whether ImportBatchMain is complete.
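
The completion check used in the last steps can be sketched as below, assuming a batch is complete once no PendingObjectProcessing and no PendingLinkProcessing items remain. In-memory `Set`s stand in for the two tables, and all function names are hypothetical.

```javascript
// One tracker per importBatchId; the Sets stand in for the two *Processing tables.
function createBatchTracker(pendingObjectIds, pendingLinkIds) {
	return {
		pendingObjectProcessing: new Set(pendingObjectIds),
		pendingLinkProcessing: new Set(pendingLinkIds),
	};
}

// Assumption: the batch is complete when both lists are empty.
function isBatchComplete(tracker) {
	return (
		tracker.pendingObjectProcessing.size === 0 &&
		tracker.pendingLinkProcessing.size === 0
	);
}

// Called when an external service reports that an object was created.
function onObjectCreated(tracker, pendingObjectId) {
	tracker.pendingObjectProcessing.delete(pendingObjectId);
	return isBatchComplete(tracker);
}

// Called when an external service reports that a link was created.
function onLinkCreated(tracker, pendingLinkId) {
	tracker.pendingLinkProcessing.delete(pendingLinkId);
	return isBatchComplete(tracker);
}
```

Each handler returns whether the batch is now complete, so the caller can decide when to set the ImportBatchMain status to "complete".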

Object hierarchy and field schema

Some fields will be required, some optional.
Some fields may have system defaults.
Perhaps users can set up default templates (do later if it has value).
The schema will need to state the identifier fields for each object: if these are set in the feed, Import Data knows the record points to an existing child/parent object; if empty, a new object needs to be created.
Perhaps each objectType states its child objects, as it is more likely to be aware of these than of its parent objects.
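
To make the identifier rule concrete, here is a minimal sketch of how a schema might drive the decision between pointing to an existing object and creating a new one. The schema shape (`identifierFields`, `requiredFields`, `defaults`), the field names, and the function names are all assumptions for illustration; the actual schema format is not yet decided.

```javascript
// Hypothetical schema for one objectType; field names are invented.
const productSchema = {
	objectType: "product",
	identifierFields: ["productId"],
	requiredFields: ["name"],
	defaults: { status: "draft" },
};

// An identifier counts as set when the feed supplied a non-empty value for it.
function hasIdentifier(schema, fields) {
	return schema.identifierFields.some(
		(name) => fields[name] !== undefined && fields[name] !== null && fields[name] !== ""
	);
}

// Decide whether the feed points to an existing object or a new one must be created.
// For new objects, apply system defaults and record any missing required fields.
function prepareObject(schema, fields) {
	if (hasIdentifier(schema, fields)) {
		return { action: "useExisting", fields: { ...fields }, errorsFound: {} };
	}
	const merged = { ...schema.defaults, ...fields };
	const errorsFound = {};
	for (const name of schema.requiredFields) {
		if (merged[name] === undefined || merged[name] === "") {
			errorsFound[name] = "required field missing";
		}
	}
	return { action: "createNew", fields: merged, errorsFound };
}
```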

Where to store/set schema

Considering having an external service deliver this to Import Data during Initial Setup, as seed data injected directly into the Import Data Config DynamoDB table.

Working documents

Working_documents - Import Data

Service - Import Data: Difference between revisions
