2021-02-25 - Graph database ideas

From Izara Wiki
Revision as of 12:01, 25 February 2021 by Sven the Barbarian

Preface

Planning on moving a lot of the project's data into one or more graph databases. Primary reasons:

  • allows effective querying by relationship, and there will be a lot of this, e.g. location tree, catalogs, production attributes, following the actions performed by a user, buyer habits, etc.
  • elegant methods of versioning data: almost all data in the project will be immutable, so any data that can change needs to be versioned
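
As a rough illustration of the kind of relationship query meant here, the sketch below walks a location tree upward. It is a minimal in-memory stand-in, not a real graph database client; the vertex ids, the `locatedIn` edge label, and the function names are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample data: edges are (from_vertex, edge_label, to_vertex).
edges = [
    ("bangkok", "locatedIn", "thailand"),
    ("thailand", "locatedIn", "asia"),
    ("chiang_mai", "locatedIn", "thailand"),
]


def build_index(edges):
    """Index outgoing edges by (vertex, label) so traversal is a dict lookup."""
    out = defaultdict(list)
    for src, label, dst in edges:
        out[(src, label)].append(dst)
    return out


def ancestors(index, start, label="locatedIn"):
    """Follow `label` edges upward from `start`, returning every location reached."""
    found, stack, seen = [], [start], {start}
    while stack:
        current = stack.pop()
        # .get() avoids creating empty entries in the defaultdict.
        for parent in index.get((current, label), []):
            if parent not in seen:
                seen.add(parent)
                found.append(parent)
                stack.append(parent)
    return found


index = build_index(edges)
print(ancestors(index, "bangkok"))  # ['thailand', 'asia']
```

In a real graph database the same question would be a single traversal query; the point is only that the query follows edges rather than joining tables.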

One huge graph, per service, or per stack?

One Project Graph containing all graphed data

Pros:

  • ready for any type of relationship query we want to perform

Cons:

  • all graph databases seem to have some form of scaling issue, e.g. Neo4j does not index edges, so if one vertex has many edges the modelling starts to fall apart
  • Neptune specifically seems to struggle with large datasets and has a limit on the number of edge labels/properties, which would be difficult to work within for an ever more complex, huge project graph
  • lack of compartmentalization, making it easier for mistakes to affect other areas of the project
  • single point of failure

One graph database per service

Pros:

  • compartmentalized
  • smaller data sets

Cons:

  • lose a lot of the gains of a graph database, e.g. a rich relationship network

Multi service graphs

Planning on using this middle ground, where a logical set of data is stored in one graph that is shared by multiple services, often one graph per stack.

Plan on keeping the structures standard (e.g. a user object is the same type of object across all graphs, and common edge labels are kept the same) so that these graphs can easily be patched together into a larger graph if needed for relationship querying.
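
A minimal sketch of why standard structures make patching graphs together straightforward: if vertex ids, vertex types, and edge labels agree across graphs, combining them is a plain union. The graph layout (dicts with `vertices` and `edges`), the ids such as `user:1`, and the labels `placed` and `viewed` are hypothetical and only serve the example.

```python
def merge_graphs(*graphs):
    """Union vertices by id and edges by (from, label, to).

    Assumes every graph uses the same vertex ids and types for shared
    objects; a type mismatch on a shared id is treated as an error.
    """
    merged = {"vertices": {}, "edges": set()}
    for graph in graphs:
        for vertex_id, vertex in graph["vertices"].items():
            existing = merged["vertices"].get(vertex_id)
            if existing is not None and existing["type"] != vertex["type"]:
                raise ValueError(f"conflicting types for vertex {vertex_id}")
            merged["vertices"].setdefault(vertex_id, vertex)
        merged["edges"] |= set(graph["edges"])
    return merged


# Two hypothetical per-stack graphs that share the same user vertex.
orders_graph = {
    "vertices": {"user:1": {"type": "user"}, "order:9": {"type": "order"}},
    "edges": [("user:1", "placed", "order:9")],
}
catalog_graph = {
    "vertices": {"user:1": {"type": "user"}, "product:5": {"type": "product"}},
    "edges": [("user:1", "viewed", "product:5")],
}

combined = merge_graphs(orders_graph, catalog_graph)
```

After the merge, `user:1` is a single vertex with both a `placed` and a `viewed` edge, so a relationship query can cross what were separate graphs.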

Service that manages a graph

When a graph is shared across multiple services (most graphs will likely have that possibility, so probably all of them), deploy the graph as a separate service. Consider making this service standard/generic, with config settings for each implementation.

This service could have settings such as which data elements can and cannot be edited, with a Lambda that checks this, securing immutable data.
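
One possible shape for such a check, sketched as a Lambda-style handler. The config layout, the event fields (`vertexType`, `changes`), and the property names are assumptions for illustration, not a settled design; unknown vertex types are treated as fully immutable here, which is also just one possible policy.

```python
# Hypothetical per-implementation config: which properties of each
# vertex type may be edited. Anything not listed is immutable.
GRAPH_CONFIG = {
    "user": {"editable": {"displayName", "email"}},
    "order": {"editable": set()},  # fully immutable
}


def rejected_properties(config, vertex_type, changes):
    """Return the sorted property names in `changes` that may not be edited."""
    rules = config.get(vertex_type)
    if rules is None:
        # Assumption: vertex types missing from the config are immutable.
        return sorted(changes)
    return sorted(set(changes) - rules["editable"])


def handler(event, context=None):
    """Lambda-style entry point that allows or rejects a proposed update."""
    rejected = rejected_properties(
        GRAPH_CONFIG, event["vertexType"], event["changes"]
    )
    return {"allowed": not rejected, "rejected": rejected}


result = handler({"vertexType": "order", "changes": {"total": 5}})
```

Keeping the rules in config rather than code is what would let the same generic service be deployed for each graph.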

Could standardize tasks like updating versioned data of different types; see 2021-02-22 - Maintaining change history using graph database.
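
A sketch of what a standardized versioned update might look like, assuming an append-only model in which each change adds a new version vertex and nothing is overwritten. This is not necessarily the approach described on the change history page; the class, the `hasVersion` and `previousVersion` edge labels, and the id format are all hypothetical.

```python
from types import MappingProxyType


class VersionedStore:
    """In-memory stand-in for a graph that versions every change."""

    def __init__(self):
        self.versions = {}  # version_id -> read-only properties
        self.edges = []     # (from_id, label, to_id), append only
        self.current = {}   # object_id -> id of its latest version

    def update(self, object_id, properties):
        """Record a new version of `object_id` and link it to the previous one."""
        number = 1 + sum(
            1 for src, label, _ in self.edges
            if src == object_id and label == "hasVersion"
        )
        version_id = f"{object_id}#v{number}"
        # MappingProxyType gives a read-only view, so stored versions stay immutable.
        self.versions[version_id] = MappingProxyType(dict(properties))
        self.edges.append((object_id, "hasVersion", version_id))
        previous = self.current.get(object_id)
        if previous is not None:
            self.edges.append((version_id, "previousVersion", previous))
        self.current[object_id] = version_id
        return version_id

    def history(self, object_id):
        """Return version ids for `object_id`, newest first."""
        chain = []
        version_id = self.current.get(object_id)
        while version_id is not None:
            chain.append(version_id)
            version_id = next(
                (dst for src, label, dst in self.edges
                 if src == version_id and label == "previousVersion"),
                None,
            )
        return chain


store = VersionedStore()
v1 = store.update("product:5", {"price": 10})
v2 = store.update("product:5", {"price": 12})
```

Because the same `update` call works for any object type, the graph service could expose it once instead of each service implementing its own versioning.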