2021-02-25 - Graph database ideas
Preface
Planning to move a lot of the project's data into one or more graph databases. Primary reasons:
- allows effective querying by relationship, and there will be a lot of this, eg location tree, catalogs, production attributes, following the actions performed by a user, buyer habits, etc
- elegant methods of versioning data: almost all data in the project will be immutable, so anything that can change needs to be versioned
One huge graph, one per service, or one per stack?
One Project Graph containing all graphed data
Pros:
- ready for any type of relationship query we want to perform
Cons:
- all graph databases seem to have some form of scaling issue, eg Neo4j does not index edges, so if one vertex has many edges the modelling starts to fall apart
- Neptune specifically seems to struggle with large datasets and has a limit on the number of edge labels/properties, which is difficult to reconcile with an ever more complex, huge project graph
- lack of compartmentalization, so mistakes can more easily affect other areas of the project
- single point of failure
One graph database per service
Pros:
- compartmentalized
- smaller data sets
Cons:
- lose a lot of the gains of a graph database, eg a rich relationship network
Multi service graphs
Planning on using this middle ground, where a logical set of data is stored in one graph that is shared by multiple services, often one graph per stack.
Plan to keep the structures standard (eg a user object is the same type of object across all graphs, and common edge labels are kept the same) so these graphs can easily be patched together into a larger graph if needed for relationship querying.
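A minimal in-memory sketch of the patching idea, not tied to any particular graph database. It assumes vertex ids are globally unique across graphs (eg "user:1") and that edge labels are shared; the Graph class, patch_together, neighbours and the VIEWED/PLACED labels are all hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Edge:
    source: str
    label: str
    target: str


@dataclass
class Graph:
    # vertex id -> properties; ids are assumed unique across ALL graphs
    vertices: Dict[str, dict] = field(default_factory=dict)
    edges: Set[Edge] = field(default_factory=set)

    def add_vertex(self, vertex_id: str, props: Optional[dict] = None) -> None:
        self.vertices.setdefault(vertex_id, {}).update(props or {})

    def add_edge(self, source: str, label: str, target: str) -> None:
        self.edges.add(Edge(source, label, target))


def patch_together(*graphs: Graph) -> Graph:
    """Union several graphs; shared vertex ids collapse into one vertex."""
    merged = Graph()
    for graph in graphs:
        for vertex_id, props in graph.vertices.items():
            merged.add_vertex(vertex_id, props)
        merged.edges |= graph.edges
    return merged


def neighbours(graph: Graph, vertex_id: str, label: str) -> List[str]:
    """Targets reachable from vertex_id over edges with the given label."""
    return sorted(
        e.target for e in graph.edges
        if e.source == vertex_id and e.label == label
    )


# Two per-stack graphs that both know about the same user vertex.
catalog = Graph()
catalog.add_vertex("user:1", {"type": "user"})
catalog.add_vertex("product:9", {"type": "product"})
catalog.add_edge("user:1", "VIEWED", "product:9")

orders = Graph()
orders.add_vertex("user:1", {"type": "user"})
orders.add_vertex("order:5", {"type": "order"})
orders.add_edge("user:1", "PLACED", "order:5")

# The patched graph can answer queries that span both stacks.
project = patch_together(catalog, orders)
```

Because both graphs use the same id and type for the user, the merged graph has a single user vertex carrying edges from both stacks. This only works if the conventions are enforced up front.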
Service that manages a graph
When a graph is shared across multiple services (and most will likely have that possibility, so probably all graphs), deploy it as a separate service. Consider making this service standard/generic, with config settings for each implementation.
The settings could include which data elements can be edited and which cannot, with a Lambda that checks this so that immutable data stays protected.
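A rough sketch of what such a check could look like as a Lambda-style handler. The config layout (EDITABLE_FIELDS), the event shape and the field names are assumptions made for illustration, not a settled design; unknown vertex types are treated as fully immutable.

```python
# Hypothetical per-graph config: properties of each vertex type that may be edited.
EDITABLE_FIELDS = {
    "user": {"display_name", "email"},
    "order": set(),  # orders are treated as fully immutable here
}


def check_update(vertex_type, changes):
    """Return the sorted list of fields in `changes` that must not be edited."""
    allowed = EDITABLE_FIELDS.get(vertex_type, set())  # unknown type: nothing editable
    return sorted(set(changes) - allowed)


def handler(event, context):
    """Lambda-style entry point; the event shape is an assumption."""
    rejected = check_update(event["vertex_type"], event["changes"])
    return {"allowed": not rejected, "rejected_fields": rejected}


ok = handler({"vertex_type": "user", "changes": {"email": "a@example.com"}}, None)
denied = handler({"vertex_type": "order", "changes": {"status": "paid"}}, None)
```

Keeping the rules in config rather than in code means each graph implementation can reuse the same generic service and only swap the settings.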
Could standardize tasks like updating versioned data of different types (see 2021-02-22 - Maintaining change history using graph database).
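One way a standardized "update versioned data" task might look, as a generic append-only sketch: each update creates a new version instead of changing an existing one. This is an illustration under assumed names (Version, VersionedStore), kept in memory rather than in a graph, and is not necessarily the approach described in the 2021-02-22 entry.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Any, Dict, List, Mapping


@dataclass(frozen=True)
class Version:
    number: int
    data: Mapping[str, Any]  # read-only view, so a stored version cannot be altered


class VersionedStore:
    """Hypothetical generic store: the same update path for every entity type."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Version]] = {}

    def update(self, entity_id: str, changes: Mapping[str, Any]) -> Version:
        history = self._versions.setdefault(entity_id, [])
        # Start from a copy of the latest version, then apply the changes.
        data = dict(history[-1].data) if history else {}
        data.update(changes)
        version = Version(len(history) + 1, MappingProxyType(data))
        history.append(version)  # append only; earlier versions are never touched
        return version

    def current(self, entity_id: str) -> Version:
        return self._versions[entity_id][-1]

    def history(self, entity_id: str) -> List[Version]:
        return list(self._versions[entity_id])


store = VersionedStore()
store.update("user:1", {"display_name": "Ann", "email": "ann@example.com"})
store.update("user:1", {"email": "ann@new.example.com"})
```

In a graph, each Version would likely become its own vertex linked to the entity and to its predecessor, but the calling code could stay the same.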