2021-02-22 - Maintaining change history using graph database

From Izara Wiki

Considerations

  • We want to maintain a record of every change an object undergoes, including when it was made and by whom
  • This gives the ability to roll back (or to achieve the effect of rolling back without losing the history of changes), eg if one user acts destructively

Using graphs

Each set of changes can be recorded as a vertex in a graph, connected to the underlying object's vertex. When a new change is made, a new vertex is created and marked as the current version.
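A minimal in-memory sketch of this idea, not tied to any graph engine. The class names, the `is_current` flag, and `roll_back_to` are assumptions made for illustration; in a real graph the `versions` list would be edges from the object's vertex to its version vertices.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass
class VersionVertex:
    """One recorded set of changes (hypothetical shape)."""
    changes: Dict[str, Any]
    changed_by: str
    changed_at: datetime
    is_current: bool = True


@dataclass
class ObjectVertex:
    """The underlying object; `versions` stands in for its outgoing edges."""
    object_id: str
    versions: List[VersionVertex] = field(default_factory=list)

    def record_change(self, changes: Dict[str, Any], changed_by: str,
                      changed_at: Optional[datetime] = None) -> VersionVertex:
        # Unmark the previous current version but keep it in the history.
        for version in self.versions:
            version.is_current = False
        new_version = VersionVertex(
            changes, changed_by, changed_at or datetime.now(timezone.utc)
        )
        self.versions.append(new_version)
        return new_version

    def current(self) -> Optional[VersionVertex]:
        return next((v for v in self.versions if v.is_current), None)

    def roll_back_to(self, index: int, changed_by: str) -> VersionVertex:
        # Re-apply an older version as a new version, so no history is lost.
        old = self.versions[index]
        return self.record_change(dict(old.changes), changed_by)
```

Rolling back here never deletes anything: the old values are recorded again as the newest version, which gives the effect of a rollback while keeping the destructive change visible in the history.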

Situation 1: Boolean setting

If the setting being changed on an object is a single boolean that is part of a relationship with another vertex, eg a relationship that can be enabled/disabled, we could avoid creating a whole new vertex for the settings and instead set the boolean on the edge that links to the object's properties.

We still want to record that the change happened, when, and by whom. This could be accomplished by adding a new edge, from the object's vertex to the user's vertex, each time the boolean is changed. This edge has a label stating that the user enabled/disabled the object, and a timestamp of when.

Example: A child categoryTreeNode can be enabled/disabled. The edge linking the parent and child categoryTreeNodes has a boolean property recording whether it is disabled, and the child categoryTreeNode adds a new edge to a user vertex each time this setting changes.
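The categoryTreeNode example could be sketched roughly as follows, using plain Python objects rather than a real graph engine. The class names, the `disabled` property name, and the "enabled"/"disabled" audit labels are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ParentChildEdge:
    """Edge linking a parent categoryTreeNode to a child categoryTreeNode."""
    parent_id: str
    child_id: str
    disabled: bool = False  # the boolean setting lives on the edge


@dataclass
class AuditEdge:
    """Edge from the child vertex to the user who changed the setting."""
    child_id: str
    user_id: str
    label: str  # "enabled" or "disabled" (hypothetical label names)
    timestamp: datetime


def set_disabled(edge: ParentChildEdge, audit_edges: List[AuditEdge],
                 disabled: bool, user_id: str,
                 when: Optional[datetime] = None) -> None:
    # Update the setting in place on the relationship edge...
    edge.disabled = disabled
    # ...and append a new audit edge so the change is never lost.
    audit_edges.append(AuditEdge(
        edge.child_id, user_id,
        "disabled" if disabled else "enabled",
        when or datetime.now(timezone.utc),
    ))


def enabled_children(edges: List[ParentChildEdge], parent_id: str) -> List[str]:
    # Only the parent-child edges are needed to answer this query.
    return [e.child_id for e in edges
            if e.parent_id == parent_id and not e.disabled]
```

Note that the audit edges only ever grow, while the boolean on the parent-child edge is overwritten, so the current state and the history are read from different places.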

This has the benefit (I believe) of optimizing the common query that finds which child categoryTreeNodes are enabled, because the setting is right there on the edge linking them to their parent categoryTreeNode.

An alternative to the edge having an enabled property is to have two edge labels, one for enabled and one for disabled, and to remove the existing edge and add one with the other label when changing the setting. This has the downside of deleting edges, which is more likely to introduce destructive bugs, but it has the upside of perhaps faster querying, since only the edge label needs to be checked rather than both the edge label and the edge property. For this reason I will try using this method.
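A rough sketch of the two-label alternative, with edges held as `(parent_id, label, child_id)` tuples in a set. The label names `hasEnabledChild` and `hasDisabledChild` are hypothetical, and the add-then-remove ordering is just one cautious choice; a real graph engine would presumably perform both steps in a single transaction.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str, str]  # (parent_id, label, child_id)

ENABLED = "hasEnabledChild"    # hypothetical label name
DISABLED = "hasDisabledChild"  # hypothetical label name


def set_child_enabled(edges: Set[Edge], parent_id: str,
                      child_id: str, enabled: bool) -> None:
    new_label, old_label = (ENABLED, DISABLED) if enabled else (DISABLED, ENABLED)
    # Add the replacement edge first, then drop the old one, so the
    # parent-child link is never absent between the two steps.
    edges.add((parent_id, new_label, child_id))
    edges.discard((parent_id, old_label, child_id))


def enabled_children(edges: Set[Edge], parent_id: str) -> List[str]:
    # Only the label is inspected; no edge property lookup is needed.
    return sorted(child for (parent, label, child) in edges
                  if parent == parent_id and label == ENABLED)
```

The who/when record would still be kept separately (for example as the user edges described above), because swapping labels by itself leaves no trace of the previous state.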

Situation 2: Editable settings

Editable settings are often not part of a relationship; they belong to one vertex. In this case, the settings are stored in a separate Setting vertex rather than on the object's main vertex. Each time the settings change, a new Setting vertex is created.

The edge that links each Setting vertex to the object has from and to timestamp properties marking the period during which those settings were the active settings.

The Setting vertex has an edge connecting to the user who made the settings.

Queries could filter on the timestamps to return only the most recent/current settings.

I am considering an additional vertex(?) label that marks the current setting. This would improve query performance, since most queries will only want the current settings. When creating a new setting, this label will need to be removed from the previous one.
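The Setting vertex structure could be sketched like this in plain Python. All names are assumptions: `active_from`/`active_to` stand in for the from and to timestamp properties, `made_by` stands in for the edge to the user vertex, and the `is_current` flag stands in for the proposed "current" label.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass
class SettingVertex:
    values: Dict[str, Any]
    made_by: str              # stands in for the edge to the user vertex
    is_current: bool = False  # stands in for the proposed "current" label


@dataclass
class SettingEdge:
    setting: SettingVertex
    active_from: datetime
    active_to: Optional[datetime] = None  # None while still active


@dataclass
class ObjectVertex:
    object_id: str
    setting_edges: List[SettingEdge] = field(default_factory=list)

    def change_settings(self, values: Dict[str, Any], user_id: str,
                        when: datetime) -> SettingVertex:
        # Close the active edge and remove the "current" marker from it.
        for edge in self.setting_edges:
            if edge.active_to is None:
                edge.active_to = when
                edge.setting.is_current = False
        setting = SettingVertex(dict(values), user_id, is_current=True)
        self.setting_edges.append(SettingEdge(setting, active_from=when))
        return setting

    def current_settings(self) -> Optional[SettingVertex]:
        # Fast path: rely on the marker instead of comparing timestamps.
        return next((e.setting for e in self.setting_edges
                     if e.setting.is_current), None)

    def settings_at(self, when: datetime) -> Optional[SettingVertex]:
        # Timestamp filter: the settings that were active at `when`.
        for edge in self.setting_edges:
            if edge.active_from <= when and (
                    edge.active_to is None or when < edge.active_to):
                return edge.setting
        return None
```

The marker and the timestamps hold overlapping information, so the marker is purely an optimization; the cost is that every change has to update the previous Setting as well as create the new one.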

Alternatives

Logs could be used, but they would need to be standardized so that old changes can be re-applied programmatically. In some cases, such as infrequently changed project-level settings, logs may be sufficient.

Other database methods. I understand some database engines have an automated way of recording changes and allowing efficient rollbacks. Reasons for using graphs are:

  • by using a standard structure, which is also used in other areas of the project, we can easily link multiple graphs into wider graphs for finding extended relationships
  • other relationships and data are already being recorded in graphs, so a single engine can also be used to maintain version history