We're building an open source database like this. It's document-oriented and relies on a transaction log core (currently using Postgres, but it's pluggable) that feeds into a query subsystem (currently layered on top of Elasticsearch, but also pluggable).
The transaction log encourages small logical "patches" (set a field, increment a number, replace a substring, move an array element, etc.) that are applied in sequence but can be disentangled by clients to generate a consistent UI, and also used to resolve conflicts between multiple distributed writers. You can also follow the log through the gRPC and HTTP APIs, and you can register "watches" on queries that cause matching changes to trigger an event.
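To make the "patches applied in sequence" idea concrete, here is a minimal sketch (the op names and patch shape are assumptions for illustration, not the actual wire format): replaying a log of small logical patches rebuilds the current document state.

```python
# Minimal sketch: a transaction log of small logical patches,
# replayed in sequence to rebuild the current document state.
# Op names ("set", "inc", "insert") are illustrative assumptions.

def apply_patch(doc, patch):
    """Apply one logical patch to a document (a plain dict)."""
    op, field = patch["op"], patch["field"]
    if op == "set":
        doc[field] = patch["value"]
    elif op == "inc":
        doc[field] = doc.get(field, 0) + patch["value"]
    elif op == "insert":  # insert a value into an array field
        doc.setdefault(field, []).insert(patch["index"], patch["value"])
    else:
        raise ValueError(f"unknown op: {op}")
    return doc

log = [
    {"op": "set", "field": "title", "value": "Hello"},
    {"op": "inc", "field": "views", "value": 1},
    {"op": "insert", "field": "tags", "index": 0, "value": "db"},
]

doc = {}
for patch in log:
    apply_patch(doc, patch)
# doc == {"title": "Hello", "views": 1, "tags": ["db"]}
```

Because each patch is small and logical, clients following the log can replay or reorder them to keep a UI consistent, and a merge procedure can reason about conflicts at the level of individual fields rather than whole documents.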
While the transaction log is the underlying data model, we also maintain a consistent view of the current version so that you can use it as a document database with classical CRUD operations. So on the surface it's a lot like Firebase or CouchDB, except you get things like joins and schemas.
Drop me an email (see profile) and I can send you some links.
In our case, we wanted something more compact, so a query looks more like GraphQL (it's technically a superset of JSON, but it usually doesn't look like JSON). Joins, for example, are just nested declarations that list which attributes on the joined collections to fetch. Here's a query that shows many of the features:
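A minimal sketch of the shape (the syntax here is hypothetical, only to illustrate how joins appear as nested declarations listing which attributes to fetch; the real grammar differs):

```text
book[author.name == "A. Writer" && published > 2015] {
  title,
  author { name, country },   // join: nested declaration naming
  publisher { name }          // the attributes to fetch
}
```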
This looks very similar to what I am doing. The difference appears to be that in delta the available operations are a subset of what your system supports: delta only allows the set operation on scalar/string fields and the push operation on vectors. Also, delete operations are placed so that the merge algorithm automatically knows whether it needs to consult past versions to compute the current version of a resource, and the full history of a resource can be compacted in a distributed way.
Not sure what you mean by subset. In our case, a single transaction contains one or more document operations (create, delete, etc.), one of which is a "patch" operation that applies a fine-grained transformation. The transformations use an extended version of JSONPath in order to be able to target deep tree nodes as well as apply transformations to multiple fields (e.g. authors[0].publications[*].title). Operations such as set, increment etc. are specific to the data type of the value being transformed and will fail if the data type does not support the transformation function.
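A rough sketch of that wildcard targeting, assuming a tiny interpreter over plain dicts (the path splitting, helper names, and "set must hit a string" rule are all illustrative assumptions, not the real API):

```python
import re

def match(node, parts):
    """Yield (container, key) pairs for every leaf a path like
    'authors[0].publications[*].title' resolves to."""
    key = parts[0]
    if key == "*":                     # wildcard: fan out over a list
        keys = range(len(node))
    elif key.isdigit():                # numeric index into a list
        keys = [int(key)]
    else:                              # plain field name
        keys = [key]
    for k in keys:
        if len(parts) == 1:
            yield node, k
        else:
            yield from match(node[k], parts[1:])

def patch_set_string(doc, path, value):
    """A type-specific 'set': fails if any target is not a string."""
    parts = [p for p in re.split(r"\.|\[|\]\.?", path) if p]
    for container, key in list(match(doc, parts)):
        if not isinstance(container[key], str):
            raise TypeError(f"set-string on non-string at {key!r}")
        container[key] = value

doc = {"authors": [{"publications": [{"title": "a"}, {"title": "b"}]}]}
patch_set_string(doc, "authors[0].publications[*].title", "Untitled")
# every matched title is now "Untitled"
```

The type check mirrors the behavior described above: transformations are specific to the value's data type, and the patch fails rather than silently coercing.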
lobster nice, sorry for my bad English, it's not my native language :). In delta, each message can simply be appended after its past versions (a binary format based on FlatBuffers, so no parsing is required) in the same memory or disk region, which I've called a superposition (a superposition can represent any resource); you can see an image here: https://github.com/nebtex/delta/blob/master/docs/version-lin... Each time you append a message, the tables in the new message are linked to their immediate past versions if they exist (tables are like nested messages in protobuf), but if a deletion message for that table is found, the link is not created. When the program tries to find a field in a table, it looks in the latest message first and then walks back through past versions until it finds something. I believe this should be fast given how caches work in modern computer architectures (still not tested). A superposition can be compacted into a single message to free space, and the compaction can run in parallel if you have a lot of messages; for example, it's possible to maintain all the mutations of the db in a distributed log, and if people need to recreate the latest state or any past state, that should be a fast operation since all the available nodes can be used. I need to work on better docs for sure; if someone has recommendations, that would be really nice, hehe.
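If I understand the scheme, it could be sketched like this (semantics assumed from the description above; class and function names are made up for illustration): each appended message links to its immediate past version, a deletion severs the link, lookups walk newest-to-oldest, and compaction folds the chain into one message.

```python
# Sketch of delta's version chain as described: newest-first lookup,
# deletion breaking the link to history, and compaction of a chain
# into a single message. All names here are illustrative assumptions.

class Message:
    def __init__(self, fields, prev=None, deletes=False):
        self.fields = fields
        # A deletion message does not link to past versions, so
        # lookups stop here instead of consulting history.
        self.prev = None if deletes else prev

def lookup(msg, field):
    """Check the latest message first, then walk back through versions."""
    while msg is not None:
        if field in msg.fields:
            return msg.fields[field]
        msg = msg.prev
    raise KeyError(field)

def compact(msg):
    """Fold a whole version chain into one message to free space."""
    chain = []
    while msg is not None:
        chain.append(msg)
        msg = msg.prev
    merged = {}
    for m in reversed(chain):   # oldest first, so newer fields win
        merged.update(m.fields)
    return Message(merged)

v1 = Message({"name": "delta", "stars": 1})
v2 = Message({"stars": 2}, prev=v1)
# lookup(v2, "name") walks back to v1; compact(v2) yields one message
```

The newest-first walk is also why the cache argument is plausible: recent messages sit together at the end of the region, so most lookups touch only the freshest data.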