
The TL;DR on how:

1. Causal consistency, with the ability to create collections (tables) with strict consistency. Uses vector clocks rather than wall-clock timestamps for ordering DB operations.

2. Streams to propagate DB changes from one geo location (node) to another with guaranteed ordering and reliable delivery.

3. Generalized operational CRDTs to make all DB operations Associative, Commutative, Idempotent and Distributed (the new ACID).

4. One data model, multiple interfaces: store and query your data as a key/value DB, document DB, graph DB, stream DB, or geolocation DB (query by lat/long/altitude).

5. Automatic GraphQL and REST API generation for your schema.

6. Multi-master: query and update any of our 25 global locations and get one consistent view of data globally.

7. Access via CLI, REST, GraphQL (built-in server), with client drivers for Python and JavaScript today (Java, Go, Ruby in the works).
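The vector-clock ordering from point 1 can be sketched in a few lines of Python. This is the textbook comparison rule only, not Macrometa's implementation; the PoP names are made up for illustration:

```python
# Illustrative vector clocks: each node keeps one counter per node.
# Event A causally precedes event B iff A's clock is <= B's clock in
# every component and strictly less in at least one component.

def happened_before(a, b):
    """True if clock a causally precedes clock b."""
    keys = set(a) | set(b)
    at_most = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    strictly = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return at_most and strictly

def concurrent(a, b):
    """Neither clock precedes the other: the writes are concurrent."""
    return not happened_before(a, b) and not happened_before(b, a)

w1 = {"pop_eu": 1, "pop_us": 0}   # write seen only by the EU PoP
w2 = {"pop_eu": 1, "pop_us": 1}   # a later write that observed w1
w3 = {"pop_eu": 0, "pop_us": 1}   # an independent write at the US PoP

assert happened_before(w1, w2)    # w1 -> w2: causally ordered
assert concurrent(w1, w3)         # w1 || w3: needs CRDT-style merging
```

Unlike wall-clock timestamps, this ordering never lies about causality: it either proves one write saw the other, or reports them as concurrent so a merge rule can decide.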



Translation:

There are two "modes" of operation of this product: "regular" and "SPOT collections".

In regular operation mode, the system will lose concurrent writes and cannot enforce ACID (the classical one) transactions. Your data is, however, highly available, eventually consistent, and copied around the globe for faster access.

With SPOT collections you get ACID transactions but lose the distribution advantages. That is, the system's properties are akin to what you get with a Postgres/MySQL cluster.

It is an interesting product, but IMHO you should only go there if your users are truly all over the world and your system (or that part of it) requires few or no ACID transactions (think twice, because this is a big one).

PS: This seems to be a "productified" version of the gun [1] database (not that this is a bad thing!). Am I right?

[1] https://github.com/amark/gun


Yes - you hit the nail on the head - that is indeed how SPOT and regular collections work.

We think we provide the flexibility of both: very strict ACID behavior using SPOT collections, along with the strong eventual consistency model for everything else. It's important to understand that, as a developer, you don't need to deal with any of it - it's handled for you once you have marked a collection as a SPOT.

This is not related to gun in any way. We love what gun does and its approach, and they do some very clever things (including work on the end device). We sit as a backend database-as-a-service in 25 global PoPs and process DB operations (and code expressed as functions or containers) at the closest location (by latency or geophysical location) to the user or device using an app or API running against us.

We wrote our own operational CRDT engine and streams to solve this.


This is buzzword salad, not a coherent technical summary.


OK, take 2:

1. We use causal consistency, with vector clocks to establish the causal order of operations, instead of timestamps and the Network Time Protocol, which are unreliable over WAN distances.

2. We share database state changes/updates from each Point of Presence (PoP) to the others in the cluster using asynchronous streams that use a pull model instead of push to maintain message ordering over the network. This has the added benefit of letting us know exactly how current each PoP is with changes.

3. We don't use quorum between PoPs to establish consistency - there's a white paper on our site that shows the how and why of coordination-free replication. The gist is that we have developed a generalized operational CRDT model that gives us associative, commutative, idempotent convergence of changes between PoPs in the cluster without needing quorum.

4. The DB is multi-master - you can access and change data at any PoP. It's also multi-model and lets you query your data across multiple interfaces such as key/value, document (JSON), graph, etc.

5. The DB automatically creates GraphQL and REST APIs for your schema, taking away the complexity and effort of a lot of boilerplate development on the backend.

6. The DB is available as a managed service in 25 PoPs today - you can request an account and we will give you one. We will be generally available with a free tier in April, and you can sign up online and self-administer your cluster.

7. You can access the DB via a CLI or GUI, or write code that accesses it via REST, GraphQL, or native language drivers in JavaScript and Python today (we are working on other languages with a view to releasing them over the next few months).
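To illustrate point 3: the simplest example of a merge that is associative, commutative, and idempotent is a state-based G-Counter. Macrometa describes an operation-based CRDT design, so this is only a toy sketch of why such merges let replicas converge without quorum:

```python
# Toy state-based CRDT (a G-Counter): merge is component-wise max,
# which is associative, commutative, and idempotent -- so PoPs can
# exchange state in any order, any number of times, and still
# converge without coordination or quorum. PoP names are made up.

def merge(a, b):
    """Join two counter states by taking the per-node maximum."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}

def value(state):
    """The counter's value is the sum over all nodes."""
    return sum(state.values())

eu = {"pop_eu": 3}   # 3 increments applied at the EU PoP
us = {"pop_us": 2}   # 2 increments applied at the US PoP
ap = {"pop_ap": 1}   # 1 increment applied at an APAC PoP

assert merge(eu, us) == merge(us, eu)                   # commutative
assert merge(eu, merge(eu, us)) == merge(eu, us)        # idempotent
assert merge(merge(eu, us), ap) == merge(eu, merge(us, ap))  # associative
assert value(merge(merge(eu, us), ap)) == 6             # converged total
```

Because re-delivering or reordering a merge can never change the outcome, the streams in point 2 only need eventual delivery, not distributed consensus.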

hope this helps...


The REST interface is good, but could we add business validation before mutating data? Because the bulk of the work a backend does is this business logic. How could we do this?


Hello and greetings the_arun.

The short answer is yes. Macrometa integrates functions-as-a-service (FaaS), which can be hooked into the database and triggered by events on a stream or a data collection.

So you can, for example, do the following: expose a RESTful or GraphQL API (including deeply nested queries in GraphQL) for one or more collections. When mutating, attach a validation function to the collection as a trigger that is called before the mutation is applied to the DB. You can also have a trigger that calls a function after the mutation is complete.
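A rough sketch of what such a pre-mutation validation trigger could look like. The function name, event shape, and registration mechanism here are hypothetical, not Macrometa's actual FaaS API:

```python
# Hypothetical pre-mutation validation trigger. The event shape
# ({"new": <document>}) and the convention that raising rejects the
# write are illustrative assumptions, not Macrometa's real API.

def validate_order(event):
    """Called before a mutation on an 'orders' collection is applied."""
    doc = event["new"]  # the document the client is trying to write
    if doc.get("quantity", 0) <= 0:
        # Raising here rejects the mutation; it never reaches the DB.
        raise ValueError("quantity must be a positive integer")
    if "customer_id" not in doc:
        raise ValueError("customer_id is required")
    return doc  # the returned document is what actually gets written
```

The same shape works for a post-mutation trigger; it would simply receive the committed document instead of vetoing the write.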

One can also do this on streams, with functions being triggered on a specific topic.

Lastly, there is full support for running containers as well, and you can use the endpoints exposed by the container as a trigger.

Oh, and one more thing - the DB is real-time. It will notify clients of updates to collections automatically (like Firebase).

Hope this helps..


Thanks for the details. This sounds very similar to Amazon's DynamoDB. Are there features that make Macrometa better than DynamoDB?


It shares some feature overlap with DynamoDB (key/value and document DB interfaces). Where we differentiate: global replication across all our 25 global PoPs (50 by end of 2019); integrated GraphQL generator (REST as well); real-time - the DB will notify clients of changes to data, i.e. no need to poll; tightly integrated with streams and pub/sub; run functions and containers as triggers or stored procedures on the DB; geo query - query by lat/long/height; Elasticsearch integration (July 2019). There's more - we will announce in April.


I would say it’s fairly easy to read. Easier to read than the article. Don’t shoot the messenger.


Hi sagichmal, why do you think it is buzzword salad? Are the problems mentioned in the post with regard to current databases and CRDTs not real?


That's quite a harsh comment without anything substantial. It's really unfair to the parent comment who attempted to help the HN reader crowd with a summary.


I, for one, appreciated this summary.



