Hi Martin, I now had time to read more closely. Thank you for citing us! In our ...

Hi Martin,

I now had time to read more closely. Thank you for citing us!

In our paper and original approach/algo (iirc), CRDT updates are applied from the bottom of the dag to the heads (older to recent). This is also what you propose although we don't discuss how this helps wrt BFT. I think it is a very good point to highlight that hash-graph based CRDTs (or Merkle-CRDTs as well call them) can provide this property when processed "in order".

The caveat is that, in practice we found this quite impractical: when receiving an update with unknown descendants, you'd have to choose whether to hold on it to apply it at a later point when the missing sub-dag is received (which may never happen), or to drop it and rely on re-transmission at a later point. The first opens the door to abuse by a bad actor because you spend memory holding on to un-applied updates. The latter causes additional work as updates could pile up over an missing one, thus incurring re-transmission costs for the missing and all the later elements.

This also means that peers pubsub systems should probably decide whether to re-broadcast an update based on whether is valid or not per the CRDT operation it contains, which can have bad side-effects: a network missing an update due to a network issue may block the broadcasting of later any later updates even if they are correct, thus worsening the delivery for everyone and forcing the network to issue more re-transmissions.

And as a result, a peer should also decide whether to issue/broadcast updates based on whether the previous update has at least been received by a non Byzantine replica, as otherwise updates built on top will not be accepted by the network until re-transmission. That makes another potential bottleneck.

In our implementation, we found more practical to process updates immediately as they are received, and then process descendants until we reach the "known" part of the DAG. This means every update can be broadcast, gossiped around, and will be potentially applied without doing any waiting for parents, and occasional loss of an update does not block the processing of new updates before re-transmission. If an update has descendants that do not exist then it can be considered a DAG with a different "root" and does not have many consequences in terms of convergence . Note that here we are talking about DAGs with depth of 100k+ items, where sometimes there are 200 heads and processing every update may take a few seconds. We need to avoid blocking a replica as much as possible and get as much work done as possible asap.

I think some CRDTs can get away with this (in our case, CRDT-add-only-sets, using the hashes as IDs) and be byzantine-failure resistant (things would converge). In the paper you mention some examples where CRDT-rules can be abused more easily, so I'm guessing it is more difficult other CRDT types. Ensuring that IDs are unique is one of the main advantages of using hash trees.

In general, I think an attacker can usually find ways to screw up with a non-permissioned CRDT system without breaking convergence (i.e by submitting many valid updates). Your approach makes however a very good point wrt to misbehaving nodes which are not necessarily malicious.