Naively, it looks like your conflict resolution is prone to livelock, where concurrent transactions for a key keep aborting each other without getting any work done. The same can happen if you run a single-node RDBMS in serializable isolation mode and read a row before writing to it. In that case you can add locking reads on rows you intend to write to later, which avoids the livelock.
Have you done any studies to quantify how bad that effect is in cockroachdb? Assuming the effect exists and I didn't just miss something silly, are there any workarounds since cockroachdb is only OCC? (Exponential backoff between retries and keeping transactions short don't count, I assume those are a given)
When transactions conflict, the priority of the losing transaction is internally ratcheted up. At some point its priority is simply higher than that of everything after it, and it succeeds.
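To make the ratcheting idea concrete, here is a minimal sketch in Python. It is a toy model, not CockroachDB's actual algorithm: the function name `attempts_until_win`, the integer priorities, and the deterministic "bump to just above the rival that beat you" rule are all assumptions made for illustration (the real ratcheting is probabilistic).

```python
def attempts_until_win(rival_priorities, start_priority=1):
    """Count the attempts a transaction needs to beat a stream of rivals.

    Hypothetical model: on each attempt the transaction meets one rival.
    It wins only if its priority is strictly higher; otherwise it loses
    and its priority is ratcheted up to just above that rival's.
    Returns (attempt_number, final_priority), or (None, final_priority)
    if the rivals run out before the transaction wins.
    """
    priority = start_priority
    for attempt, rival in enumerate(rival_priorities, start=1):
        if priority > rival:
            return attempt, priority
        # Lost the conflict: the next retry outranks the rival that won.
        priority = rival + 1
    return None, priority


# Loses to 5 (priority becomes 6), loses to 9 (becomes 10), then beats 9.
result = attempts_until_win([5, 9, 9, 2])
```

As long as new rivals start with priorities drawn from a bounded range, a transaction that keeps losing eventually outranks all of them in this model, which is the intuition behind "at some point it succeeds".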
This is a good insight; Live Lock was one of the things that kept me up at night, although my fellow contributors have allayed my concerns considerably.
In a write-heavy workload where the transactions follow a read-modify-write pattern, it does seem possible that your application could be facing live lock concerns. So, let's look at where those concerns come from in CockroachDB:
1. For write-write conflicts, live-lock concerns come from the fact that a later transaction can abort your in-progress transaction if it has a higher priority. On top of this, if your transaction ends up aborting an earlier transaction, the other transaction may retry with a potentially higher priority, and this can result in a "priority war" of sorts.
2. "Read-write" conflicts have a worse problem: if a transaction with a later timestamp reads a key before your transaction writes to it, then your transaction always aborts. In read-modify-write workloads where the read is first, this seems to be the biggest theoretical source of live-lock.
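The read-write case in point 2 can be sketched with a toy model in Python. This is only an illustration of the rule described above, not the real read timestamp cache: the class `ReadTimestampCache`, its methods, and the integer timestamps are hypothetical.

```python
class ReadTimestampCache:
    """Toy model: remembers the highest timestamp at which each key was read."""

    def __init__(self):
        self._latest_read = {}

    def record_read(self, key, ts):
        # Keep only the most recent read timestamp per key.
        self._latest_read[key] = max(ts, self._latest_read.get(key, 0))

    def can_write(self, key, ts):
        # A write at ts is rejected if some read happened at a later timestamp,
        # because committing it would change what that reader should have seen.
        return ts >= self._latest_read.get(key, 0)


cache = ReadTimestampCache()
cache.record_read("balance", ts=10)  # T1 (ts=10) reads, intending to write later
cache.record_read("balance", ts=12)  # T2 (ts=12) reads the same key first
t1_ok = cache.can_write("balance", ts=10)  # False: T1 must abort and retry
t2_ok = cache.can_write("balance", ts=12)  # True: no later reader exists

# T1 retries at ts=13, but yet another reader arrives at ts=14 before T1 writes.
cache.record_read("balance", ts=13)
cache.record_read("balance", ts=14)
t1_retry_ok = cache.can_write("balance", ts=13)  # False again
```

In a read-modify-write workload every transaction is also a reader, so under this model a steady stream of newer readers can keep pushing an older writer back, which is the theoretical live-lock described above.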
Now, as for dealing with those issues.
+ The "Write-write" issue is the smaller concern here. The ratcheting-up of priorities is probabilistic; currently, after enough retries your transaction will have a high enough priority that it will almost certainly make progress. However, we are not totally content with this: we have an issue for our 1.0 release (https://github.com/cockroachdb/cockroach/issues/5727) to do some serious investigation of this, with some alternatives already suggested. For example, one such alternative is using "lowest original timestamp wins" to settle conflicts instead of a random priority.
+ The read-write concern seems more problematic; our read timestamp cache retains minimal data due to memory concerns, and thus aborts a bit conservatively (i.e. aborts transactions that might have been able to continue). This can be dealt with by modifying your application pattern - in particular, I would suggest using our "SNAPSHOT" isolation level for many workloads. I didn't cover this in the blog post, but in short the SNAPSHOT mode allows RW conflicts to occur without aborting transactions. SNAPSHOT transactions are subject to the "write skew" anomaly, but this anomaly does not occur if the involved transactions write to a common key. This is fortunately the case for many OLTP-type workloads, meaning SNAPSHOT can be used without anomalies; this also moves conflict-detection responsibility entirely to WW conflicts, which abort less conservatively than RW and should ameliorate any live-lock problems.
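To illustrate the write-skew point, here is a small Python sketch of a snapshot-style store. It is an assumption-laden toy, not how CockroachDB is implemented: `SnapshotStore`, `Txn`, the "first committer wins" rule for WW conflicts, and the on-call example are all made up for illustration. It shows two transactions that each read both rows and write only their own row (write skew), and then the same transactions also writing a common `guard` key.

```python
class SnapshotStore:
    """Toy store: reads come from a snapshot, only WW conflicts abort."""

    def __init__(self, initial):
        self.clock = 0
        self.data = dict(initial)                   # latest committed values
        self.last_commit = {k: 0 for k in initial}  # commit timestamp per key

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.start_ts = store.clock
        self.snapshot = dict(store.data)  # reads see the state at start
        self.writes = {}

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # WW check only: abort if a key we wrote was committed after we started.
        for key in self.writes:
            if self.store.last_commit.get(key, 0) > self.start_ts:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.last_commit[key] = self.store.clock
        return True


def go_off_call(txn, me, other, touch_guard):
    # Invariant the application wants: at least one person stays on call.
    if txn.read(me) and txn.read(other):
        txn.write(me, False)
        if touch_guard:
            txn.write("guard", txn.read("guard") + 1)  # the common key


def run(touch_guard):
    store = SnapshotStore({"alice": True, "bob": True, "guard": 0})
    t1, t2 = store.begin(), store.begin()
    go_off_call(t1, "alice", "bob", touch_guard)
    go_off_call(t2, "bob", "alice", touch_guard)
    commits = (t1.commit(), t2.commit())
    someone_on_call = store.data["alice"] or store.data["bob"]
    return commits, someone_on_call


skew = run(touch_guard=False)   # both commit, invariant broken
guarded = run(touch_guard=True)  # second commit aborts, invariant holds
```

Without the common key both transactions commit and nobody is left on call. With it, the second transaction hits a WW conflict and has to retry, at which point it sees the other's write and leaves its own row alone.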