Naively it looks like your conflict resolution is prone to livelock, where concu...

knz42 · on May 5, 2016

When transactions conflict, the priority of the losing transaction is internally ratcheted up. At some point it's just higher than everything after it and it succeeds.
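To make the ratcheting idea concrete, here is a toy simulation. It is only a sketch under stated assumptions, not CockroachDB's actual rule: the priority range, the tie-break, and the "adopt the winner's priority on retry" step are all hypothetical choices made for illustration.

```python
import random

MAX_PRIORITY = 1000  # hypothetical priority range, chosen only for this sketch


def attempts_until_commit(rng):
    """Return the priority used on each attempt, ending with the winning one."""
    priority = rng.randrange(MAX_PRIORITY)
    history = [priority]
    while True:
        # Each attempt meets a fresh conflicting transaction with a random priority.
        rival = rng.randrange(MAX_PRIORITY)
        if priority >= rival:
            # Assumed tie-break: equal priority is enough to win.
            return history
        # Assumed ratchet rule: on losing, retry at the winner's priority.
        priority = rival
        history.append(priority)


history = attempts_until_commit(random.Random(0))
print(len(history), "attempt(s), priorities:", history)
```

In this model every loss strictly raises the priority, and the priority is bounded, so the loop always terminates: the transaction cannot lose forever.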

mrtracy · on May 5, 2016

(author)

This is a good insight; Live Lock was one of the things that kept me up at night, although my fellow contributors have allayed my concerns considerably.

In a write-heavy workload where the transactions follow a read-modify-write pattern, it does seem possible that your application could be facing live lock concerns. So, let's look at where those concerns come from in CockroachDB:

1. For write-write conflicts, live-lock concerns come from the fact that a later transaction can abort your in-progress transaction if it has a higher priority. On top of this, if your transaction ends up aborting an earlier transaction, the other transaction may retry with a potentially higher priority, and this can result in a "priority war" of sorts.

2. "Read-write" conflicts have a worse problem: if a transaction with a later timestamp reads a key before your transaction writes to it, then your transaction always aborts. In read-modify-write workloads where the read is first, this seems to be the biggest theoretical source of live-lock.
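The two rules above can be sketched in a few lines. This is a simplified model, not the real implementation: `Txn`, `write_write_loser`, and `ReadTimestampCache` are hypothetical names, timestamps are assumed to be positive integers, and the tie-breaks are my own assumptions.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    name: str
    timestamp: int  # assumed to be a positive integer in this sketch
    priority: int


def write_write_loser(holder, challenger):
    """Return the transaction that loses a write-write conflict.

    The challenger aborts the in-progress holder only if it has a strictly
    higher priority (the tie-break in favour of the holder is an assumption).
    """
    return holder if challenger.priority > holder.priority else challenger


class ReadTimestampCache:
    """Tracks the latest timestamp at which each key was read."""

    def __init__(self):
        self._latest_read = {}

    def record_read(self, key, timestamp):
        self._latest_read[key] = max(timestamp, self._latest_read.get(key, 0))

    def write_allowed(self, key, timestamp):
        # A write must land after every read already served for this key;
        # otherwise the writer has a read-write conflict.
        return timestamp > self._latest_read.get(key, 0)


cache = ReadTimestampCache()
cache.record_read("balance", 10)           # a later transaction reads first
print(cache.write_allowed("balance", 5))   # False: the earlier writer conflicts
print(cache.write_allowed("balance", 11))  # True
```

Note that the read-write rule does not look at priorities at all, which is why retrying with a higher priority does not help there.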

Now, as for dealing with those issues.

+ The "Write-write" issue is the smaller concern here. The ratcheting-up of priorities is probabilistic; currently, after enough retries your transaction will have a high enough priority that it will almost certainly make progress. However, we are not totally content with this: we have an issue for our 1.0 release (https://github.com/cockroachdb/cockroach/issues/5727) to do some serious investigation of this, with some alternatives already suggested. For example, one such alternative is using "lowest original timestamp wins" to settle conflicts instead of a random priority.

+ The read-write concern seems more problematic; our read timestamp cache retains minimal data due to memory concerns, and thus aborts a bit conservatively (i.e. aborts transactions that might have been able to continue). This can be dealt with by modifying your application pattern - in particular, I would suggest using our "SNAPSHOT" isolation level for many workloads. I didn't cover this in the blog post, but in short the SNAPSHOT mode allows RW conflicts to occur without aborting transactions. SNAPSHOT transactions are subject to the "write skew" anomaly, but this anomaly does not occur if the involved transactions write to a common key. This is fortunately the case for many OLTP-type workloads, meaning SNAPSHOT can be used without anomalies; this also moves conflict-detection responsibility entirely to WW conflicts, which abort less conservatively than RW and should ameliorate any live-lock problems.
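To illustrate the write-skew point, here is a toy snapshot-isolation store that only checks write-write conflicts at commit time. It is a sketch, not CockroachDB code: `SnapshotStore`, `go_off_call`, and the `roster_version` key are hypothetical, and the invariant is the textbook "at least one person stays on call" example.

```python
class SnapshotStore:
    """Toy store: transactions read from a snapshot, commits check only WW conflicts."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}

    def begin(self):
        return {"snapshot": dict(self.data), "versions": dict(self.version), "writes": {}}

    def commit(self, txn):
        # Abort if any key we write was committed by someone else since our snapshot.
        for key in txn["writes"]:
            if self.version[key] != txn["versions"][key]:
                return False
        for key, value in txn["writes"].items():
            self.data[key] = value
            self.version[key] += 1
        return True


def go_off_call(txn, me, other, touch_common_key):
    snap = txn["snapshot"]
    if snap[other]:  # only leave if the snapshot says the other person is on call
        txn["writes"][me] = False
        if touch_common_key:
            # Writing a shared key turns the hidden RW dependency into a WW conflict.
            txn["writes"]["roster_version"] = snap["roster_version"] + 1


def run(touch_common_key):
    store = SnapshotStore({"alice": True, "bob": True, "roster_version": 0})
    t1, t2 = store.begin(), store.begin()  # concurrent: both see the same snapshot
    go_off_call(t1, "alice", "bob", touch_common_key)
    go_off_call(t2, "bob", "alice", touch_common_key)
    return store.commit(t1), store.commit(t2), store.data


print(run(touch_common_key=False))  # both commit, nobody is on call: write skew
print(run(touch_common_key=True))   # second commit is rejected, invariant holds
```

With disjoint write sets both transactions commit and the invariant breaks; once both write `roster_version`, the second committer hits a write-write conflict and has to retry against fresh data.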