Concurrency makes everything hard
For many programmers, writing efficient concurrent data structures is an exercise in scouring GitHub and hoping that code quality and number of “stars” really do correlate. When it comes time to write your own concurrent data structures, cue many sleepless nights spent worrying about whether this or that concurrency primitive is fast enough, or correct enough, to make the code work at the speeds you need. If there is any solace, it comes from the once-safe assumption of strong consistency: that, no matter how many races we might have missed, at least the result of the program will be some valid interleaving of concurrent threads.
This assumption is wrong.
Because hardware isn’t as helpful as it once was. Modern processor and language memory models weaken the available consistency guarantees so far that it’s possible to read values that were never written, to have reads justify themselves, and to arbitrarily miss updates from concurrent threads. Simple patterns, like initializing an object and then handing it off to another thread through a plain shared variable, are simply not guaranteed to work.
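To make that concrete, here is a minimal Java sketch of exactly that kind of broken hand-off (the class names are invented for illustration; this is not code from the project). The object is published through a plain field with no synchronization, so the Java Memory Model gives the reader no guarantee of seeing a fully-initialized object:

```java
class Config {
    int timeoutMs;
    Config() { timeoutMs = 5000; }
}

class UnsafePublication {
    static Config shared;  // not volatile, not final: reads and writes race

    public static void main(String[] args) {
        new Thread(() -> {
            Config c = shared;              // racy read
            if (c != null) {
                // The memory model permits this to print 0: the reference
                // can become visible before the constructor's write to
                // timeoutMs does.
                System.out.println(c.timeoutMs);
            }
        }).start();
        shared = new Config();              // racy publication, no fence
    }
}
```

On common x86 hardware this will almost always print 5000, which is precisely the danger: the bug is legal, latent, and waiting for a weaker machine or a more aggressive compiler.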
And distribution makes it harder.
In a distributed setting, the costs of stronger consistency models grow rapidly with the number of replicas in your system (essential for fault tolerance) and the distance between those replicas (essential for global performance and availability). While plenty of good people are doing excellent work to bring strongly-consistent shared data to a geo-distributed setting, the fundamentals are still not in our favor. And, far more urgently, major parts of the web have already moved to weaker consistency models.
We’re going to fix that. Or at least help.
This project aims to bring the power of programming languages to bear on the world of weakly-consistent distributed programming. Our secret sauce is information flow, a key technology borrowed from the security literature and given new life here. Using information flow, it’s possible to ensure that weakly-consistent observations can never unduly influence strongly-consistent computations: the points of crossover become explicit, and programmers are statically required to upgrade the consistency of their observations when appropriate.
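One way to picture the idea is to encode consistency levels as types. The sketch below is hypothetical (it is not the project’s actual API; Weak, Strong, and endorse are names we invented for illustration), but it shows the flavor: a weakly-consistent observation can be computed on freely, yet it cannot reach a strongly-consistent computation without one explicit, auditable upgrade step.

```java
import java.util.function.Function;

// Hypothetical consistency labels as types.
final class Weak<T> {
    private final T value;
    Weak(T value) { this.value = value; }
    <R> Weak<R> map(Function<T, R> f) { return new Weak<>(f.apply(value)); }
    // The single, deliberate crossover point from weak to strong.
    Strong<T> endorse() { return new Strong<>(value); }
}

final class Strong<T> {
    private final T value;
    Strong(T value) { this.value = value; }
    T get() { return value; }
}

class Transfer {
    // A strongly-consistent decision: must not depend on weak data.
    static void withdrawIfSolvent(Strong<Long> balance) { /* ... */ }

    public static void main(String[] args) {
        Weak<Long> cached = new Weak<>(100L);   // weakly-consistent read
        // withdrawIfSolvent(cached);           // rejected: Weak is not Strong
        withdrawIfSolvent(cached.endorse());    // crossover made explicit
    }
}
```

The commented-out call is the whole point: the type checker, not a code review, is what catches weak data sneaking into a strong computation.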
And we’re not done yet.
There are a lot of excellent ideas out there with their own ways of ensuring that weak consistency doesn’t mean weak guarantees: everything from data types with guaranteed convergence to techniques that guarantee only what a computation actually needs, and a whole host in between. We believe there are many unrealized synergies between our established information-flow approach and these robust alternatives, and we’re taking advantage of those synergies to build a new language for programming distributed systems.
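As one example of the first family, here is a grow-only counter, a standard convergent replicated data type (a textbook construction, not code from this project). Its merge is a pointwise maximum, so concurrent increments commute and every replica converges to the same value no matter how state is exchanged:

```java
import java.util.HashMap;
import java.util.Map;

class GCounter {
    private final String replicaId;
    private final Map<String, Long> counts = new HashMap<>();

    GCounter(String replicaId) { this.replicaId = replicaId; }

    // Each replica only ever increments its own entry.
    void increment() { counts.merge(replicaId, 1L, Long::sum); }

    long value() {
        return counts.values().stream().mapToLong(Long::longValue).sum();
    }

    // Merge is commutative, associative, and idempotent: the order in
    // which replicas gossip their state cannot change the final value.
    void merge(GCounter other) {
        other.counts.forEach((id, n) -> counts.merge(id, n, Math::max));
    }
}
```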
Scoped strong semantic consistency is the goal.
This property goes by a lot of different names; you may have heard it called strong eventual consistency, or convergence, or full-program commutativity, to name just a few. Simply stated, it is the property that all observable effects generated by your program under weakly-consistent, asynchronous replication (any way in which it interacts with the outside world) would still be generated under a perfectly-consistent, sequential execution, given the same environment and inputs. It says that every witness to weakly-consistent replication, every race condition, every speculative execution is in fact justified; we have, somehow, always “guessed” right. This property is exceptionally strong (in fact, it’s likely too strong), but it makes for a noble goal. We want to empower programmers with the guarantee that their entire program will be consistent by default, while giving them the flexibility to break that guarantee with clear-eyed intention; and we want to preserve these guarantees across the great majority of the program even in the presence of small intentional violations. A new programming paradigm is coming, as fast as we can write it down. Watch this space!
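P.S. For a toy reading of the target property (invented for this post, using set insertion as the commuting update): if a program’s updates commute, then every delivery order a weakly-consistent system might produce yields the same observable result as the sequential execution.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class Convergence {
    public static void main(String[] args) {
        List<Integer> updates = List.of(3, 1, 4, 1, 5);

        // "Sequential" execution: apply the updates in program order.
        Set<Integer> sequential = new TreeSet<>(updates);

        // A "weakly-consistent" execution: same updates, shuffled delivery.
        List<Integer> shuffled = new ArrayList<>(updates);
        Collections.shuffle(shuffled);
        Set<Integer> weak = new TreeSet<>(shuffled);

        // Set insertion commutes and is idempotent, so every delivery
        // order is justified: the observable result is always the same.
        System.out.println(sequential.equals(weak));   // always prints true
    }
}
```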