2006-01-06

Safe Threads

I need to explain why safe threads are so important. Basically, a language has three options for how it handles threads:
  1. Honour system. The programmer uses explicit locks and is expected to get them right. Any mistake falls through to the memory model (as in Java) or to undefined behavior (as in C); either way you get bogus results. It can also significantly harm performance: when the compiler can't reason about the locks used, it has to fall back to the most generic code that still conforms to the memory model.
  2. Compiler-enforced locks. As above, but when the compiler can't reason about a lock it emits an error instead of emitting generic code. Performance is optimal, but the language specification is forced to include compiler internals.
  3. Inherently safe primitives. Inter-thread queues that only permit deeply immutable (or otherwise thread-safe, such as the queue itself) objects as contents. Atomic reference objects that allow alteration and retrieval of a single reference in an atomic (thread-safe) manner, again requiring that reference to be deeply immutable. These primitives cannot be subverted, which keeps the compiler simple, the code reliable, and the language specification small.
The first option violates many of Python's principles:
  • Explicit is better than implicit.
  • Readability counts.
  • Errors should never pass silently.
  • In the face of ambiguity, refuse the temptation to guess.
  • If the implementation is hard to explain, it's a bad idea.
The second option, while better, still violates at least one principle:
  • If the implementation is hard to explain, it's a bad idea.
I believe the third option is the only acceptable one for Python.

BTW, "deeply immutable" means that the object itself is immutable, all objects it references are immutable, all objects they reference are immutable, etc.
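
To make that definition (and the two primitives from the third option) a bit more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: `is_deeply_immutable`, `SafeQueue` and `AtomicReference` are hypothetical names, the whitelist only covers a few built-in types, and the "otherwise thread-safe" exception is left out. A pure-Python wrapper like this can of course be bypassed, so it only shows the shape of the interface, not a real enforcement mechanism.

```python
import queue
import threading

# Hypothetical whitelist: built-in types whose instances never change.
_ATOMIC_TYPES = (type(None), bool, int, float, complex, str, bytes)


def is_deeply_immutable(obj):
    """Conservatively decide whether obj is immutable all the way down."""
    # Exact type checks on purpose: a subclass could add mutable state.
    if type(obj) in _ATOMIC_TYPES:
        return True
    if type(obj) in (tuple, frozenset):
        # A container qualifies only if every element does too.
        return all(is_deeply_immutable(item) for item in obj)
    return False  # unknown or mutable type: reject rather than guess


class SafeQueue:
    """Inter-thread queue that refuses anything not deeply immutable."""

    def __init__(self):
        self._items = queue.Queue()

    def put(self, obj):
        if not is_deeply_immutable(obj):
            raise TypeError("only deeply immutable objects may cross threads")
        self._items.put(obj)

    def get(self, timeout=None):
        return self._items.get(timeout=timeout)


class AtomicReference:
    """Holds one deeply immutable reference; get and set are atomic."""

    def __init__(self, value=None):
        self._lock = threading.Lock()
        self.set(value)

    def set(self, value):
        if not is_deeply_immutable(value):
            raise TypeError("only deeply immutable objects may be stored")
        with self._lock:
            self._value = value

    def get(self):
        with self._lock:
            return self._value
```

With this sketch, `(1, "a", (2.0, None))` is accepted while `(1, [2])` is rejected, because the list inside the tuple can still be changed by whoever holds another reference to it.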

With the third option, most of the threading machinery could be kept out of the language specification, although it would need to be a core part of the implementation, not just an extension module. Unfortunately, one aspect of the language requires such threading: finalization (including weakref callbacks). Not everybody realizes this, but finalization is a form of concurrency. It causes functions to execute at unpredictable times, with no natural granularity to determine how they should interact with functions that are already executing.
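
A small experiment shows what I mean by finalization being concurrency. The sketch below uses only the standard `weakref` and `gc` modules; `Resource`, `on_collect` and `work` are made-up names. The callback is never called by `work()` itself, yet it ends up running somewhere in the middle of it.

```python
import gc
import weakref


class Resource:
    """Placeholder object whose collection we want to observe."""


events = []


def on_collect(ref):
    # Runs whenever the interpreter decides the referent is dead.
    events.append("callback")


def work():
    res = Resource()
    watcher = weakref.ref(res, on_collect)
    events.append("before")
    del res  # last strong reference dropped; the callback may fire here
    events.append("after")
    return watcher  # keep the weakref alive so its callback can run


watcher = work()
gc.collect()  # make sure collection has happened on non-refcounting runtimes
print(events)
```

On CPython, reference counting usually runs the callback right at the `del`, between "before" and "after". Other implementations may run it later, for example during `gc.collect()`, which is exactly the unpredictability at issue.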

The only reasonable way I see to handle finalization is to use safe thread mechanisms. It doesn't matter whether this is done using a single queue that the programmer must check explicitly, or by spawning a new safe thread for each finalizer function, so long as it's entirely safe and predictable.
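
The single-queue variant can be approximated in today's Python, as a rough sketch under a few assumptions. `Connection`, `track` and `drain` are hypothetical names; `queue.SimpleQueue` and `weakref.finalize` are the real standard-library pieces. The finalizer does no work of its own and only enqueues an immutable string, which the program picks up at a point it chooses.

```python
import gc
import queue
import weakref

# SimpleQueue.put is documented as reentrant, so it is usable from finalizers.
finalized = queue.SimpleQueue()


class Connection:
    """Hypothetical resource identified by an immutable name."""

    def __init__(self, name):
        self.name = name


def track(conn):
    # Pass only the name (a str), never conn itself, so the finalizer
    # does not keep the object alive.
    return weakref.finalize(conn, finalized.put, conn.name)


def drain():
    """Explicitly collect pending finalization notices at a point we choose."""
    names = []
    while True:
        try:
            names.append(finalized.get_nowait())
        except queue.Empty:
            return names


conn = Connection("db-1")
track(conn)
del conn
gc.collect()  # encourage collection on runtimes without reference counting
pending = drain()
```

The cleanup logic proper would then run in whatever code calls `drain()`, at a predictable point, instead of inside the finalizer.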
