Making Lazy Accessors Thread-Safe
What’s scarier than meta-programming? Thread-safe meta-programming!
The last article discussed implementing lazy accessors using a bit of meta-programming. The result was simple, but it suffered from a serious flaw: lack of thread-safety. In this article we’ll make that implementation thread-safe. Let’s start by clarifying the guarantee to be made to callers.
Problem Statement
Lazy accessors were defined by means of a new method on Class called lazy.
Let’s have another look at its source code to understand why it’s not thread-safe.
If multiple threads call the accessor for the first time then there’s a race condition that can cause the initial value to be calculated multiple times.
def lazy name, &definition
  variable_name = :"@#{name}"
  define_method(name) do
    # If two threads call the method and both don't see the variable defined ...
    if instance_variable_defined? variable_name
      instance_variable_get variable_name
    else
      # ... then both will evaluate the block and set the attribute.
      instance_variable_set variable_name, instance_eval(&definition)
    end
  end
end
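To see the race in action, here is a minimal sketch. UnsafeLazy and RaceDemo are hypothetical names used only for illustration, and lazy lives in a module here so the demo doesn’t have to patch Class. The definition block deliberately waits until a second thread has entered it, which forces the interleaving described in the comments above.

```ruby
# Hypothetical module holding the unsafe `lazy`, reproduced for the demo.
module UnsafeLazy
  def lazy(name, &definition)
    variable_name = :"@#{name}"
    define_method(name) do
      if instance_variable_defined?(variable_name)
        instance_variable_get(variable_name)
      else
        instance_variable_set(variable_name, instance_eval(&definition))
      end
    end
  end
end

class RaceDemo
  extend UnsafeLazy

  attr_reader :evaluations

  def initialize
    # A thread-safe queue recording every evaluation of the block.
    @evaluations = Queue.new
  end

  lazy(:value) do
    @evaluations << Thread.current
    # Hold the first thread inside the block until the second one arrives.
    # This only terminates because the accessor is NOT thread-safe.
    Thread.pass until @evaluations.size == 2
    :computed
  end
end

demo = RaceDemo.new
2.times.map { Thread.new { demo.value } }.each(&:join)
evaluation_count = demo.evaluations.size # 2: the block ran once per thread
```

Both threads find the instance variable undefined, so the block runs twice instead of once.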
If multiple threads are involved then lazy may fail to deliver on the guarantee that the definition will be evaluated at most once.
The rest of the article is concerned with delivering on that guarantee.
It’s not the only way in which lazy can be considered not thread-safe, and other failure modes are discussed at the end of the article.
A Non-Meta Solution
Meta-programs are programs that write programs. It’s often helpful to start with the program we want to get written and work backwards to a meta-program. Let’s consider a lazy accessor that determines a person’s full name based on first and last names.
# The following lazy accessor is equivalent to ...
lazy(:full_name) { "#{first_name} #{last_name}" }
# ... a regular method with no meta-programming.
def full_name
  if defined?(@full_name)
    @full_name
  else
    @full_name = "#{first_name} #{last_name}"
  end
end
It’s clear the regular version is not thread-safe: two threads calling #full_name for the first time may both see that @full_name is undefined and end up setting it.
Thread-safety is not related to meta-programming, but rather to how the resulting instructions, whether written manually or by a meta-program, are structured.
The race condition can be eliminated with a mutex.
A naive approach of wrapping the whole accessor body in Thread::Mutex#synchronize would work, at the expense of making every subsequent read acquire the mutex and potentially block.
This is a high price to pay, especially since the mutex is only needed on first access.
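For comparison, the naive approach could look like the sketch below. NaivePerson is a hypothetical class, and the sketch assumes the constructor creates @full_name_mutex.

```ruby
class NaivePerson
  attr_reader :first_name, :last_name

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
    # Assumption: the mutex is created eagerly in the constructor.
    @full_name_mutex = Thread::Mutex.new
  end

  def full_name
    # Every call, not just the first one, has to acquire the mutex.
    @full_name_mutex.synchronize do
      @full_name = "#{first_name} #{last_name}" unless defined?(@full_name)
      @full_name
    end
  end
end

person = NaivePerson.new("Ada", "Lovelace")
names = 4.times.map { Thread.new { person.full_name } }.map(&:value)
```

The result is correct, but readers keep contending for the same mutex long after the value has been computed.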
There are multiple ways of making this method thread-safe, but we’ll focus on one that turned out to be amenable to meta-programming.
Since the mutex is needed only on first access, it can be set to nil afterwards to signal that the value has already been set.
The following listing shows a thread-safe implementation of #full_name with comments explaining certain nuances related to thread-safety.
def full_name
  # Assign the mutex to a local variable for thread-safety. Why? If we kept
  # referring to @full_name_mutex then its value could change between the
  # `if` and `synchronize` below. This would result in confusing `NoMethodError`
  # exceptions in a piece of code that seems to be checking for `nil`.
  mutex = @full_name_mutex
  # If the mutex is defined then it's the first time the accessor is called.
  # Keep in mind multiple threads may be executing the `if` below.
  if mutex
    # Threads attempt to grab the mutex, but only one can do so at a time.
    mutex.synchronize do
      # After the mutex is grabbed, if the mutex attribute is still non-nil it
      # means it's the thread that won the race and grabbed the mutex first. In
      # this case, the attribute must be set to an appropriate value and the
      # mutex to nil.
      #
      # If the mutex attribute is nil it means it's the thread that lost the
      # race, so the lazy attribute was already set by the first thread. The
      # second thread doesn't have to do anything other than returning the
      # attribute.
      if @full_name_mutex
        @full_name = "#{first_name} #{last_name}"
        @full_name_mutex = nil
      end
    end
  end
  # All threads finish the method call by returning the attribute.
  @full_name
end
That’s the entire thread-safe lazy accessor implementation. Turning it into a meta-program is a matter of doing “fill in the blanks” in reverse.
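One way to check the guarantee is to call the accessor from several threads and count how often the value is actually computed. The sketch below is illustrative only: the Person class, its constructor (which is assumed to create @full_name_mutex), and the @first_name_reads counter are not part of the original listing.

```ruby
class Person
  attr_reader :last_name, :first_name_reads

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
    @first_name_reads = 0
    # Assumption: the mutex is created in the constructor.
    @full_name_mutex = Thread::Mutex.new
  end

  # Instrumented reader: counts calls and sleeps to widen the race window.
  def first_name
    @first_name_reads += 1
    sleep 0.01
    @first_name
  end

  # The thread-safe accessor from the listing above, without comments.
  def full_name
    mutex = @full_name_mutex
    if mutex
      mutex.synchronize do
        if @full_name_mutex
          @full_name = "#{first_name} #{last_name}"
          @full_name_mutex = nil
        end
      end
    end
    @full_name
  end
end

person = Person.new("Ada", "Lovelace")
names = 8.times.map { Thread.new { person.full_name } }.map(&:value)
```

Since first_name is only read inside the guarded branch, the counter ends up at one no matter how many threads call the accessor.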
Meta-programming Enters the Scene
Converting the concrete accessor from the previous section into a meta-method is a matter of replacing direct attribute manipulation with instance variable methods.
Notice the method defined by lazy differs in that it removes the mutex instead of setting it to nil, but is otherwise equivalent.
However, there’s one large complication: the mutex is assumed to be defined in the class constructor.
Ensuring the meta version defines mutexes in the constructor proved to be as tricky as thread-safety itself, and is addressed in the next section.
def lazy name, &definition
  variable_name = :"@#{name}"
  mutex_name = :"@#{name}_mutex"
  # The comments below show how the meta-method maps to the non-meta version.
  define_method name do                            # def full_name
    mutex = instance_variable_get mutex_name       #   mutex = @full_name_mutex
    if mutex                                       #   if mutex
      mutex.synchronize do                         #     mutex.synchronize do
        if instance_variable_defined? mutex_name   #       if @full_name_mutex
          instance_variable_set variable_name,     #         @full_name = "#{first_name} #{last_name}"
            instance_eval(&definition)             #
          remove_instance_variable mutex_name      #         @full_name_mutex = nil
        end                                        #       end
      end                                          #     end
    end                                            #   end
                                                   #
    instance_variable_get variable_name            #   @full_name
  end                                              # end
end
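Until the constructor problem is solved, the meta version can be tried out by creating the mutex by hand. In the sketch below, ManualMutexLazy and Report are hypothetical names, and lazy is placed in a module that the class extends instead of being defined on Class.

```ruby
# Hypothetical module wrapping the meta-method for this demo.
module ManualMutexLazy
  def lazy(name, &definition)
    variable_name = :"@#{name}"
    mutex_name = :"@#{name}_mutex"

    define_method(name) do
      mutex = instance_variable_get(mutex_name)
      if mutex
        mutex.synchronize do
          if instance_variable_defined?(mutex_name)
            instance_variable_set(variable_name, instance_eval(&definition))
            remove_instance_variable(mutex_name)
          end
        end
      end
      instance_variable_get(variable_name)
    end
  end
end

class Report
  extend ManualMutexLazy

  attr_reader :evaluations

  def initialize
    @evaluations = 0
    # The mutex must be named after the accessor and created manually for now.
    @total_mutex = Thread::Mutex.new
  end

  lazy(:total) do
    @evaluations += 1
    sleep 0.01 # widen the race window
    42
  end
end

report = Report.new
totals = 8.times.map { Thread.new { report.total } }.map(&:value)
```

The manual @total_mutex line is exactly the bookkeeping the next section sets out to automate.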
Hooking Up into Object Initialization
A naive approach of having lazy define initialize that sets up mutexes won’t work, because it may override or be overridden by the class constructor.
In feature_envy I ended up solving the problem with a dedicated module with a constructor that initializes mutexes required by the class.
That module gets included into every class that makes use of lazy.
The only limitation is classes that use lazy accessors and define custom constructors must call their parent’s constructors, but this seems to be a reasonable requirement in object-oriented programming.
Assuming the actual mutex initialization is delegated to a specialized factory object, the initialization module boils down to a few lines of code.
module LazyAccessor
  # For now, it's assumed MutexFactory is defined and its only instance is
  # exposed via a method on the LazyAccessor module.
  @mutex_factory = LazyAccessor::MutexFactory.new
  class << self
    attr_reader :mutex_factory
  end
  # The initialization module can then simply delegate its job to the mutex
  # factory. The module's only role was to hook up into object initialization,
  # which it did.
  module Initialize
    def initialize(...)
      super
      LazyAccessor.mutex_factory.initialize_mutexes_for self
    end
  end
end
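The requirement to call the parent constructor follows from where an included module sits in the ancestor chain: after the class itself, so a custom initialize only reaches it through super. The sketch below illustrates this with a stand-in hook. MutexHook and the three classes are hypothetical, and the hook merely sets a flag instead of creating mutexes.

```ruby
# Stand-in for the initialization module: it records that it ran.
module MutexHook
  def initialize(...)
    super
    @hook_ran = true
  end
end

class CallsSuper
  include MutexHook

  def initialize(name)
    super() # reaches MutexHook#initialize
    @name = name
  end
end

class SkipsSuper
  include MutexHook

  def initialize(name)
    @name = name # MutexHook#initialize is never called
  end
end

class NoConstructor
  include MutexHook
end

objects = [CallsSuper.new("a"), SkipsSuper.new("b"), NoConstructor.new]
hook_ran = objects.map { |object| object.instance_variable_defined?(:@hook_ran) }
# hook_ran is [true, false, true]
```

A class that skips super would therefore end up without mutexes, which is the limitation mentioned above.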
The mutex factory becomes the most interesting bit. It must support two operations: registering new lazy accessors and initializing mutexes on a given instance. Since the class will be used for creating mutexes it can map classes to mutex names they need. This is a great job for a hash of arrays.
module LazyAccessor
  class MutexFactory
    def initialize
      # A hash mapping classes to arrays of mutex names is the entire state
      # this class needs.
      @mutexes_by_class = Hash.new { |hash, key| hash[key] = [] }
    end
  end
end
Registering a lazy accessor means adding a mutex name to the list kept for its class. Additionally, to reduce duplication, the method returns the mutex name to the caller. We’ll see later why this is important. The entire method is just three lines long.
module LazyAccessor
  class MutexFactory
    def register_lazy_accessor klass, name
      # Determine the mutex name, add it to the list, and return to the caller.
      mutex_name = :"@#{name}_mutex"
      @mutexes_by_class[klass] << mutex_name
      mutex_name
    end
  end
end
Finally, we’re ready to add the actual mutex initialization. We can trivially obtain the list of mutex names for the object’s class, but we need to remember to traverse the entire inheritance hierarchy. Without that, the object would not have mutexes for lazy accessors inherited from ancestors.
module LazyAccessor
  class MutexFactory
    def initialize_mutexes_for instance
      # Start the traversal with the current class.
      current_class = instance.class
      while current_class
        # Add mutexes defined for the class under consideration.
        @mutexes_by_class[current_class].each do |mutex_name|
          instance.instance_variable_set mutex_name, Thread::Mutex.new
        end
        # Move one level up the inheritance hierarchy.
        current_class = current_class.superclass
      end
    end
  end
end
After defining the initialization module and the mutex factory, we’re ready to hook them up to lazy.
This is where the return value of register_lazy_accessor comes into play.
def lazy name, &definition
  variable_name = :"@#{name}"
  # The mutex name returned by #register_lazy_accessor is used here in order to
  # avoid code duplication. Note the accessor name, not the variable name, is
  # passed, because the factory adds the leading @ itself.
  mutex_name = LazyAccessor.mutex_factory.register_lazy_accessor self, name
  # The initialization module must be included when .lazy is called unless it's
  # already included.
  unless included_modules.include?(LazyAccessor::Initialize)
    include LazyAccessor::Initialize
  end
  # Accessor definition can now proceed as previously.
  define_method name do
    # ...
  end
end
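Putting the pieces together, a condensed end-to-end sketch might look as follows. It is not the feature_envy source: ClassMethods is a hypothetical module that classes extend (the article defines lazy on Class instead), and Person and Employee exist only to exercise inheritance.

```ruby
module LazyAccessor
  class MutexFactory
    def initialize
      @mutexes_by_class = Hash.new { |hash, key| hash[key] = [] }
    end

    def register_lazy_accessor(klass, name)
      mutex_name = :"@#{name}_mutex"
      @mutexes_by_class[klass] << mutex_name
      mutex_name
    end

    def initialize_mutexes_for(instance)
      current_class = instance.class
      while current_class
        @mutexes_by_class[current_class].each do |mutex_name|
          instance.instance_variable_set(mutex_name, Thread::Mutex.new)
        end
        current_class = current_class.superclass
      end
    end
  end

  @mutex_factory = MutexFactory.new
  class << self
    attr_reader :mutex_factory
  end

  module Initialize
    def initialize(...)
      super
      LazyAccessor.mutex_factory.initialize_mutexes_for(self)
    end
  end

  # Hypothetical home for `lazy`, used here to avoid patching Class.
  module ClassMethods
    def lazy(name, &definition)
      variable_name = :"@#{name}"
      mutex_name = LazyAccessor.mutex_factory.register_lazy_accessor(self, name)
      include LazyAccessor::Initialize unless include?(LazyAccessor::Initialize)

      define_method(name) do
        mutex = instance_variable_get(mutex_name)
        if mutex
          mutex.synchronize do
            if instance_variable_defined?(mutex_name)
              instance_variable_set(variable_name, instance_eval(&definition))
              remove_instance_variable(mutex_name)
            end
          end
        end
        instance_variable_get(variable_name)
      end
    end
  end
end

class Person
  extend LazyAccessor::ClassMethods

  attr_reader :evaluations

  def initialize(first_name, last_name)
    super() # required so that LazyAccessor::Initialize can create the mutexes
    @first_name = first_name
    @last_name = last_name
    @evaluations = 0
  end

  lazy(:full_name) do
    @evaluations += 1
    sleep 0.01 # widen the race window
    "#{@first_name} #{@last_name}"
  end
end

class Employee < Person
  lazy(:badge) { "#{full_name} (staff)" }
end

employee = Employee.new("Ada", "Lovelace")
badges = 8.times.map { Thread.new { employee.badge } }.map(&:value)
```

The Employee instance receives mutexes for both its own accessor and the inherited one, because the factory walks the superclass chain.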
Closing Thoughts
Lazy accessors were made thread-safe, and now uphold the guarantee to call their definition block at most once.
The presented implementation is not fully complete, as feature_envy also takes care of lazy accessors defined inside modules, not only classes.
It’s also worth mentioning there are other race conditions. For example, a lazy accessor could get added after its class has been instantiated, which would make existing instances gain the accessor method, but not the mutex required by it. I decided scenarios like that are such extreme edge cases that they weren’t worth addressing, at least not in the first version.
If you’d like to use lazy accessors in your project then I recommend you give feature_envy a try.
In the next article I’ll walk you through adding support for object literals in Ruby.