Making Lazy Accessors Thread-Safe

What’s scarier than meta-programming? Thread-safe meta-programming!

The last article discussed implementing lazy accessors using a bit of meta-programming. The result was simple, but it suffered from a serious flaw: lack of thread-safety. In this article we’ll make that implementation thread-safe. Let’s start by clarifying the guarantee to be made to callers.

Problem Statement

Lazy accessors were defined via means of a new method on Class called lazy. Let’s have another look at its source code to understand why it’s not thread-safe. If multiple threads call the accessor for the first time then there’s a race condition that can cause the initial value to be calculated multiple times.

def lazy name, &definition
  variable_name = :"#@{name}"

  define_method(name) do
    # If two threads call the method and both don't see the variable defined ...
    if instance_variable_defined? variable_name
      instance_variable_get variable_name
    else
      # ... then both will evaluate the block and set the attribute.
      instance_variable_set variable_name, instance_eval(&definition)
    end
  end
end

If multiple threads are involved then lazy may fail to deliver on the guarantee that the definition will be evaluated at most once. The rest of the article is concerned with delivering on that guarantee. It’s not the only way in which lazy can be considered not thread-safe, and other failure modes are discussed at the end of the article.

A Non-Meta Solution

Meta-programs are programs that write programs. It’s often helpful to start with the program we want to get written and work backwards to a meta-program. Let’s consider a lazy accessor that determines a person’s full name based on first and last names.

# The following lazy accessor is equivalent to ...
lazy(:full_name) { "#{first_name} #{last_name}" }

# ... a regular method with no meta-programming.
def full_name
  if defined?(@full_name)
    @full_name
  else
    @full_name = "#{first_name} #{last_name}"
  end
end

It’s clear the regular version is not thread-safe: two threads calling #full_name for the first time may both see that @full_name is undefined and end up setting it. Thread-safety is not related to meta-programming, but rather to how the resulting instructions, whether written manually or by a meta-program, are structured.

The race condition can be eliminated with a mutex. A naive approach of wrapping the whole accessor body in Thread::Mutex#synchronize would work, at the expense of making all subsequent reads block. This is a high price to pay, especially that the mutex is only needed on first access.

There are multiple ways of making this method thread-safe, but we’ll focus on one that turned out to be amenable to meta-programming. Since the mutex is needed only on first access, it can be set to nil afterwards to signal the value has been already set. The following listing shows a thread-safe implementation of #full_name with comments explaining certain nuances related to thread-safety.

def full_name
  # Assign the mutex to a local variable for thread-safety. Why? If we kept
  # referring to @full_name_mutex then its value could change between the
  # `if` and `synchronize` below. This would result in confusing `NoMethodError`
  # exceptions in a piece of code that seems to be checking for `nil`.
  mutex = @full_name_mutex

  # If the mutex is defined then it's the first time the accessor is called.
  # Keep in mind multiple threads may be executing the `if` below.
  if mutex
    # Threads attempt to grab the mutex, but only one can do so at a time.
    mutex.synchronize do
      # After the mutex is grabbed, if the mutex attribute is still non-nil it
      # means it's the thread that won the race and grabbed the mutex first. In
      # this case, the attribute must be set to an appropriate value and the
      # mutex to nil.
      #
      # If the mutex attribute is nil it means it's the thread that lost the
      # race, so the lazy attribute was already set by the first thread. The
      # second thread doesn't have to do anything other than returning the
      # attribute.
      if @full_name_mutex
        @full_name       = "#{first_name} #{last_name}"
        @full_name_mutex = nil
      end
    end
  end

  # All threads finish the method call by returning the attribute.
  @full_name
end

That’s the entire thread-safe lazy accessor implementation. Turning it into a meta-program is a matter of doing “fill in the blanks” in reverse.

Meta-programming Enters the Scene

Converting the concrete accessor from the previous section into a meta-method is a matter of replacing direct attribute manipulation with instance variable methods. Notice the method defined by lazy differs in that it removes the mutex instead of setting it to nil, but is otherwise equivalent. However, there’s one large complication: the mutex is assumed to be defined in the class constructor. Ensuring the meta version defines mutexes in the constructor proved to be as tricky as thread-safety itself, and is addressed in the next section.

def lazy name, &definition
  variable_name = :"#@{name}"
  mutex_name = :"#@{name}_mutex"

  # The comments below show how the meta-method maps to the non-meta version.

  define_method name do                                    # def full_name
    mutex = instance_variable_get mutex_name               #   mutex = @full_name_mutex
    if mutex_name                                          #   if mutex
      mutex.synchronize do                                 #     mutex.synchronize do
        if instance_variable_defined? mutex_name           #       if @full_name_mutex
          instance_variable_set variable_name,             #         @full_name = "#{first_name} #{last_name}"
                                instance_eval(&definition) #
          remove_instance_variable mutex_name              #         @full_name_mutex = nil
        end                                                #       end
      end                                                  #     end
    end                                                    #   end
                                                           #
    instance_variable_get variable_name                    #   @full_name
  end                                                      # end
end

Hooking Up into Object Initialization

A naive approach of having lazy define initialize that sets up mutexes won’t work, because it may override or be overridden by the class constructor. In feature_envy I ended up solving the problem with a dedicated module with a constructor that initializes mutexes required by the class. That module gets included into every class that makes use of lazy. The only limitation is classes that use lazy accessors and define custom constructors must call their parent’s constructors, but this seems to be a reasonable requirement in object-oriented programming.

Assuming the actual mutex initialization is delegated to a specialized factory object, the initialization module boils down to a few lines of code.

module LazyAccessor
  # For now, it's assumed MutexFactory is defined and its only instance is
  # exposed via a method on the LazyAccessor module.
  @mutex_factory = LazyAccessor::MutexFactory.new

  class << self
    attr_reader :mutex_factory
  end

  # The initialization module can then simply delegate its job to the mutex
  # factory. The module's only role was to hook up into object initialization,
  # which it did.
  module Initialize
    def initialize(...)
      super

      LazyAccessor.mutex_factory.initialize_mutexes_for self
    end
  end
end

The mutex factory becomes the most interesting bit. It must support two operations: registering new lazy accessors and initializing mutexes on a given instance. Since the class will be used for creating mutexes it can map classes to mutex names they need. This is a great job for a hash of arrays.

module LazyAccessor
  class MutexFactory
    def initialize
      # A hash mapping classes to arrays of mutex names is the entire state
      # this class needs.
      @mutexes_by_class = Hash.new { |hash, key| hash[key] = [] }
    end
  end
end

Registering a mutex means adding a new mutex to a class. Additionally, to reduce duplication the method will return the mutex name to the caller. We’ll see later why this is important. The entire method is just three lines long.

module LazyAccessor
  class MutexFactory
    def register_lazy_accessor klass, name
      # Determine the mutex name, add it to the list, and return to the caller.
      mutex_name = :"@#{name}_mutex"
      @mutexes_by_class[klass] << mutex_name
      mutex_name
    end
  end
end

Finally, we’re ready to add the actual mutex initialization. We can trivially obtain a list of mutex for the object’s class, but we need to remember to traverse the entire inheritance hierarchy. Without that, the object would not have mutexes for lazy accessors inherited from ancestors.

module LazyAccessor
  class MutexFactory
    def initialize_mutexes_for instance
      # Start the traversal with the current class.
      current_class = instance.class

      while current_class
        # Add mutexes defined for the class under consideration.
        @mutexes_by_class[current_class].each do |mutex_name|
          instance.instance_variable_set mutex_name, Thread::Mutex.new
        end

        # Move one level up the inheritance hierarchy.
        current_class = current_class.superclass
      end
    end
  end
end

After defining the initialization module and the mutex factory, we’re ready to hook them up to lazy. This is where the return value of register_lazy_accessor comes into play.

def lazy name, &definition
  variable_name = :"#@{name}"

  # The mutex name returned by #register_lazy_accessor is used here in order to
  # avoid code duplication.
  mutex_name = LazyAccessor.mutex_factory.register_lazy_accessor self, variable_name

  # The initialization module must be included when .lazy is called unless it's
  # already included.
  if !included_modules.include?(LazyAccessor::Initialize)
    include LazyAccessor::Initialize
  end

  # Accessor definition can now proceed as previously.
  define_method name do
    # ...
  end
end

Closing Thoughts

Lazy accessors were made thread-safe, and now uphold the guarantee to call their definition block at most once. The presented implementation is not fully complete, as feature_envy also takes care of lazy accessors defined inside modules, not only classes.

It’s also worth mentioning there are other race conditions. For example, a lazy accessor could get added after its calls has been instantiated, which would make existing instances gain the accessor method, but not the mutex required by it. I decided scenarios like that are such extreme edge cases that they weren’t worth addressing, at least not in the first version.

If you’d like to use lazy accessors in your project then I recommend you give feature_envy a try. In the next article I’ll walk you through adding support for object literals in Ruby.

I regularly post about Ruby, Ruby on Rails, PostgreSQL, and Hotwire.

Follow @gregnavis