Numbering Copies Using Enumerators

by Greg Navis ⟨contact@gregnavis.com⟩

Copying a file in a file manager results in adding a copy counter to the file name. In this article, we’ll devise a simple and elegant algorithm based on Enumerator for doing the same in a Rails app.

Problem Statement

Imagine we’re working on a CMS and need to implement a copy page feature. Each page has a unique slug, and one of the requirements is generating a new slug for the copy. For example, copying /about-us should result in /about-us-copy-1. There’s also an easy-to-overlook use case we need to address: copying a copy. A naive implementation might turn /about-us-copy-1 into /about-us-copy-1-copy-1 instead of /about-us-copy-2.

The requirements can be broken down into the following three points:

  1. Copying a page for the first time should result in appending -copy-1 to the slug.
  2. Copying a page that already has a copy should increment the copy counter by one.
  3. Copying a copy should increment the counter already present in the slug.

We’ll take a bottom-up approach and start with slug uniqueness.

Database Uniqueness Constraints

A uniqueness guarantee safe from race conditions needs an index to enforce the constraint at the database level. Without it, if we copy a page twice rapidly we risk creating two pages with the same slug.

We start by creating an index to enforce the constraint:

add_index :pages, :slug, unique: true

We don’t need a model validation because we won’t let uniqueness violations propagate to the user. From their perspective, the copy operation will just work.

After creating the index, we need to find out how Active Record signals uniqueness violations. An experiment in the development console indicates that it raises ActiveRecord::RecordNotUnique. Additionally, we can use #cause to access the original exception raised by the database adapter. We assume we’re using PostgreSQL, but adding support for other databases should be a breeze after the code is in place.

In order to make the code future-proof, we need to determine which uniqueness constraint was violated, because we shouldn’t increase the copy counter if we violate a different uniqueness constraint. Unfortunately, it seems the only way to do so is parsing the error message.

In PostgreSQL, error messages are available on e.cause.message and look like:

ERROR:  duplicate key value violates unique constraint "index_pages_on_slug"
DETAIL:  Key (slug)=(about-us) already exists.

We’ll add a private method to Page that extracts the constraint name from the exception message:

CONSTRAINT_NAME_REGEXP = %r{\AERROR:  duplicate key value violates unique constraint "(.*)"$}

def violated_constraint_name(exception)
  match = CONSTRAINT_NAME_REGEXP.match(exception.cause.message)
  match && match[1]
end
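
To try the regexp against the sample message without a database, we can stand in for the exception with a pair of structs. This is only a sketch: FakeError and FakeCause are hypothetical names made up for illustration, not part of Active Record or the PostgreSQL adapter.

```ruby
CONSTRAINT_NAME_REGEXP = %r{\AERROR:  duplicate key value violates unique constraint "(.*)"$}

# Hypothetical stand-ins for ActiveRecord::RecordNotUnique and its #cause.
FakeCause = Struct.new(:message)
FakeError = Struct.new(:cause)

def violated_constraint_name(exception)
  match = CONSTRAINT_NAME_REGEXP.match(exception.cause.message)
  match && match[1]
end

# The sample message from above; the squiggly heredoc keeps the double spaces.
message = <<~MESSAGE
  ERROR:  duplicate key value violates unique constraint "index_pages_on_slug"
  DETAIL:  Key (slug)=(about-us) already exists.
MESSAGE

error = FakeError.new(FakeCause.new(message))
other = FakeError.new(FakeCause.new("ERROR:  something unrelated"))

violated_constraint_name(error) # => "index_pages_on_slug"
violated_constraint_name(other) # => nil
```

In Ruby, $ matches at the end of a line, which is why the regexp stops at the first line and ignores DETAIL.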

We can now use it to implement a predicate for detecting slug index violations:

UNIQUE_INDEX_ON_SLUG = 'index_pages_on_slug'

# The method should be called from a rescue clause for ActiveRecord::RecordNotUnique.
# That's why we don't check the class of the exception.
def slug_uniqueness_violation?(exception)
  violated_constraint_name(exception) == UNIQUE_INDEX_ON_SLUG
end

We could use meta-programming to get the index name from the database at runtime but such extra complexity doesn’t seem to be worth it in this case.

Armed with these methods, we can proceed to actually generating slugs.

Generating Slugs with Enumerators

We can elegantly address all the requirements at once by using enumerators. We need to find out the original slug and turn it into an infinite sequence of copy slugs. The sequence needs to be based on the original slug in order to address requirement 3.

Let’s start with conversions between original and copy slugs. These methods operate on a single slug but we’ll use them in the enumerator:

def original_to_copy(original_slug, copy_count)
  "#{original_slug}-copy-#{copy_count}"
end

COPY_SLUG_REGEXP = %r{\A(.*)-copy-\d+\z}

def original_slug(slug)
  match = COPY_SLUG_REGEXP.match(slug)
  if match
    match[1]
  else
    slug
  end
end
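
A quick sketch of how the two conversions behave on the slugs from the problem statement. The helpers are repeated so the snippet runs on its own:

```ruby
COPY_SLUG_REGEXP = %r{\A(.*)-copy-\d+\z}

def original_to_copy(original_slug, copy_count)
  "#{original_slug}-copy-#{copy_count}"
end

def original_slug(slug)
  match = COPY_SLUG_REGEXP.match(slug)
  match ? match[1] : slug
end

original_slug("about-us")        # => "about-us"
original_slug("about-us-copy-1") # => "about-us"

# Copying a copy goes through the original slug first.
original_to_copy(original_slug("about-us-copy-1"), 2) # => "about-us-copy-2"

# The greedy (.*) strips only the last suffix.
original_slug("about-us-copy-1-copy-1") # => "about-us-copy-1"
```

Note that the regexp treats any slug ending in -copy- followed by digits as a copy, including a page that was deliberately given such a slug.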

These methods allow us to implement #copy_slugs as:

def copy_slugs(slug)
  slug = original_slug(slug)

  Enumerator.new do |slugs|
    (1..).each do |count|
      slugs << original_to_copy(slug, count)
    end
  end
end

Notice the code uses endless ranges, added in Ruby 2.6. In earlier versions, we’d need to use loop and increment count ourselves. The enumerator is an infinite sequence of slugs of the form #{original_slug}-copy-#{copy_count}.
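
For Rubies older than 2.6, a version based on loop might look like the sketch below. The name copy_slugs_with_loop is made up for this example, and the original_slug step is left out to keep it short:

```ruby
def original_to_copy(original_slug, copy_count)
  "#{original_slug}-copy-#{copy_count}"
end

# Same infinite sequence, but with a manually incremented counter.
def copy_slugs_with_loop(slug)
  Enumerator.new do |slugs|
    count = 1
    loop do
      slugs << original_to_copy(slug, count)
      count += 1
    end
  end
end

# The enumerator only produces as many slugs as we ask for.
copy_slugs_with_loop("about-us").first(3)
# => ["about-us-copy-1", "about-us-copy-2", "about-us-copy-3"]
```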

We’re now ready to implement the copy operation.

Tying it All Together

The last step is using our newly created methods when copying a page. To create a copy, we duplicate the model, take the first slug from the sequence and save. If it succeeds then we’re done. If it violates the slug uniqueness constraint then we retry with the next slug from the sequence.

def copy
  copied_page = dup

  copy_slugs(slug).each do |copy_slug|
    copied_page.update!(slug: copy_slug)
    # Stop at the first slug that saves successfully.
    return copied_page
  rescue ActiveRecord::RecordNotUnique => e
    if slug_uniqueness_violation?(e)
      next
    else
      raise
    end
  end
end
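
The retry loop can be exercised without Rails by swapping the database for an in-memory set. This is a sketch under stated assumptions: InMemoryPages and DuplicateSlug are hypothetical stand-ins for the pages table with its unique index and for ActiveRecord::RecordNotUnique, and the loop returns as soon as an insert succeeds.

```ruby
require "set"

# Hypothetical stand-in for ActiveRecord::RecordNotUnique.
class DuplicateSlug < StandardError; end

# Hypothetical store that mimics a unique index on slug.
class InMemoryPages
  def initialize(slugs)
    @slugs = Set.new(slugs)
  end

  def insert!(slug)
    # Set#add? returns nil when the element is already present.
    raise DuplicateSlug, slug unless @slugs.add?(slug)
    slug
  end
end

def copy_slugs(slug)
  original = slug.sub(/-copy-\d+\z/, "")
  Enumerator.new do |slugs|
    (1..).each { |count| slugs << "#{original}-copy-#{count}" }
  end
end

def copy(pages, slug)
  copy_slugs(slug).each do |candidate|
    # Return on the first successful insert; otherwise try the next slug.
    return pages.insert!(candidate)
  rescue DuplicateSlug
    next
  end
end

pages = InMemoryPages.new(["about-us", "about-us-copy-1"])
copy(pages, "about-us")        # => "about-us-copy-2"
copy(pages, "about-us-copy-1") # => "about-us-copy-3"
```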

One downside of this approach is that we always start with -copy-1, even if it already exists, so each existing copy costs one failed insert. We could find the highest-numbered copy in the database and start from there, but this would complicate the code. Assuming the copy feature is seldom used and a page has at most a few copies at a time, our implementation strikes a good balance between performance and clarity.
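
For comparison, here is a rough sketch of the counting step that the highest-numbered alternative would need. It assumes the existing slugs have already been loaded into an array. Both next_copy_count and that array are hypothetical, and the database query itself is out of scope here.

```ruby
# Returns the counter to start from, given slugs that already exist.
def next_copy_count(original_slug, existing_slugs)
  pattern = /\A#{Regexp.escape(original_slug)}-copy-(\d+)\z/
  # String#[] with a regexp and a group index returns the capture or nil.
  counts = existing_slugs.map { |slug| slug[pattern, 1] }.compact.map(&:to_i)
  (counts.max || 0) + 1
end

existing = ["about-us", "about-us-copy-1", "about-us-copy-4", "contact-copy-9"]
next_copy_count("about-us", existing) # => 5
next_copy_count("pricing", existing)  # => 1
```

Even with this in place, two concurrent copies could compute the same counter, so the unique index and the retry would still be needed.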

We’re almost done! The last mandatory step is extracting the methods and constants we added into a separate class to avoid polluting Page. I’ll leave it as an exercise for the reader. We may also limit the number of generated slugs in order to avoid an infinite loop in production that can easily exhaust our pool of workers. To do that, we should replace copy_slugs(slug) with copy_slugs(slug).take(MAX_COPY_SLUGS) and decide what happens when every candidate is taken, for example by raising an error.
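
A minimal sketch of the limit, assuming a cap of three candidates and a custom error once they run out. MAX_COPY_SLUGS is set to an arbitrary value, and CopyLimitReached and first_free_slug are hypothetical names. To keep the example free of Rails, a plain array of taken slugs replaces the database and the rescue clause.

```ruby
MAX_COPY_SLUGS = 3 # arbitrary value chosen for the example

# Hypothetical error raised when all candidates are taken.
class CopyLimitReached < StandardError; end

def copy_slugs(slug)
  original = slug.sub(/-copy-\d+\z/, "")
  Enumerator.new do |slugs|
    (1..).each { |count| slugs << "#{original}-copy-#{count}" }
  end
end

def first_free_slug(slug, taken)
  # take stops the infinite enumerator after MAX_COPY_SLUGS elements.
  copy_slugs(slug).take(MAX_COPY_SLUGS).each do |candidate|
    return candidate unless taken.include?(candidate)
  end
  raise CopyLimitReached, "no free slug among the first #{MAX_COPY_SLUGS} candidates"
end

first_free_slug("about-us", ["about-us-copy-1"]) # => "about-us-copy-2"
```

When all three candidates are taken, the method raises CopyLimitReached instead of looping forever.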

Closing Thoughts

Generating names of copies isn’t necessarily a difficult problem but there are edge cases that can result in a convoluted implementation. Using database constraints and enumerators results in an elegant solution.