How to Reduce Memory Usage by Tuning Gemfile
Rails is known for many things, but memory efficiency is not one of them. By default, it loads all gems, used and unused, which contributes to the overall memory footprint. Fortunately, we can easily eliminate this waste without touching the app.
image_magick is a perfect example of this problem. If we manipulate images only in workers, then there’s no reason to require it on web servers. Conversely, there’s no need to require web-related gems on workers.
Let’s start by understanding how Rails apps manage their dependencies.
Bundler in Action
Rails manages dependencies with Bundler. Its responsibilities are:
- Resolving gem specifications to concrete gem versions.
- Installing the gems.
- Requiring the gems during the boot process.
We’re all familiar with steps 1 and 2 but not necessarily 3. We’ll focus on the last step as that’s where extraneous gems get loaded.
The idea is to split gems into groups like web and worker, keep shared gems in the default group (no explicit group required), and make Rails require the right groups depending on where it runs. A complex app may need more groups, especially if there’s more than one type of worker, but to keep things simple we’ll assume just web and worker.
Let’s do some wishful thinking. We’d like to make the following Gemfile do the trick:
gem 'rails' # Used on both web and worker servers.
gem 'sidekiq' # Same here.
gem 'pundit', group: :web # Web-only gem.
gem 'image_magick', group: :worker # Worker-only gem.
Obviously, Rails is unaware of our specialized groups so it won’t work. In order to figure out the best implementation, we’ll turn our attention to the boot process.
How Rails Applications Boot
At a high level, the Rails boot process looks like this:
- The boot process is initiated by config.ru …
- which requires config/environment.rb …
- which requires config/application.rb …
- which requires config/boot.rb …
- which requires bundler/setup and then …
- config/application.rb calls Bundler.require(*Rails.groups).
We’re interested in the last two steps. Requiring bundler/setup in step 5 adds all gems from Gemfile to $LOAD_PATH without requiring them yet, so their code isn’t loaded into memory at this point.
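The difference between being on $LOAD_PATH and being required can be seen in plain Ruby. The following is a minimal sketch, not what Bundler literally does: fake_image_lib and FakeImageLib are hypothetical stand-ins for a gem, created in a temporary directory.

```ruby
require "tmpdir"

# Hypothetical stand-in for an installed gem: a directory with one library file.
gem_dir = Dir.mktmpdir("fake_gem")
File.write(File.join(gem_dir, "fake_image_lib.rb"), <<~RUBY)
  module FakeImageLib
    VERSION = "1.0"
  end
RUBY

# Comparable to bundler/setup: make the library findable, but don't load it.
$LOAD_PATH.unshift(gem_dir)
LOADED_AFTER_SETUP = Object.const_defined?(:FakeImageLib)

# Comparable to Bundler.require: actually load the code into the process.
require "fake_image_lib"
LOADED_AFTER_REQUIRE = Object.const_defined?(:FakeImageLib)

puts "Loaded after $LOAD_PATH change: #{LOADED_AFTER_SETUP}"
puts "Loaded after require: #{LOADED_AFTER_REQUIRE}"
```

Only the explicit require defines the module, which is why the memory cost is paid at the require step and not when the load path is set up.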
After setting up $LOAD_PATH, dependencies are required in config/application.rb with a single line of code:
Bundler.require(*Rails.groups)
What Bundler.require does is self-evident: it requires gems in the specified groups.
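As a rough mental model (a sketch, not Bundler’s implementation), group-based requiring boils down to filtering gems by group. GEM_GROUPS and gems_to_require are hypothetical names; the mapping mirrors the wishful Gemfile above, with ungrouped gems placed in the default group.

```ruby
# Hypothetical model of a Gemfile: gem name => groups it belongs to.
# Gems declared without a group belong to :default.
GEM_GROUPS = {
  "rails"        => [:default],
  "sidekiq"      => [:default],
  "pundit"       => [:web],
  "image_magick" => [:worker]
}.freeze

# Returns the names of gems that belong to at least one requested group.
def gems_to_require(*groups)
  wanted = groups.map(&:to_sym)
  GEM_GROUPS.select { |_name, gem_groups| (gem_groups & wanted).any? }.keys
end

p gems_to_require(:default, :production, :web)
p gems_to_require(:default, :production, :worker)
```

In this model a web process never sees image_magick and a worker never sees pundit, which is the outcome we’re after.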
Rails.groups is more interesting. It returns an array of groups to load. Normally, they’re production and default. However, the array depends on three factors:
- RAILS_ENV
- RAILS_GROUPS, which is a comma-separated list of extra groups to include
- A hash passed as an argument. It maps group names to environments (as defined by RAILS_ENV) in which these groups should be included. For example, { :frontend => [:web, :legacy_web] } means frontend should be required when RAILS_ENV is either web or legacy_web.
RAILS_GROUPS sounds like exactly what we need!
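To make the three inputs concrete, here is a plain-Ruby approximation of the behavior described above. It is a sketch for illustration, not the Rails source: sketch_groups and its keyword arguments are hypothetical names, and the environment and RAILS_GROUPS values are passed in explicitly instead of being read from ENV.

```ruby
# Approximates how the list of groups is assembled from the three factors.
def sketch_groups(env:, rails_groups: nil, conditional: {})
  groups = [:default, env.to_s]                  # factor 1: RAILS_ENV
  groups.concat(rails_groups.to_s.split(","))   # factor 2: RAILS_GROUPS
  conditional.each do |group, envs|             # factor 3: the hash argument
    groups << group if envs.map(&:to_s).include?(env.to_s)
  end
  groups.uniq
end

p sketch_groups(env: "production")
p sketch_groups(env: "production", rails_groups: "worker")
p sketch_groups(env: "web", conditional: { frontend: [:web, :legacy_web] })
```

Under these assumptions, setting RAILS_GROUPS to worker simply appends worker to the default and environment groups.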
Teaching Rails New Groups
You may think it’s enough to set RAILS_GROUPS in production and call it a day, but we also need to make it work in development, test and CI. Let’s take a look at each of these environments in turn.
Development Environment
The right setting depends on the project, but for maximum convenience developers may set RAILS_GROUPS to web,worker. This keeps the default Rails behavior of loading everything and sidesteps questions about the correct worker configuration. Persisting this setting is a matter of adding it to .rbenv-vars or a similar file.
Let’s discuss two risks before moving on to the next environment:
- Developers may get confused when RAILS_GROUPS is missing or incorrect.
- If we add or remove a group then all developers will need to update their setting or they’ll run into the problem above.
Are these risks serious? It’s up to you to decide. If you’re concerned, then the following snippet (to be used in config/application.rb) may be a good trade-off between safety and complexity:
DEVELOPMENT_RAILS_GROUPS = 'web,worker'

if ENV['RAILS_GROUPS'].blank?
  ENV['RAILS_GROUPS'] = DEVELOPMENT_RAILS_GROUPS
  warn "RAILS_GROUPS is unset; defaulting to #{DEVELOPMENT_RAILS_GROUPS}"
elsif ENV['RAILS_GROUPS'] != DEVELOPMENT_RAILS_GROUPS
  warn "RAILS_GROUPS is set to #{ENV['RAILS_GROUPS']} instead of #{DEVELOPMENT_RAILS_GROUPS}"
end

Bundler.require(*Rails.groups)
In addition to explicitly informing the developer which groups are loaded, it also makes production work when RAILS_GROUPS is missing.
Test Environment
All test files are usually run within one process, which means we need both web and worker to make all dependencies available.
We need to keep in mind the following risk: if we add a gem to the wrong group, then the tests will pass but production will break. For example, if we add image_magick to web instead of worker, then the test suite will pass because it loads both groups. However, production workers are configured to load only the worker group, so image_magick won’t be available there.
We can eliminate this risk in several ways but the most convenient one is detecting it on the continuous integration server. We don’t add new gems frequently enough to push this burden to developers.
Continuous Integration
As discussed in the section above, we need to split test runs across the groups. Specifically, instead of:
RAILS_ENV=test bundle exec rails test
we should be running:
RAILS_ENV=test RAILS_GROUPS=web bundle exec rails test --exclude test/jobs
RAILS_ENV=test RAILS_GROUPS=worker bundle exec rails test test/jobs
In general, each specialized gem group should have a separate test run. This will ensure our code will actually work in production.
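If the number of groups grows, the per-group commands can be generated instead of written by hand. The sketch below assumes a hypothetical mapping from groups to test directories (TEST_PATHS_BY_GROUP); the directory names are illustrative and need to match the project’s actual layout. It lists directories explicitly for every group.

```ruby
# Hypothetical mapping of gem groups to the test directories that exercise them.
TEST_PATHS_BY_GROUP = {
  "web"    => %w[test/controllers test/models],
  "worker" => %w[test/jobs]
}.freeze

# Builds one test command per group so each run loads only that group's gems.
def ci_commands(paths_by_group = TEST_PATHS_BY_GROUP)
  paths_by_group.map do |group, paths|
    "RAILS_ENV=test RAILS_GROUPS=#{group} bundle exec rails test #{paths.join(' ')}"
  end
end

puts ci_commands
```

A CI configuration could then run each printed command as a separate step, so a gem placed in the wrong group fails the run for the group that needs it.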
Production
Last but not least, we need to set RAILS_GROUPS in production or we won’t see any reduction in memory usage. To prevent misconfiguration, we may add a modified version of the snippet from the Development Environment section:
DEVELOPMENT_RAILS_GROUPS = 'web,worker'

if ENV['RAILS_GROUPS'].blank?
  ENV['RAILS_GROUPS'] = DEVELOPMENT_RAILS_GROUPS
  warn "RAILS_GROUPS is unset; defaulting to #{DEVELOPMENT_RAILS_GROUPS}"
elsif !Rails.env.production? && ENV['RAILS_GROUPS'] != DEVELOPMENT_RAILS_GROUPS
  # We don't emit this warning in production as it's expected to see RAILS_GROUPS
  # set to a different value than the one for development.
  warn "RAILS_GROUPS is set to #{ENV['RAILS_GROUPS']} instead of #{DEVELOPMENT_RAILS_GROUPS}"
end

Bundler.require(*Rails.groups)
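Because this check guards the boot process, it may be worth extracting the decision into a function that can be tested without Rails. The following is one possible sketch, not part of the original snippet: resolve_rails_groups and DEFAULT_RAILS_GROUPS are hypothetical names, and, unlike the code above, it compares the groups as sorted lists so that worker,web doesn’t trigger a warning.

```ruby
DEFAULT_RAILS_GROUPS = "web,worker"

# Pure function: takes the current RAILS_GROUPS value and returns
# [value_to_use, warning_or_nil] without touching ENV or printing.
def resolve_rails_groups(value, production:)
  if value.nil? || value.strip.empty?
    return [DEFAULT_RAILS_GROUPS,
            "RAILS_GROUPS is unset; defaulting to #{DEFAULT_RAILS_GROUPS}"]
  end

  # Compare as sorted lists so "worker,web" is treated like "web,worker".
  same_groups = value.split(",").map(&:strip).sort == DEFAULT_RAILS_GROUPS.split(",").sort
  return [value, nil] if production || same_groups

  [value, "RAILS_GROUPS is set to #{value} instead of #{DEFAULT_RAILS_GROUPS}"]
end

p resolve_rails_groups(nil, production: true)
p resolve_rails_groups("worker", production: true)
p resolve_rails_groups("worker", production: false)
```

In config/application.rb the returned value would be assigned to ENV['RAILS_GROUPS'] and any warning passed to warn before calling Bundler.require.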
Next Steps
These are all the boot process modifications we need to make. We’re ready to split gems into groups. Obviously, this is project specific but here are a few rules of thumb:
- API clients are frequently used by workers, as it’s an anti-pattern to make third-party API calls during the request-response cycle, so they belong to the worker group.
- Frontend tooling is likely unused on workers and can be safely put in the web group.
- Processing libraries that require lots of CPU are another candidate for the worker group.
Summary
The default Rails dependency management can easily lead to a large memory footprint because it loads all gems even if they are unused. Splitting them into web- and worker-related groups and enhancing the boot process is a simple countermeasure that can be applied to any Rails project.