Scaling Rails: Understanding Puma Workers, Threads, and Database Connection Pooling

Introduction
Hello folks! In this article I’m going to walk through all the calculations needed to properly tune your Rails app in production. The goal is to give you a schematic for working out how many threads you need, both on the application side and the database side, so that no request or background job fails to acquire a database connection when things get heated. Let’s start.
Puma and its workers
First of all, let’s talk about Puma, the default Rails web server, and how it operates in general.
When booting up, Puma reads configuration for the number of workers and the number of threads per worker. But what are both?
If you set the workers setting to 2, for example, Puma forks its operating system process that many times (in our case, 2). This means you will have that many copies of your Rails code running as instances, each ready to serve HTTP requests.
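As a sketch, a minimal config/puma.rb for this example could look like the following (the values are illustrative, not a recommendation; on Rails 5.2+ Active Record reconnects forked workers automatically, so the on_worker_boot hook is mainly needed on older versions):

```ruby
# config/puma.rb -- illustrative values for the 2-worker example
workers 2        # fork 2 OS processes, each a full copy of the app
threads 2, 2     # min and max threads per worker

# Load the app before forking so workers share memory via copy-on-write
preload_app!

on_worker_boot do
  # Each forked worker needs its own database connections
  # (needed explicitly on older Rails versions).
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end
```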
Each Puma worker runs multiple threads, based on the threads configuration. However, due to the Global Interpreter Lock (GIL) in MRI Ruby, only one thread can execute Ruby code at any moment in time. When a thread performs a blocking operation (I/O), the GIL is released and other threads can run safely.
So far that means if we have an instance with 2 workers and 2 threads per worker, we will have the following number of application threads:
Number of application threads = worker count × threads per worker = 2 × 2 = 4
That means for each application thread we need to reserve a potential connection in the database connection pool, because under load every application thread might need to talk to the database at some point.
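The arithmetic above can be sketched in plain Ruby (using the article’s example values):

```ruby
# Total application threads, each of which could demand a DB connection.
puma_workers       = 2
threads_per_worker = 2

total_app_threads = puma_workers * threads_per_worker
puts total_app_threads # => 4
```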
Rails maintains its own database connection pool, with a new pool created for each worker process. Threads within a worker share the same pool. If a Puma worker runs 5 threads, then database.yml must be configured with a connection pool of 5, since each thread could potentially establish a database connection.
Since each worker is spawned via a system fork(), the new worker gets its own set of 5 threads, and thus for each new Rails instance created, database.yml still provides a connection pool of 5.
Rails Database Connection Pool vs. Actual Database Pool
Before moving forward, it’s important to differentiate between these two, as they confuse a lot of people. Each Rails process maintains its own database connection pool. This is simply a pool of database connections: when a thread needs one, it checks a connection out of the pool, does its work, and returns the connection to be reused later on. The pool size is usually set by default via <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>, which matches the max threads in a single Puma worker. Meaning: for every instance of the Ruby code invoked, we will have a DB pool sized to the thread count specified in this environment variable.
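For reference, the default Rails database.yml wires this up roughly like so (trimmed-down excerpt of the generated default; the database name is illustrative):

```yaml
# config/database.yml (excerpt)
default: &default
  adapter: postgresql
  # One connection per potential thread in this process:
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>

production:
  <<: *default
  database: myapp_production
```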
The actual database pool, for example the set of connections PostgreSQL accepts, defines how many concurrent connections the PostgreSQL server can handle across all clients. It is controlled by the max_connections parameter in postgresql.conf, which we can alter accordingly.
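That setting is just a line in the server’s config file; you can also verify the live value with the SQL statement SHOW max_connections;. An excerpt might look like:

```conf
# postgresql.conf (excerpt) -- illustrative value
max_connections = 100
```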
This will come in handy later on so make sure you understand the difference before proceeding.
Sidekiq
Now let’s imagine this scenario.
Postgres:
- max_connections is set to 100

We have a Rails app served by the Puma web server, configured as follows:
- workers is set to 5
- max_threads is set to 20
Knowing this, we have 100 max_connections on the Postgres side, and 5 × 20 = 100 threads potentially connecting to Postgres from the application side. The two counts are identical.
Now a background job processor such as Sidekiq enters the picture. Sidekiq has its own, completely separate concurrency configuration, where concurrency is the number of threads Sidekiq operates with.
Threads in Ruby behave differently depending on the workload, largely due to the Global Interpreter Lock (GIL) in MRI Ruby. In Rails (running under a server like Puma), each thread handles one request at a time. There is no true parallelism for CPU-bound tasks because of the GIL. Threads can, however, make progress concurrently while waiting on I/O-bound tasks (e.g., database queries, external API calls); during such waiting periods, other threads can process requests. A thread finishes one request entirely before moving on to the next.
In Sidekiq the GIL is still there, but the nature of job processing makes it feel more concurrent: because jobs tend to involve much more I/O (database access, external API calls), there is far more context switching between threads than in the web server, since the GIL gets released during I/O.
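Sidekiq’s thread count is set through its own configuration, for example in config/sidekiq.yml (it can also be passed on the command line with -c):

```yaml
# config/sidekiq.yml (excerpt) -- 5 job-processing threads
concurrency: 5
```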
When tuning the Postgres max_connections, we need to make sure we take Sidekiq into consideration.
The equation becomes as follows:
pg_max_connections = (Puma worker count × RAILS_MAX_THREADS) + (Sidekiq concurrency × Sidekiq process count)
Meaning, if we have 1 Sidekiq process with a concurrency of 5, we need to increase max_connections from 100 to 105 so that every thread can potentially grab a connection to the database.
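Putting the formula into a small Ruby sketch (the helper name is my own; the numbers are the article’s scenario):

```ruby
# Estimate the Postgres max_connections a deployment needs so every
# Puma thread and every Sidekiq thread can hold a connection at once.
def required_pg_connections(puma_workers:, rails_max_threads:,
                            sidekiq_processes:, sidekiq_concurrency:)
  (puma_workers * rails_max_threads) +
    (sidekiq_processes * sidekiq_concurrency)
end

puts required_pg_connections(
  puma_workers: 5,
  rails_max_threads: 20,
  sidekiq_processes: 1,
  sidekiq_concurrency: 5
)
# => 105
```

In practice you may also want headroom above this number for migrations, consoles, and monitoring tools that open their own connections.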
Summary
When stress testing and scaling a Rails application, it’s crucial to understand how all components work together seamlessly. A lack of database connections leading to 500 errors is one of the worst experiences a user can face, and addressing this should be a top priority. By gaining deeper insights into these mechanisms, you can ensure your application is resilient, scalable, and user-friendly. I hope this article has helped clarify these concepts, and I look forward to sharing more in the next one!
Also, subscribe to the YouTube channel; I’ll be doing more than Leetcode over there soon 😄 https://www.youtube.com/@techstuffrandom




