How things work#2 - Designing a web server from scratch

Hello guys, I'm back with part 2 of the How things work series. If you haven't already, check out part 1 first.

To start things off, we'll list some functional and non-functional requirements for our server.

Functional Requirements

Server should be able to accept connections
Server should be able to return the appropriate response
For the sake of simplicity, we'll return the process ID of the process that responds to the request. That brings us to another question: how will we design the server?

We'll use a hybrid approach (again, I encourage you to check out the previous blog to understand what I'm about to do): a mix of the Reactor pattern and pre-forking. Nginx uses a similar approach in its web server: a master process starts worker processes, and each worker runs its own event loop. That's what lets it scale to a very large number of concurrent connections.

Non Functional Requirements

The server should be able to handle lots of connections with a fast response.

Creating the bare bones

Let's start off by creating the bare bones of our design. Let's create a file named server.rb, which will hold all of our main server logic.

We'll build the pre-forking part first, before we dive into the reactor part. The workflow is as follows:

Main server process creates a listening socket.
Main server process forks a configurable number of child processes. [here is where pre-forking actually happens]
Each child process accepts connections on the shared socket and handles them independently.
Main server process keeps an eye on the child processes.
The kernel load balances incoming connections on the server socket across all the processes that accept on it. The children inherit the parent's file descriptors, so they all share the same listening socket.
The main server won't accept any connections; the forked processes will.
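
To see the shared-socket idea in isolation, here's a minimal sketch, separate from the server we're about to build. The helper name preforked_pids and its parameters are made up for this example. It binds to port 0 so the OS picks a free port, forks a couple of children that all accept on the same inherited socket, and collects the process IDs that answer. Which child picks up which connection is up to the kernel, so don't expect an even split.

```ruby
require 'socket'

# Hypothetical helper: fork `workers` children that share one listening
# socket, connect `clients` times, and return [child_pids, answering_pids].
def preforked_pids(workers: 2, clients: 4)
  server = TCPServer.new('127.0.0.1', 0) # port 0: the OS picks a free port
  port = server.addr[1]

  child_pids = Array.new(workers) do
    fork do
      loop do
        client = server.accept # every child accepts on the inherited socket
        client.puts Process.pid
        client.close
      end
    end
  end

  # The parent never accepts; it only plays the role of the clients here.
  answers = Array.new(clients) do
    TCPSocket.open('127.0.0.1', port) { |socket| socket.gets.to_i }
  end

  [child_pids, answers]
ensure
  # KILL keeps the demo cleanup simple; the real server forwards INT instead.
  child_pids&.each do |pid|
    Process.kill(:KILL, pid)
    Process.wait(pid)
  end
  server&.close
end

child_pids, answers = preforked_pids
puts "children: #{child_pids.inspect}, answered by: #{answers.inspect}"
```

Every answer comes from one of the children and never from the parent, which is exactly the split of responsibilities described in the list above.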

require 'socket'

class PreForking
  CRLF = "\n" # message delimiter (a bare newline, despite the name)
  def initialize(port = 3000)
    @socket = TCPServer.new(port)
  end
  def respond(message)
    @client.write(message)
    @client.write(CRLF)
  end

  def gets
    @client.gets(CRLF)
  end
⬇️

First we add a class called PreForking whose constructor takes a port parameter, 3000 by default. Every new instance of this class is a separate server. Then come 2 methods. respond takes a message, writes it back to the client, and then writes the delimiter, which is defined as CRLF. gets reads from the client up to the next delimiter. The delimiter tells whoever is reading where a message ends: we read a stream of bytes, so both sides need an agreed-upon value (CRLF) that marks where to stop. For example, "I love pizza\nhello world\n" is read as two messages, "I love pizza" and "hello world". Let's carry on.
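
Here's a small sketch of the delimiter idea on its own. It uses a StringIO instead of a real socket so it runs anywhere; the constant DELIMITER and the helper split_messages are names I made up for this example.

```ruby
require 'stringio'

DELIMITER = "\n"

# Hypothetical helper: read delimiter-terminated messages from any IO-like
# object until the stream ends, returning them without the delimiter.
def split_messages(io)
  messages = []
  while (line = io.gets(DELIMITER)) # gets stops right after each delimiter
    messages << line.chomp(DELIMITER)
  end
  messages
end

stream = StringIO.new("I love pizza\nhello world\n")
p split_messages(stream) # => ["I love pizza", "hello world"]
```

IO#gets takes the separator as its argument, so a socket would behave the same way here as the StringIO does.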

The ⬇️ at the end of each snippet indicates that the class or scope didn't finish yet; I think personally this approach is better than just adding a big chunk of code

  CONCURRENCY = 4
  def run
    child_pids = []

    CONCURRENCY.times do
      child_pids << spawn_child
    end

    trap(:INT) {
      child_pids.each do |cpid|
        begin
          Process.kill(:INT, cpid)
        rescue Errno::ESRCH
        end
      end

      exit
    }

    loop do
      pid = Process.wait
      $stderr.puts "Process #{pid} quit unexpectedly"

      child_pids.delete(pid)
      child_pids << spawn_child
    end
  end

⬇️ The snippet above simply does 2 things:

Spawn the child processes.
Monitor the child processes and respawn any that die.

We defined a constant CONCURRENCY with a value of 4; this means there will be 4 pre-forked processes, so the main server is going to have 4 kids basically.

Then we head to the run method. It spawns the child processes via the spawn_child method (next part) and stores their process IDs in an array. It also forwards any INT (interrupt) signal to the children and then exits, so if you start the server from a terminal and press CTRL + C, which sends an interrupt signal to the main server, it forwards that signal to all of its children before exiting.

Finally, it enters an infinite loop that spends almost all of its time blocked on the first line, because Process.wait is a blocking operation. Process.wait returns only when a child process dies, and it returns the dead child's process ID. That's how we know a child has died. We then print to stderr that the process quit and create a new one to take its place.
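
If the blocking behaviour is hard to picture, here's a minimal sketch that isolates it. The helper first_child_to_exit is hypothetical. It uses Process.wait2, the sibling of Process.wait that also returns the exit status, and it starts one child that stays alive and one that dies right away.

```ruby
# Hypothetical helper: start one long-lived and one short-lived child,
# then report which one Process.wait2 hands back first.
def first_child_to_exit
  slow = fork { sleep }     # stays alive until we kill it below
  fast = fork { exit!(7) }  # dies right away with exit status 7

  # Blocks until any child exits, then returns its pid and status.
  reaped, status = Process.wait2

  { fast: fast, reaped: reaped, exit_code: status.exitstatus }
ensure
  if slow
    Process.kill(:KILL, slow) # demo cleanup only
    Process.wait(slow)
  end
end

result = first_child_to_exit
puts "child #{result[:reaped]} exited with status #{result[:exit_code]}"
```

The pid that comes back is the one of the child that died, and that's the value the monitoring loop uses to update child_pids before respawning.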

  def spawn_child
    fork do
      loop do
        @client = @socket.accept
        respond Process.pid
        loop do
          request = gets

          if request
            respond Process.pid
          else
            @client.close
            break
          end
        end
      end
    end
  end

The final method in the pre-forking puzzle is spawn_child. It forks a process from the main process and goes into an infinite loop. It blocks at @socket.accept waiting for a client to connect; once a client connects, accept returns the client's socket, which we store in @client, and we immediately respond with the process ID of the process handling the connection. After that it goes into another loop tied to the current connection, where it waits for data from this client. When the client closes the connection (gets hits EOF and returns nil), we close our side and break out of this loop to start accepting new connections again.

This is all that's necessary to prefork and start the server. But there are a couple of things that we can do better.

Each child blocks waiting for data from a single connection, and only after that connection ends does it go back to accepting new ones. So with our current design we can have at most 4 concurrent connections, one per process. Here's where the reactor pattern comes in clutch. Using this pattern we'll be able to utilize every pre-forked process to the fullest: each process will watch not only its ongoing connection but also new incoming connections and other accepted connections that are waiting to be written to. Let's see what we can do here.

Each pre-forked server will do the following:

The server monitors the listening socket for incoming connections.
Upon receiving a new connection it adds it to the list of sockets to monitor.
The server now monitors the active connection as well as the listening socket.
Upon being notified that the active connection is readable the server reads a chunk of data from that connection and dispatches the relevant callback.
Upon being notified that the active connection is still readable the server reads another chunk and dispatches the callback again.
The server receives another new connection; it adds that to the list of sockets to monitor.
The server is notified that the first connection is ready for writing, so the response is written out on that connection.

We'll add a new class called Connection; it keeps each connection separate, with its own methods and variables.

class Connection
      CRLF = "\n"
      attr_reader :client

      def initialize(io)
        @client = io
        @request, @response = "", ""
      end

      def on_data(data)
        @request << data

        if @request.end_with?(CRLF)
          # Request is completed.
          respond Process.pid
          @request = ""
        end
      end

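      def respond(message)
        # to_s: message may be an Integer such as Process.pid
        @response << message.to_s + CRLF

        # Write what can be written immediately,
        # the rest will be retried the next time the socket is writable
        on_writable
      end

      def on_writable
        bytes = client.write_nonblock(@response)
        @response.slice!(0, bytes)
      rescue IO::WaitWritable
        # Socket buffer is full; keep the data and retry on the next select
      end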

      def monitor_for_reading?
        true
      end

      def monitor_for_writing?
        !(@response.empty?)
      end
    end

I'll try my best to explain this next part. We'll update our spawn_child method as follows:

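  CHUNK_SIZE = 16 * 1024 # bytes per read; an arbitrary choice

  def spawn_child
    fork do
      @handles = {}
      loop do
        to_read = @handles.values.select(&:monitor_for_reading?).map(&:client)
        to_write = @handles.values.select(&:monitor_for_writing?).map(&:client)
        readables, writables = IO.select(to_read + [@socket], to_write)

        readables.each do |socket|
          if socket == @socket
            # Another worker may have taken the connection first,
            # so accept without blocking and skip if there is nothing left
            io = @socket.accept_nonblock(exception: false)
            next if io == :wait_readable
            @handles[io.fileno] = Connection.new(io)
          else
            connection = @handles[socket.fileno]

            begin
              data = socket.read_nonblock(CHUNK_SIZE)
              connection.on_data(data)
            rescue Errno::EAGAIN
              # Nothing to read after all; try again on the next select
            rescue EOFError
              # Client closed the connection
              @handles.delete(socket.fileno)
              socket.close
            end
          end
        end

        writables.each do |socket|
          next if socket.closed? # removed in the readables pass above
          @handles[socket.fileno].on_writable
        end
      end
    end
  end
end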

What we do in the method above is the following: we have an instance variable @handles that holds all the connections belonging to this forked process. On every pass of the loop we go over them and check which ones we need to monitor for reading and which for writing. Then we call IO.select, Ruby's counterpart to select(2), which monitors sockets for reading and writing. Nowadays, production servers usually rely on interfaces such as poll(2) and epoll(7), because they can handle far more sockets than select.
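
To make the readiness idea concrete, here's a small sketch that pokes at IO.select outside of our server. The helper select_demo is hypothetical, and the timeouts are only there so the demo can never hang.

```ruby
require 'socket'

# Hypothetical helper: show when IO.select reports sockets as ready.
def select_demo
  server = TCPServer.new('127.0.0.1', 0) # port 0: the OS picks a free port

  # Timeout 0 means "just poll". No client yet, so select returns nil.
  idle = IO.select([server], nil, nil, 0)

  # A pending connection makes the listening socket readable.
  client = TCPSocket.new('127.0.0.1', server.addr[1])
  readable, = IO.select([server], nil, nil, 1)
  listener_ready = (readable == [server])

  # Incoming data makes the accepted socket readable.
  accepted = server.accept
  client.write("ping\n")
  readable, = IO.select([server, accepted], nil, nil, 1)
  data_ready = (readable == [accepted])

  # A socket with room in its send buffer is reported as writable.
  _, writable = IO.select(nil, [accepted], nil, 1)
  can_write = (writable == [accepted])

  { idle: idle, listener_ready: listener_ready,
    data_ready: data_ready, can_write: can_write }
ensure
  [client, accepted, server].each { |io| io&.close }
end

p select_demo
```

Notice that a freshly accepted socket is writable almost all the time, which is why our Connection only asks to be monitored for writing while it still has a response buffered.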

Readables

Select blocks until at least one socket is ready for reading or writing. Whenever a socket is ready for reading, we check whether it's the server's main socket or a client socket. A listening socket with a pending connection counts as readable in select(2), which is why we can monitor it alongside the clients. If the socket is the server's, we accept the connection, instantiate a new Connection object for the new client, and store it in the handles hash with the socket's file descriptor number as the key and the connection as the value.

If it's not the server's main socket, then it has to be a client socket that is ready to read from. A normal read blocks when the client isn't sending anything, but here we use read_nonblock, which never blocks: it reads up to CHUNK_SIZE bytes of whatever is available. If the read would block, it raises Errno::EAGAIN, and we just fall through to the next pass of the loop; nothing special happens. Otherwise we invoke on_data with the data we received, which appends it to the request instance variable and checks whether the request is complete or more data is still on its way. If the client closes the connection (EOFError), we delete the entry from the hash as if the request never happened.
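
The three outcomes of read_nonblock are easy to try out with a socket pair. This is just a sketch; read_outcome and read_outcomes are hypothetical helpers. It rescues IO::WaitReadable, a module that Ruby mixes into the Errno::EAGAIN it raises here, so rescuing either one works.

```ruby
require 'socket'

# Hypothetical helper: one non-blocking read, mapped to a simple outcome.
def read_outcome(io, chunk_size = 1024)
  io.read_nonblock(chunk_size)
rescue IO::WaitReadable
  :would_block # nothing to read right now
rescue EOFError
  :closed      # the other side closed the connection
end

# Hypothetical helper: walk through all three cases on a socket pair.
def read_outcomes
  reader, writer = UNIXSocket.pair

  outcomes = [read_outcome(reader)] # nothing sent yet
  writer.write("hi\n")
  outcomes << read_outcome(reader)  # data is waiting
  writer.close
  outcomes << read_outcome(reader)  # peer is gone

  outcomes
ensure
  reader&.close
end

p read_outcomes # => [:would_block, "hi\n", :closed]
```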

Writables

Whenever a socket is ready to be written to, we look up its connection in the handles hash and invoke on_writable, which does a non-blocking write from the response instance variable. Whatever was sent gets sliced off the response; if not everything could be written, the rest stays buffered until the socket is ready again.
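
Here's a sketch of that write, slice, retry cycle on its own, without the reactor around it. The helper drain_buffer and the payload size are made up for the example. The payload is deliberately bigger than a socket buffer, so a single write_nonblock can't send it all.

```ruby
require 'socket'

# Hypothetical helper: push a large payload through a socket pair with
# write_nonblock, slicing off what was sent after each partial write.
# Returns [bytes_received, loop_rounds].
def drain_buffer(payload_size: 1_000_000)
  reader, writer = UNIXSocket.pair
  pending = "x" * payload_size
  received = 0
  rounds = 0

  until pending.empty?
    begin
      written = writer.write_nonblock(pending) # may send only a part
      pending.slice!(0, written)               # keep what's still unsent
    rescue IO::WaitWritable
      # Buffer is full: play the client and read some of it to make room.
      received += reader.readpartial(65_536).bytesize
    end
    rounds += 1
  end

  writer.close
  received += reader.read.bytesize # collect whatever is left until EOF
  [received, rounds]
ensure
  reader&.close
end

received, rounds = drain_buffer
puts "received #{received} bytes in #{rounds} rounds"
```

In the real server the select loop plays the part of the rescue branch: instead of reading ourselves, we wait until the socket shows up in the writables again.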

Finalizing

What we did in this article was combine 2 different web server architectures into a hybrid model, which improves our scalability and lets more requests come through. This idea was mentioned in the book Working with ruby, but I wanted to try coding it to see how everything would fit together. I highly recommend this book; even if you don't know Ruby, you'll benefit a lot. If you made it this far, thanks for reading this blog. I hope you learned something today. Till next time!