Hello everyone! Today I'm going to walk through how to distribute your RSpec tests across several Jenkins agents. If you have lots of tests that need to run continuously in a CI pipeline whenever you push new code, distributing them across several independent nodes is one of the most effective ways to speed things up.
Luckily, since we're using RSpec and most likely Ruby and Rails, we'll make use of the knapsack gem (free version), and we'll leverage Jenkins as our CI platform. Our infrastructure will be hosted entirely on AWS, and we'll use Terraform as our IaC (Infrastructure as Code) tool to deploy and provision all our EC2 instances. Here's a simple step-by-step outline of what we'll be doing:
Provisioning the whole infrastructure using AWS and Terraform
Setting up Jenkins and making sure it runs correctly
Installing Jenkins agents (nodes) and making sure we have a cluster of Jenkins nodes ready to be used.
Adding GitHub credentials to Jenkins
Configuring our RSpec pipeline in Jenkins to do the following:
Pull code from the repository
Stashing the code (stashing is Jenkins's way of passing files between agents) so it's available on all our agents
Running RSpec with Knapsack, distributing the specs as we wish
Summary
This article cost me a lot of time and money, but I'll make sure to briefly explain everything and simplify as much as I can. Let's start with our IaC.
Since this might be a long article, I'll skip some things I've already done many times in my previous articles. I'll make sure to reference everything throughout.
IaC using Terraform & AWS
To get started we'll need to configure our AWS VPC, adding private and public subnets. You can configure this as you wish; I had one public and one private subnet. We'll provision 3 Jenkins nodes, where one of them acts as the master.
The master Jenkins node needs to be on the public subnet; otherwise you won't be able to access the UI and do anything useful.
I'll list the articles where I configured a VPC from scratch; pick whichever you prefer (a minimal sketch of what the later code assumes follows the links):
https://hewi.blog/deploying-a-simple-web-server-using-terraform-aws (VPC part only)
https://hewi.blog/deploying-an-eks-cluster-using-terraform (VPC part only)
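The later snippets reference a few VPC resources by name (my-vpc, public-central-1a, gw). Roughly, and only as a sketch, they look like this; the linked articles cover route tables, the private subnet and the rest:

resource "aws_vpc" "my-vpc" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public-central-1a" {
  vpc_id                  = aws_vpc.my-vpc.id
  cidr_block              = "10.0.2.0/24" # matches the private IPs picked later
  map_public_ip_on_launch = true
}

resource "aws_internet_gateway" "gw" {
  vpc_id = aws_vpc.my-vpc.id
}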
Once you have the VPC set up, we'll need to provision the EC2 instances that will act as our Jenkins nodes. I'm going to provision 3 nodes (1 master and 2 slaves), but feel free to choose as you wish.
To provision an EC2 instance, we'll need the following first:
Network interfaces for our instances (optional)
Security Groups for our instances
Elastic IP (AWS Static IP Address) for the Jenkins master node
An AWS key pair for SSH access
Below is the code for the Jenkins master security group:
resource "aws_security_group" "jenkins-master" {
name = "ec2 sg"
description = "the ec2 sg"
vpc_id = aws_vpc.my-vpc.id
ingress {
description = "ssh"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "8080"
from_port = 8080
to_port = 8080
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
tags = {
Name = "rds-sg-jnk-master"
}
}
We allow ports 22 (SSH) and 8080 (the Jenkins web server) only. For the other nodes, since they don't run a web server, you can omit 8080 and only enable port 22. Port 22 must be enabled on all the nodes; otherwise we won't be able to connect them together later on.
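As a sketch, a security group for the other nodes could look like the one below (the resource name is my own choice). In a real setup you'd probably restrict the SSH ingress to the master's subnet rather than 0.0.0.0/0:

resource "aws_security_group" "jenkins-worker" {
  name        = "jenkins worker sg"
  description = "SSH-only access for Jenkins worker nodes"
  vpc_id      = aws_vpc.my-vpc.id

  ingress {
    description = "ssh"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"] # tighten this to the master's subnet in production
  }

  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
}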
Below are the network interface, Elastic IP and key pair for the master node.
resource "aws_network_interface" "master-jenkins" {
subnet_id = aws_subnet.public-central-1a.id
private_ips = ["10.0.2.164"]
security_groups = [ aws_security_group.jenkins-master.id ]
tags = {
Name = "ec2-ni-master"
}
}
resource "aws_eip" "master-ip" {
domain = "vpc"
network_interface = aws_network_interface.master-jenkins.id
associate_with_private_ip = "10.0.2.164"
depends_on = [aws_internet_gateway.gw]
}
resource "aws_key_pair" "deployer" {
key_name = "ssh-key"
public_key = "ssh-keygen"
}
For the network interface, we attach the public subnet (which you should have created already), choose an IP address from that subnet and attach the security group too.
Then, for the Elastic IP, we give it that IP address and the network interface we just created.
Lastly, we need a key pair for SSHing into our master and slave EC2 instances. To create one, just open up a terminal and execute ssh-keygen. This generates a public-private key pair: the public key will reside on the EC2 instance, and you'll be able to SSH to it using the private key. In the aws_key_pair resource, add your public key.
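For example (the ~/.ssh/jenkins-deployer path is just my choice):

ssh-keygen -t rsa -b 4096 -f ~/.ssh/jenkins-deployer

Instead of pasting the key inline, you can also let Terraform read it from disk:

resource "aws_key_pair" "deployer" {
  key_name   = "ssh-key"
  public_key = file(pathexpand("~/.ssh/jenkins-deployer.pub")) # reads the public half of the pair
}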
Now comes the EC2 part, and there's something that needs mentioning before moving on. Choosing the right instance type for your EC2 instances is crucial. You need to monitor the behavior of your resources, and also monitor what happens while the tests are running. This helps a lot because, for example, if your applications run inside Docker it's a completely different story: the more containers you have, the more memory you'll need. There are a million different ways to set up a testing environment. For this one I'm just using a docker-compose file with some services, but if you ask me whether this is a viable real-life scenario, I'd honestly say no, and here's why:
Docker uses a lot of resources and disk space, which won't be cost-efficient if you're provisioning EC2 instances just for testing. For example, instead of running your test database as a container, you can spin up the cheapest RDS instance and have the test suite connect to it at runtime. In the end it depends on your environment and the nature of your application in general, so this might need some thinking from you before moving on.
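To illustrate the RDS idea, the test database config could be as simple as pointing Rails at the RDS endpoint through environment variables. Everything below is hypothetical, not part of this article's setup:

# config/database.yml
test:
  adapter: postgresql
  host: <%= ENV["TEST_DB_HOST"] %>       # the RDS endpoint
  username: <%= ENV["TEST_DB_USER"] %>
  password: <%= ENV["TEST_DB_PASSWORD"] %>
  database: app_test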
For this article, I tried several EC2 instances and ended up using c7g.xlarge, which has 4 cores and 8 GB of RAM. On-demand it costs around $0.145/hr (~$100/month), but that was just for the sake of getting things running for the article. Whether it's cost-efficient and worth it is a decision you should make yourself.
This is the code for the EC2 instance:
resource "aws_instance" "jenkins-master" {
ami = "ami-0510240bfdd000cbd"
instance_type = "c7g.xlarge"
network_interface {
network_interface_id = aws_network_interface.master-jenkins.id
device_index = 0
}
tags = {
Name = "jenkins master server"
}
root_block_device {
volume_size = 10
}
key_name = aws_key_pair.deployer.key_name
It takes an AMI (Amazon Machine Image), which is the base image for the instance, plus the instance type, the network interface and the key pair we created earlier.
The root_block_device block just bumps the EBS volume (the disk, in simple terms) to 10 GB, because the default was 8 if I remember correctly.
The slave nodes are the same, just with different network interfaces. Same key pair, though.
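For completeness, a slave instance would look roughly like this, assuming you created a second network interface for it in the same way as the master's (the worker-jenkins-1 name is mine):

resource "aws_instance" "jenkins-worker-1" {
  ami           = "ami-0510240bfdd000cbd"
  instance_type = "c7g.xlarge"

  network_interface {
    # hypothetical second interface, e.g. in the private subnet
    network_interface_id = aws_network_interface.worker-jenkins-1.id
    device_index         = 0
  }

  root_block_device {
    volume_size = 10
  }

  key_name = aws_key_pair.deployer.key_name

  tags = {
    Name = "jenkins worker 1"
  }
}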
To get things running quickly, I added a script that executes as soon as the instance boots. Just add the following to the EC2 instance resource block:
user_data = <<-EOF
  #!/bin/bash
  sudo apt update
  sudo apt -y install openjdk-11-jre
  curl -fsSL https://pkg.jenkins.io/debian/jenkins.io-2023.key | sudo tee /usr/share/keyrings/jenkins-keyring.asc > /dev/null
  echo deb [signed-by=/usr/share/keyrings/jenkins-keyring.asc] https://pkg.jenkins.io/debian binary/ | sudo tee /etc/apt/sources.list.d/jenkins.list > /dev/null
  sudo apt update
  sudo apt-get -y install jenkins
  curl -fsSL https://get.docker.com -o get-docker.sh
  sudo sh get-docker.sh
  sudo usermod -aG docker jenkins
  sudo systemctl restart jenkins
EOF
This simply installs Jenkins and Docker, and adds the jenkins user (created automatically when Jenkins is installed) to the docker group. If you're not using Docker, just remove that part from the script.
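Once the instance is up, you can SSH in and sanity-check the install. The initial admin password that Jenkins asks for on first visit lives in a well-known file:

systemctl status jenkins                                  # should report "active (running)"
docker --version                                          # confirms Docker is installed
sudo cat /var/lib/jenkins/secrets/initialAdminPassword    # needed by the first-run setup wizard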
For the slave nodes, I use a different block:
user_data = <<-EOF
  #!/bin/bash
  sudo apt update
  sudo apt -y install openjdk-11-jre
  curl -fsSL https://get.docker.com -o get-docker.sh
  sudo sh get-docker.sh
  sudo useradd -d /var/lib/jenkins jenkins
  sudo usermod -aG docker jenkins
  mkdir -p /var/lib/jenkins/.ssh
  chmod 700 /var/lib/jenkins/.ssh
  # generate the key pair directly; `sudo su jenkins` doesn't behave inside a
  # non-interactive user_data script, so we create the files as root and hand
  # ownership to the jenkins user at the end
  ssh-keygen -t rsa -N "" -f /var/lib/jenkins/.ssh/id_rsa
  cat /var/lib/jenkins/.ssh/id_rsa.pub >> /var/lib/jenkins/.ssh/authorized_keys
  chmod 600 /var/lib/jenkins/.ssh/authorized_keys
  chown -R jenkins:jenkins /var/lib/jenkins
EOF
On the slaves, we need to create the jenkins user with the home directory /var/lib/jenkins. We also need to generate a public-private key pair for this jenkins user and append the public key to a file named authorized_keys. But why do we do this?
Jenkins connects to its nodes over SSH (it's one of several ways), and that's how we'll connect all the nodes to the master. We add the public key to authorized_keys because when a client SSHes into a server, the server checks whether the key the client authenticates with matches one of the public keys in its authorized_keys file. The master Jenkins node will authenticate with the private key created on the slaves (we'll have to copy it over somehow), and the slave checks that a matching public key exists.
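In practice that means printing the key on each slave and copying it to wherever the master (or a Jenkins "SSH Username with private key" credential) will use it. A quick way to verify the connection before touching the Jenkins UI (paths and IP below are placeholders):

# on the slave: print the private key so you can copy it
sudo cat /var/lib/jenkins/.ssh/id_rsa

# on the master: confirm the connection works
ssh -i /path/to/copied/id_rsa jenkins@<slave-private-ip>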
Now all our nodes should have Docker and Java installed, and only the master node runs a Jenkins server. Next, we'll visit our master node's IP address on port 8080 in the browser, which takes us to the Jenkins UI.
Jenkins setup
I won't go through the setup wizard as it's very straightforward. When everything is set up successfully, we need to do the following:
Create our GitHub credential
Add our agents to the master node
Create the Jenkins RSpec pipeline
To create our Jenkins GitHub credential, make sure you have a GitHub fine-grained token created (or you can use SSH if you prefer) and do the following:
Head to Manage Jenkins
Click on Credentials -> Global -> Add Credentials
Pick a kind (I chose username and password in my case). The username is your GitHub username and the password is the fine-grained token you created on GitHub
In the ID field, name it whatever you want, but remember the name because we'll use it later in our pipeline
After that, head over to this amazingly simple tutorial on how to add Jenkins nodes. It's well explained; there's nothing extra I'd be able to add. You'll notice we already did the first couple of steps in the script that runs when the slave EC2 instances start.
Now comes the part where we create the pipeline. In Jenkins, add a new pipeline, name it whatever you wish, and configure your build settings as you like. The most important part is the script we'll use for the pipeline, so scroll to the bottom and paste the following:
Note that the code below doesn't do much cleanup; that entirely depends on how your environment works. In my case, I only delete the checked-out directory. Since I'm using Docker, I could also have deleted the images (or only the images that get rebuilt continuously) and kept the rest.
pipeline {
    agent any
    stages {
        stage('Checkout') {
            steps {
                checkout([
                    $class: 'GitSCM',
                    branches: [[name: 'without-es']],
                    userRemoteConfigs: [[url: '<repo-url>', credentialsId: '<github-creds-ID>']]
                ])
                stash name: 'source', includes: '**/*'
            }
        }
        stage('Build & Test') {
            steps {
                script {
                    parallel knapsack(2) {
                        node {
                            withCleanup {
                                unstash 'source'
                                sh 'cp example.env .env'
                                sh 'echo "CI_NODE_INDEX=$CI_NODE_INDEX" >> .env'
                                sh 'echo "CI_NODE_TOTAL=$CI_NODE_TOTAL" >> .env'
                                sh 'docker compose up --build -d'
                                sh 'docker compose exec web gem install bundler'
                                sh 'docker compose exec web bundle install -j 4'
                                sh 'docker compose exec web bundle exec rails db:migrate'
                                sh 'docker compose exec web bundle exec rake knapsack:rspec'
                                sh 'docker compose down'
                            }
                        }
                    }
                }
            }
        }
    }
}
def withCleanup(Closure cl) {
    deleteDir()
    try {
        cl()
    } finally {
        deleteDir()
    }
}
def knapsack(ci_node_total, cl) {
    def nodes = [:]
    for (int i = 0; i < ci_node_total; i++) {
        def index = i
        nodes["ci_node_${i}"] = {
            withEnv(["CI_NODE_INDEX=$index", "CI_NODE_TOTAL=$ci_node_total"]) {
                cl()
            }
        }
    }
    return nodes
}
This code mainly has 2 stages, and I'll explain each in turn.
The Checkout stage is where we pull code from our GitHub repository, providing the repository URL and the credential we created earlier (use the credential ID you chose). Then we stash the code. A stash in Jenkins is a convenient way to save a bunch of files or directories and reuse them on different nodes within the same pipeline run.
Next we have Build & Test, which could be split into 2 stages, but I got lazy. We use a function called knapsack which takes the number of nodes as an argument and adds two environment variables to each of them. Knapsack uses these variables to divide the tests across the N nodes provided: CI_NODE_INDEX is the index of the current node and CI_NODE_TOTAL is the total number of nodes.
Using the parallel keyword along with node tells Jenkins to execute the commands on N different nodes in parallel.
Then we invoke withCleanup, which wraps our code and deletes the working directory after we finish, as a simple cleanup.
The main code appends the environment variables to our .env file, since my Rails container reads .env and exposes its contents as environment variables inside the container.
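For context, the relevant part of my compose file looks something like this; the service names and images here are just illustrative:

# docker-compose.yml (sketch)
services:
  web:
    build: .
    env_file: .env        # CI_NODE_INDEX / CI_NODE_TOTAL end up inside the container
    depends_on:
      - db
  db:
    image: postgres:15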
Then I install bundler and all the gems, run any needed migrations, and finally execute bundle exec rake knapsack:rspec.
With this, every node assigned will execute a portion of the specs and print a per-node report on the status of its run (failures, etc.). You can also specify which files to execute on which node. Knapsack has a lot of capabilities; I've only shown a very high-level slice of it. If you're interested, head over to the official documentation.
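For reference, the Ruby-side wiring (per the knapsack gem's README) is small; roughly:

# Gemfile
gem 'knapsack', group: :test

# Rakefile
Knapsack.load_tasks if defined?(Knapsack)

# spec/spec_helper.rb
require 'knapsack'
Knapsack::Adapters::RSpecAdapter.bind

You also generate a timing report once (KNAPSACK_GENERATE_REPORT=true bundle exec rspec spec) and commit the resulting knapsack_rspec_report.json so the gem knows how to balance the split.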
Finally, I bring the containers down. But as I mentioned, lots of extra cleanup can be done here (deleting images, etc.).
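For example, if disk space on the agents becomes a problem, a couple of extra (hypothetical) steps at the end of the stage would help:

sh 'docker compose down --volumes --rmi local'   // also drop volumes and locally built images
sh 'docker image prune -f'                       // remove dangling layers left behind by --build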
Summary
Splitting specs can be very useful when you have hundreds or even thousands of them and need to run them before every build. This was a very high-level (just get it working) approach, and I'm sure there are a lot of optimizations that could be made to the code above, but the point was to showcase the power of splitting tests. So if you need a CI pipeline with a GitHub hook that triggers it automatically on every push, for example, you can easily do that with Jenkins and Knapsack.
Hope you enjoyed this one as much as I did, and if you have any questions I'll be happy to help!