The Journey to self-hosted Github Actions

Branden Pleines
9 min read · Sep 15, 2022

Problem Statement

Any enterprise software company today relies on automated pipelines to test, build, and ship new features into production. Some of the most commonly used platforms include Jenkins, CircleCI, and Azure DevOps. In the most primitive state, companies may even be using custom scripts and frameworks to get the job done. At scale, Continuous Integration and Continuous Deployment can be the bane of a developer’s existence. Why did my build break? Is Jenkins acting up again? These commonly raised questions delay critical product features: features that can make or break your key customer relationships.

Angry Jenkins! Jokes aside, Jenkins is still the leading open-source CI/CD platform used by many prominent software companies

Overview

Loved by enterprises and open source projects alike, Github is the ubiquitous source control platform. Github Actions, Github’s relatively new CI/CD platform, has exploded onto the market, quickly emerging as a state-of-the-art platform. Today we’ll cover:

  • Core considerations for migrating workloads to Github Actions
  • Existing self-hosted Github Actions Runner open-source projects
  • Hands-on ways to keep your Github Action Workflows short and clean
  • Sprinkles of Continuous Integration best practices

Source: https://github.com/actions

Github hosted runners

Github hosted runners work wonderfully and are highly reliable. At the time of writing, there is one major architectural consideration (or pitfall).

Hardware specification for Windows and Linux virtual machines:
* 2-core CPU (x86_64)
* 7 GB of RAM
* 14 GB of SSD space

These resource constraints are just fine for many simple CI use cases such as linting. However, for testing more CPU- and memory-intensive applications, these hardware specs don’t make the cut. Github documents a beta feature for larger runner instances, but right now the only option for non-enterprise users is to implement self-hosted runners to fill the gap. On top of this primary limitation, you rely on the software that Github hosted runners have preinstalled. Github does a solid job of preinstalling the usual suspects, but what’s needed can vary greatly between use cases. You’re able to install additional software at build time, but I’ll touch on why this can be considered a CI/CD anti-pattern a bit later.

I want to be clear: Github hosted runners are the easiest way to get started with Github Actions. However, once your CI matures, you may be looking for more efficient and/or stable ways to get product features out the door.

Self-hosted Github Actions Runners

While you are free to roll your own infrastructure deriving from the upstream Github Actions runner project, I have personally had tremendous success with two open source projects that wrap this upstream project in elegant designs reflecting modern infrastructure: 1) actions-runner-controller and 2) philips-labs/terraform-aws-github-runner. I’ll briefly discuss each, followed by the possible need to have both. These also happen to be Github’s recommendations for autoscaling self-hosted runner infrastructure.

Containerize your builds!

The actions-runner-controller project is ideal for companies heavily invested in Kubernetes. Kubernetes has been the clear winner in the container orchestration space, and this project excels in scaling build infrastructure to meet the elastic demands of your CI/CD by leveraging the core autoscaling capabilities of Kubernetes. You should always be asking: can my build run in a container? Doing so has a plethora of advantages, mostly stemming from the core principles of immutable infrastructure. Have you ever seen the following in a build pipeline?

sudo DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends please dont ever do this and if you do keep reading

Yuck! Not only are you lengthening your build times by up to several minutes, but you also invite build instability whenever apt repository mirrors are down. Choosing the latest of everything doesn’t usually matter until it does. And when it does, you’ll be spending a lot of time figuring out why. Building in containers, whether facilitated by the actions-runner-controller project or any other CI environment, traps all of these unhappy paths in immutable infrastructure, so your workflows build the same way, every time.
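As a rough illustration of the alternative, here is a minimal sketch of a job that runs inside a prebuilt, version-pinned image instead of installing packages at build time. The image name, tag, and make target are hypothetical placeholders, and the sketch assumes your runner is able to start job containers (for example, a Linux runner with Docker available):

```yaml
---
name: containerized-build-sketch
on:
  pull_request:
jobs:
  Build:
    runs-on: [self-hosted]
    # Hypothetical image with every build dependency baked in ahead of time.
    # Pinning an explicit tag (or digest) keeps the toolchain identical run to run.
    container:
      image: ghcr.io/example-org/build-env:1.4.2
    steps:
      - uses: actions/checkout@v3
      # No apt-get here: the tools already exist in the image.
      - run: make build  # hypothetical build command
```

The trade-off is that someone has to build and publish that image, but dependency changes then show up as a reviewed image bump rather than as a surprise in the middle of a build.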

actions-runner-controller

The actions-runner-controller installs custom resource definitions in your Kubernetes cluster that allow it to interact with Github Actions. Kubernetes pods are created and destroyed relative to the number of queued builds. This project offers immense flexibility in how pods and their underlying containers scale to meet the highly elastic demand of your builds. RunnerDeployments, the actions-runner-controller’s abstraction of Kubernetes deployments, can be configured on a per repository or per Github organization basis: the choice largely reflects the needs of your organization.

Source: https://microplatforms.de/17823346/configure-actions-runner-controller-with-proxy-in-private-eks-cluster#/
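For orientation, a RunnerDeployment scoped to a single repository might look roughly like the sketch below. The resource name, repository, and replica count are placeholders, and the field names reflect the project's v1alpha1 API as I understand it, so check the project's documentation for the version you install:

```yaml
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runner-deployment  # placeholder name
spec:
  replicas: 2  # placeholder; an autoscaler resource can manage this instead
  template:
    spec:
      # Per-repository scope. To register runners for a whole organization,
      # replace this line with `organization: example-org`.
      repository: example-org/example-repo  # placeholder repository
```

Applying a manifest like this with kubectl asks the controller to keep that many runner pods registered for the repository; scaling on queued builds is configured separately through the project's autoscaler resource.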

But what if your organization relies heavily on docker-compose for builds? Running docker or docker-compose directly in a Kubernetes pod likely won’t work, so you’ll need a Linux virtual machine on which to spin up the various containers. Enter philips-labs/terraform-aws-github-runner.

philips-labs/terraform-aws-github-runner

This project assumes knowledge of Terraform and use of AWS, but it handles the provisioning of the entire infrastructure stack captured below with the usual model of terraform apply. A Github App secures the API Gateway and associated Webhook. It also makes provisioned runner instances available to Github Actions. Note that self-hosted runners in public repositories are not recommended.

Source: https://github.com/philips-labs/terraform-aws-github-runner

A final note on both of these projects. At the time of implementing, neither project had ephemerality defined as the default, but both projects support ephemeral configuration. Ephemeral runners allow dynamically provisioned infrastructure to execute build steps and gracefully terminate: fresh build environments every time. Except in rare cases, ephemeral build resources are the move. Doing so achieves a lot of things, but primarily it prevents lingering files, containers, or services left running by workflow instance #1 from breaking a build in workflow instance #2. You’ve been warned ;). I haven’t met anyone who likes paying for idle build servers either.

Self-hosted Github Actions labels

We’ve illustrated multiple projects which enable self-hosted Github Actions runners based on the needs of your CI/CD. How do you select them in workflows? Pop quiz: say you implement both projects described above for a single repository. What happens in the workflow below?

---
name: demo-github-actions-workflow
on:
  pull_request:
jobs:
  Demo-Job:
    runs-on: [self-hosted]
    steps:
      - uses: actions/checkout@v3

Answer: whichever runner registers with Github first!

Labels attached to self-hosted runners allow the correct self-hosted runner to get selected based on a workflow definition. Both projects allow for additional labels to be added. Let’s say we add the actions-runner-controller label and terraform-aws-github-runner label to each project respectively. Now we can specify these in the runs-on declaration to ensure there aren’t any surprises:

kubernetes:

...runs-on: [self-hosted, actions-runner-controller]
...

ec2:

...runs-on: [self-hosted, terraform-aws-github-runner]
...

Writing Better Github Actions

Creating composite actions

Because all pipeline-as-code CI/CD frameworks are software projects, they should be treated as such. Code deduplication and readability become important considerations for larger, more complex workflows. Consider the following (fairly contrived) Github Actions workflow:

---
name: ninja-github-actions-workflow
on:
  pull_request:
jobs:
  Ninja:
    runs-on: [self-hosted]
    steps:
      - uses: actions/checkout@v3
      - run: |
          ninja main | ts '[%m-%d %H:%M:%S]' ninja.log
          ccache -s
      - run: |
          ninja projecta | ts '[%m-%d %H:M:%S]' ninja.log
          ccache -s
      - run: |
          ninja projectb | ts '[%m-%d %H:%M:%S]' ninja.log
          ccache -s

By writing a composite action that lives at .github/actions/ninja/action.yaml, your workflow can instead look like the following:

---
name: ninja-github-actions-workflow
on:
  pull_request:
jobs:
  Ninja:
    runs-on: [self-hosted]
    steps:
      - uses: actions/checkout@v3
      - uses: ./.github/actions/ninja
      - uses: ./.github/actions/ninja
        with:
          target: projecta
      - uses: ./.github/actions/ninja
        with:
          target: projectb

Corresponding composite action .github/actions/ninja/action.yaml:

---
name: ninja
description: ninja
inputs:
  target:
    description: target
    default: main
runs:
  using: composite
  steps:
    - run: |
        ninja ${{ inputs.target }} | ts '[%m-%d %H:%M:%S]' ninja.log
        ccache -s
      shell: bash

In this short example, not a ton is obviously gained. Now imagine several workflow steps splattered with hundreds of lines of bash. On any project with 20+ contributors, I can guarantee the workflow file will change constantly. By isolating any multiline bash as a composite action that lives within your repository, workflow file changes reflect structural changes and not minor code tweaks. I also intentionally added a bug to our original ninja-github-actions-workflow. Did you catch it? One core idea behind abstracting repetitive code into functions is to reduce human errors stemming from the dreaded copy pasta. Implementing composite actions embraces the same ideology.

Supporting automated and workflow dispatch triggers simultaneously

In most cases, automated triggers are used for Continuous Integration. Github Actions also provides a robust mechanism to trigger workflows manually called workflow dispatch. This opens up a lot of doors to use Github Actions as a mechanism for Continuous Deployment. Consider the following workflow:

---
name: terraform-github-actions-workflow
on:
  push:
    branches: main
  workflow_dispatch:
    inputs:
      terraform_action:
        description: which terraform action to apply
        default: plan
jobs:
  Terraform:
    name: terraform ${{ inputs.terraform_action }}
    runs-on: [self-hosted]
    steps:
      - uses: actions/checkout@v3
      - run: terraform ${{ inputs.terraform_action }}

We’ve now enabled this workflow to be run manually outside of commits to main, but there is one catch. In the case of a push to main, this workflow will actually error because ${{ inputs.terraform_action }} is an empty string. When combining automated and manual triggers, you can use || to supply a default to a workflow step in case it doesn’t receive the value from Workflow Dispatch. An empty string is falsy in workflow expressions, so the right-hand side is used instead:

---
name: terraform-github-actions-workflow
on:
  push:
    branches: main
  workflow_dispatch:
    inputs:
      terraform_action:
        description: which terraform action to apply
        default: plan
jobs:
  Terraform:
    name: terraform ${{ inputs.terraform_action || 'apply' }}
    runs-on: [self-hosted]
    steps:
      - uses: actions/checkout@v3
      - run: terraform ${{ inputs.terraform_action || 'apply' }}

It’s important to note there are other ways to achieve this such as using reusable workflows, but sometimes having everything in one workflow file is best.
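For comparison, the reusable-workflow route could be sketched as follows. Both file names and the caller's job name are hypothetical, and this is only one way to split things up: the Terraform steps move into a workflow triggered by workflow_call, and each trigger gets a thin caller that passes the action explicitly.

```yaml
# .github/workflows/terraform-reusable.yaml (hypothetical file name)
---
name: terraform-reusable
on:
  workflow_call:
    inputs:
      terraform_action:
        description: which terraform action to apply
        type: string  # workflow_call inputs must declare a type
        default: apply
jobs:
  Terraform:
    runs-on: [self-hosted]
    steps:
      - uses: actions/checkout@v3
      - run: terraform ${{ inputs.terraform_action }}

# .github/workflows/terraform-on-push.yaml (hypothetical file name)
---
name: terraform-on-push
on:
  push:
    branches: main
jobs:
  Apply:
    # Call the reusable workflow from the same repository.
    uses: ./.github/workflows/terraform-reusable.yaml
    with:
      terraform_action: apply
```

This removes the need for the || fallback because every caller states its action, at the cost of spreading the logic across more files.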

Workflow dispatch has a baked-in ability to run according to the workflow definition on a particular branch. You can do this through the Github UI -> Actions tab or by using the Github CLI:

gh workflow run terraform-github-actions-workflow

gh workflow run terraform-github-actions-workflow --ref sweet_new_feature

The “collector” Job problem

At the moment, Github does not support using an entire workflow as a CI Gate: individual Jobs must be marked as Required in branch protection rules. Interestingly, the Github -> CircleCI integration does allow this, but I digress. In a complex repository with several Required CI gates, this becomes a maintenance issue. Every time someone adds a new Required Job in the branch protection rules, everyone must rebase their active Pull Requests so the new Required Job runs. This is not as big of a deal if your CI takes less than a minute, but the pain is felt for workflows that take 20+ minutes.

As such, it can be tempting to implement what’s called a collector Job which pulls in the statuses of all required CI jobs, now defined in code rather than branch protection rules.
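A minimal sketch of such a collector Job is shown below. The workflow name, the lint and test Jobs, and their make commands are hypothetical stand-ins for your real CI Jobs; only the collector Job would be marked as Required in branch protection rules.

```yaml
---
name: ci-with-collector-sketch
on:
  pull_request:
jobs:
  Lint:
    runs-on: [self-hosted]
    steps:
      - uses: actions/checkout@v3
      - run: make lint  # hypothetical command
  Test:
    runs-on: [self-hosted]
    steps:
      - uses: actions/checkout@v3
      - run: make test  # hypothetical command
  Collector:
    # always() makes this Job run even when a needed Job fails; otherwise it
    # would be skipped, and a skipped Required check does not block merging.
    if: always()
    needs: [Lint, Test]
    runs-on: [self-hosted]
    steps:
      - name: Fail unless every needed Job succeeded
        if: contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') || contains(needs.*.result, 'skipped')
        run: exit 1
```

Adding a new gate then becomes a one-line change to the needs list in code review, with no edits to branch protection rules.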

If you’re listening, Github, we really want the ability to mark an entire Github Actions Workflow as a Required CI Gate in branch protection rules.

Summary

Github Actions is a remarkable platform taking the industry by storm. Moving to Github Actions allows teams to use Github as a one-stop-shop for source control, Continuous Integration, and Continuous Deployment. The flexibility of Github Action triggers allows for a wide range of automated workflows keeping your developers busy and product owners happy.

I hope you learned a few tips or tricks. Thanks for reading!

Source: github.com
