Description
Problem
osquery has had aarch64 support (osquery/osquery#6612) for a bit (huge shoutout to the contributors on that). The big sticking point in declaring it stable is adding it to CI.
Our last CI was Azure Pipelines, and our current CI is GitHub Actions. Unfortunately, neither of these hosts aarch64 runners. But they both distribute runner software for that platform, so you can run your own... (GitHub Actions is a fork of Azure Pipelines, so it's unsurprising they look similar.)
Possible Solutions
A short link dump and discussion of possible solutions.
Self Hosted Runner with an Auto Scaling Group
Envoy uses an AWS autoscaling group to manage workers. These workers have some tooling to run a single job, and then detach themselves. This feels very clean, in that it uses a simple AWS tool to handle availability.
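To make the "run a single job, then detach" idea concrete, here is a minimal sketch of what a worker's lifecycle could look like. It is not Envoy's actual tooling. It assumes the GitHub Actions runner is already unpacked in the working directory, that the AWS CLI is installed with suitable instance permissions, and that the `aarch64` label, the function names, and all argument values are hypothetical placeholders.

```python
import subprocess
from typing import List


def worker_commands(repo_url: str, token: str, instance_id: str,
                    asg_name: str) -> List[List[str]]:
    """Return, in order, the commands a single-use worker would run.

    All argument values are supplied by the caller; nothing here is
    taken from a real deployment.
    """
    return [
        # Register as an ephemeral runner: it accepts one job, then
        # unregisters itself. The "aarch64" label is an assumption.
        ["./config.sh", "--unattended", "--url", repo_url,
         "--token", token, "--labels", "aarch64", "--ephemeral"],
        # Blocks until that one job has finished.
        ["./run.sh"],
        # Leave the group without shrinking it, so the Auto Scaling
        # Group launches a fresh worker as a replacement.
        ["aws", "autoscaling", "detach-instances",
         "--instance-ids", instance_id,
         "--auto-scaling-group-name", asg_name,
         "--no-should-decrement-desired-capacity"],
    ]


def run_worker(repo_url: str, token: str, instance_id: str,
               asg_name: str) -> None:
    """Run each step in order, stopping on the first failure."""
    for cmd in worker_commands(repo_url, token, instance_id, asg_name):
        subprocess.run(cmd, check=True)
```

A real worker would also need to shut itself down after detaching, and to fetch a fresh registration token at boot; both are left out here.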
References:
Self Hosted Runner in Kubernetes (EKS)
We could host runners as pods in a Kubernetes cluster. This is appealing in its simplicity, at least once you accept Kubernetes.
I think this has some potential drawbacks around security. I don't think pods are as isolated as we might like them to be.
There's also a drawback in that we have to bring in Kubernetes. I have some experience there (Kolide runs several clusters), but it would be new to the osquery project.
References:
- https://sanderknape.com/2020/03/self-hosted-github-actions-runner-kubernetes/
- https://github.com/SanderKnape/github-runner
Self Hosted Runner with Lambda Scaling
Philips uses a pile of Terraform to create Lambda functions that manage spinning spot instances up and down as workers. This looks pretty well formed, and has some discussion of security. I think it trades the complexity of the Auto Scaling Group for a Lambda function.
While I think this is a strong contender, I think it will be simpler for us to use auto scaling groups.
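For comparison, the kind of decision a scaling Lambda has to make can be sketched as below. This is an illustration only, not the logic in the Philips module. The function names and the inputs (queued jobs, idle runners, running instances, a cap) are all assumptions about what such a function would be given.

```python
def instances_to_launch(queued_jobs: int, idle_runners: int,
                        running: int, max_runners: int) -> int:
    """How many new spot instances a scale-up function might request.

    Hypothetical inputs: jobs waiting for a runner, runners with no
    job, instances currently running, and a hard cap on instances.
    """
    # Jobs that no idle runner can pick up.
    needed = max(queued_jobs - idle_runners, 0)
    # Room left under the cap.
    headroom = max(max_runners - running, 0)
    return min(needed, headroom)


def instances_to_stop(idle_runners: int, queued_jobs: int,
                      min_idle: int = 0) -> int:
    """How many idle runners a scale-down function might terminate."""
    surplus = idle_runners - queued_jobs - min_idle
    return max(surplus, 0)
```

With an Auto Scaling Group, most of this bookkeeping is handled by the group's desired capacity, which is part of why that option looks simpler.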
References:
Moving CI
There may be some CI vendors that have native support for aarch64. Candidates include Amazon's various offerings and Travis CI.
However, moving CI has a significant complexity cost for us. We are currently primarily invested in GitHub.
That said, if Amazon CodeBuild works well enough, it might be okay to maintain both? Worth at least a little experimenting.