CI Support for aarch64 (AWS graviton2) #78

@directionless

Problem

osquery has had aarch64 support (osquery/osquery#6612) for a bit. (Huge shoutouts to the contributors on that.) The big sticking point in declaring it stable is adding it to CI.

Our last CI was Azure Pipelines, and our current CI is GitHub Actions. Unfortunately, neither of these hosts aarch64 runners. But they both distribute runner software for that platform, so you can run your own. (The GitHub Actions runner is a fork of the Azure Pipelines agent, so it's unsurprising they look similar.)

Possible Solutions

A short link dump, with some discussion, of possible solutions.

Self Hosted Runner with an Auto Scaling Group

Envoy uses an AWS autoscaling group to manage workers. These workers have some tooling to run a single job, and then detach themselves. This feels very clean, in that it uses a simple AWS tool to handle availability.
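To make the "run a single job, then detach" lifecycle concrete, here is a minimal Python sketch. It is not Envoy's tooling. It assumes the stock GitHub Actions runner scripts (`config.sh`, `run.sh`) with the `--ephemeral` flag, and a boto3 Auto Scaling client passed in by the caller. The function names, repository URL, runner directory, and group name are hypothetical.

```python
import subprocess


def config_command(repo_url, token, labels):
    """Build the registration command for a runner that accepts exactly one job."""
    return [
        "./config.sh",
        "--url", repo_url,
        "--token", token,
        "--labels", ",".join(labels),
        "--ephemeral",   # runner is removed from GitHub after one job
        "--unattended",  # never prompt for input
    ]


def run_one_job(repo_url, token, labels, runner_dir, run=subprocess.run):
    """Register the runner, then block until its single job has finished."""
    run(config_command(repo_url, token, labels), cwd=runner_dir, check=True)
    run(["./run.sh"], cwd=runner_dir, check=True)


def detach_self(autoscaling_client, instance_id, group_name):
    """Detach this instance from its group so the group launches a fresh one.

    `autoscaling_client` is assumed to be boto3.client("autoscaling").
    The detached instance still has to shut itself down afterwards (not shown).
    """
    autoscaling_client.detach_instances(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
        ShouldDecrementDesiredCapacity=False,  # keep capacity, get a replacement
    )
```

Passing `run` and the client in as parameters keeps the sketch testable without AWS credentials or a real runner install. Because every job lands on a fresh instance, state from one job should not leak into the next.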

References:

Self Hosted Runner in Kubernetes (EKS)

We could host runners as pods in a Kubernetes cluster. This is appealing in its simplicity, at least once you accept Kubernetes.

I think this has some potential drawbacks around security. I don't think pods are as isolated as we might like them to be.

There's also a drawback in that we have to bring in Kubernetes. I have some experience there (Kolide runs several clusters), but it would be new to the osquery project.

References:

Self Hosted Runner with Lambda Scaling

Philips uses a pile of Terraform to create Lambdas that manage spinning spot instances up and down as workers. This looks pretty well formed, and has some discussion of security. I think it trades the complexity of the Auto Scaling Group for a Lambda function.

While this is a strong contender, I think it will be simpler for us to use Auto Scaling Groups.
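For a sense of what the Lambda side of that trade looks like, here is a rough Python sketch of the scale-up decision. It is not the Philips module's code. It assumes the function receives GitHub's `workflow_job` webhook payload as a JSON string, and that `launch` is a caller-supplied (hypothetical) callable that would request a spot instance. The label set is also an assumption.

```python
import json


def wants_arm64_runner(event, required_labels=("self-hosted", "ARM64")):
    """Return True when a newly queued job asks for our self-hosted ARM64 runners."""
    if event.get("action") != "queued":
        return False  # in_progress / completed events need no new worker
    labels = {label.lower() for label in event.get("workflow_job", {}).get("labels", [])}
    return all(required.lower() in labels for required in required_labels)


def handle_webhook(body, launch):
    """Parse a webhook body and call `launch` once if a worker is needed.

    `launch` is a hypothetical stand-in for whatever requests a spot instance.
    """
    event = json.loads(body)
    if wants_arm64_runner(event):
        launch()
        return "launched"
    return "ignored"
```

A real deployment would also have to verify the webhook signature and scale workers back down, which is part of the complexity being weighed here.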

References:

Moving CI

Some CI vendors may have native support for aarch64, for example Amazon's various offerings and Travis CI.

However, moving CI has a significant complexity cost for us. We are currently primarily invested in GitHub.

That said, if Amazon CodeBuild works well enough, it might be okay to maintain both? Worth at least a little experimenting.

Labels: moving parts (This involved infra, accounts, or services we need to manage)