GitLab CI
GitLab CI and Kubernetes. What's not to like?
Overview
History
The original GitLab CI I used was a hastily concocted GitLab Docker shell executor. I think. (My understanding of what it was doing should be evident by now.)
What I didn't really grasp, nor do I now, is what that shell executor is really doing. It somehow manages to utilise all of the tools installed on the host system it is running on, i.e. it doesn't appear to be (obviously) a container per se.
It also re-extracted and rebuilt the codebase in the same working area. This, in particular, required a make clean to ensure that I wasn't tripping over myself.
The CI Jobs
Clarifying what the CI is doing, and picking up a couple of extends and needs tricks, means that the CI jobs can become (sketched below):
- a "regular" make test dependent on a "regular" make
  This is more or less what I do in code development.
- a not-so-regular make test dependent on a make coverage
  Subsequently, post-make test, we can gather the code coverage stats.
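In .gitlab-ci.yml terms that shape is roughly the following -- a pared-down sketch of the real file shown later, with bare make invocations standing in for the full scripts:

stages:
  - compile
  - test

compile-regular:
  stage: compile
  script:
    - make

compile-coverage:
  extends:
    - compile-regular
  script:
    - make coverage

test-regular:
  stage: test
  needs:
    - compile-regular
  script:
    - make test

test-coverage:
  extends:
    - test-regular
  needs:
    - compile-coverage
  script:
    - make test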
Kubernetes Approach
Kubernetes sees this as four separate tasks and runs them on completely new Pods.
This is the initial reason why we built the CI Build image, as Kubernetes would happily pull an Ubuntu image and then re-install all the necessary components for each job. For all four jobs. For every commit. sigh
The other issue with job independence is that we now have to tell the CI system what artifacts we need to keep from the one job to be deployed in the next. Luckily, the Idio build moved to installing all the built items into a .local hierarchy so we can just ask the CI system to capture that.
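In the .gitlab-ci.yml that amounts to an artifacts stanza on the compile jobs (lifted from the full file below):

  artifacts:
    paths:
      - .local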
GitLab CI
GitLab also needs to be corralled.
To discover all this you don't really want to be hoiking a large code base around, and GitLab allows you to share CI configurations between groups of projects.
Hence we can create a new project in the same group hierarchy as our main codebase in which we can test a simple example through the previous job stages.
GitLab
GitLab wants its GitLab Agent to be running in Kubernetes. That seems reasonable. Slightly confusing the issue is that GitLab now needs the GitLab Agent Server (careful, not the GitLab Agent) running to be able to talk to the GitLab Agent (running in Kubernetes -- please keep up).
Oh, the GitLab Agent Server is still called KAS after its old name (GitLab Kubernetes Agent Server?).
The GitLab Agent is not a GitLab Runner which is what we want for CI. The Agent is all-purpose and we'll want to use the Agent to create Runner(s).
KAS
KAS is easy enough to enable:
# vi /etc/gitlab/gitlab.rb
gitlab_kas['enable'] = true
# gitlab-ctl reconfigure
# gitlab-ctl restart
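A quick sanity check that KAS is actually up and listening -- assuming the Omnibus service is called gitlab-kas, as it is on my install:

# gitlab-ctl status gitlab-kas
# ss -ntlp sport = :8150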
Warning
Some time later my jobs stopped working. After some head-scratching:
# tail /var/log/gitlab/nginx/gitlab_error.log
...
[error] 1044#0: *200984 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.c.d, server: gitlab.example.com, request: "GET /-/kubernetes-agent/ HTTP/1.1", upstream: "http://[::1]:8150/", host: "gitlab.example.com"
and:
# ss -ntlp sport = :8150
State    Recv-Q   Send-Q   Local Address:Port   Peer Address:Port   Process
LISTEN   0        1024     127.0.0.1:8150       0.0.0.0:*           users:(("gitlab-kas",pid=2203852,fd=8))
and:
# grep 8150 /etc/gitlab/gitlab.rb
# gitlab_kas['listen_address'] = 'localhost:8150'
Uh. Two lookups of localhost and two different answers, I guess: nginx's upstream resolved it to ::1 while gitlab-kas bound to 127.0.0.1.
However, this is a distraction. My Pods had restarted and therefore had re-registered as new Runners so you need to go through the Unlock and Enable Projects malarkey, below.
gitlab.rb out of date
When I came to make that edit, I appeared to be missing some gitlab.rb configuration, per the suggestion that I might want to be updating some of the other KAS settings. What other settings?
It turns out that when you upgrade GitLab it never updates gitlab.rb. You can see how out of date you are with:
# gitlab-ctl diff-config
which revealed I was missing lots of configuration elements. I don't (need to) set any of them, I just use the defaults, but maybe I should be setting them in which case I ought to know what they are.
gitlab.rb contains a link to download the latest version and I discovered that sdiff has an interactive yay/nay option (who knew?):
# sdiff -o gitlab.rb.merged gitlab.rb gitlab.rb.latest
Slightly annoyingly, gitlab-ctl diff-config always shows some differences over and above the changes you have made. A feature.
HTTPS
It will shortly transpire that Kubernetes will only talk to an HTTPS-enabled GitLab -- or, more accurately, Kubernetes will refuse any non-HTTPS request.
My GitLab instance is deliberately internal so I never bothered. Now we can bother:
# vi /etc/gitlab/gitlab.rb
external_url 'https://gitlab.example.com'
nginx['redirect_http_to_https'] = true
The code is going to look for /etc/gitlab/ssl/$(uname -n).{crt,key}.
Not only that, the Agent is going to look at SANs and not the CommonName of the certificate. Your browser will look at the CN (and any SANs) but the Agent will only look at the SAN (broken up for readability):
({... failed to send handshake request:
    Get \\\"https://gitlab.example.com/-/kubernetes-agent/\\\":
    x509: certificate relies on legacy Common Name field, use SANs instead\""})
Let's see if we can cobble that together:
# mkdir /etc/gitlab/ssl
# openssl req -newkey rsa:4096 -x509 -sha512 -days 3650 -nodes \
    -out /etc/gitlab/ssl/$(uname -n).crt \
    -keyout /etc/gitlab/ssl/$(uname -n).key -subj "/O=Run Scripts ltd/CN=$(uname -n)/" \
    -addext "subjectAltName = DNS:$(uname -n)"
We'll also need to ensure that GitLab knows its own Certificate is trusted:
# cp /etc/gitlab/ssl/$(uname -n).crt /etc/gitlab/trusted-certs
plus:
# gitlab-ctl reconfigure
# gitlab-ctl restart
of course.
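To check the certificate really does carry the SAN the Agent is going to insist on, and that nginx is happy serving it, something like:

# openssl x509 -in /etc/gitlab/ssl/$(uname -n).crt -noout -text | grep -A1 "Subject Alternative Name"
# curl -I --cacert /etc/gitlab/ssl/$(uname -n).crt https://gitlab.example.com/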
GitLab Agent
The wording is slightly squirrelly: the Agent can only be induced (I can't think of a better word) if there is a .gitlab/agents/$AGENT/config.yaml (that's .yaml, not .yml) in a repository.
So, not necessarily the repository you're targeting but, for the sake of argument, a new repository we're creating to figure out how this whole thing works.
The file doesn't even need to contain anything, it just needs to exist -- and be checked in and pushed to GitLab, obviously.
$AGENT is my Agent's name which has some naming rules but obvious simple stuff works.
The location of the GitLab Project containing this config.yaml might need some consideration. You can share the Agent with other Projects and/or other Groups. Given this freedom to share it's not clear whether the location of the Project (in the Groups hierarchy) has any effect.
In the meanwhile, I've created the new Project, k8s-agent, in the same Group hierarchy as my intended Project.
I've then created an empty .gitlab/agents/idio/config.yaml -- as it's all about my Idio project.
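Having cloned the k8s-agent Project somewhere convenient, that is no more than:

$ mkdir -p .gitlab/agents/idio
$ touch .gitlab/agents/idio/config.yaml
$ git add .gitlab/agents/idio/config.yaml
$ git commit -m "empty Agent config"
$ git push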
Register the Agent
In our new Project:
- follow the menu items Infrastructure > Kubernetes clusters
- hit Actions
- click the Select an agent dropdown and choose Idio
- Register
This will print a token and a command to run, something like docker [args] | kubectl apply -f -
Here we hit a dreadful presumption: that we have the given command, docker, available to us. Hmm, we do, somewhere else entirely, but fortunately the image that is going to be run only exists to generate a YAML file for kubectl. So we can run the docker part anywhere -- just not with the | kubectl bit.
The YAML file is printed to stdout, of course. It's only 60-odd lines so you can cut'n'paste it.
Trusting GitLab
When you try to use the YAML back on Kubernetes:
# kubectl apply -f docker-output.yaml
we'll get a complaint about an unknown Certificate (GitLab's).
You could re-run the docker command with --help to get some clues but eventually I found my way to https://docs.gitlab.com/ee/user/clusters/agent/troubleshooting.html from which we discern that GitLab's Certificate needs to be in a ConfigMap and suitably mapped into the YAML the docker command generated.
On GitLab:
gitlab# cd /etc/gitlab/ssl
gitlab# scp $(uname -n).crt k8s-m1:$(uname -n).pem
where we've renamed the (by default PEM format) .crt into a .pem file. Obviously, you may need to reformat the file if the .crt is not in PEM format. Ultimately, Kubernetes wants a PEM format file and the filenames must line up.
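For instance, if the .crt had turned out to be DER-encoded, the conversion is a one-liner (not needed here, since the openssl req above already emits PEM):

k8s-m1# openssl x509 -inform DER -in gitlab.example.com.crt -out gitlab.example.com.pem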
Then on Kubernetes (assuming you did run the YAML once which will, at least, create the NameSpace we're about to use):
k8s-m1# kubectl -n gitlab-kubernetes-agent create configmap gitlab-ca-pemstore \
    --from-file=gitlab.example.com.pem
at which point dump it back out as YAML so it can be added to the docker YAML:
k8s-m1# kubectl -n gitlab-kubernetes-agent get configmap/gitlab-ca-pemstore -o yaml
and the docker YAML needs something like:
    spec:
      serviceAccountName: gitlab-kubernetes-agent
      containers:
      - name: agent
        image: "registry.gitlab.com/gitlab-org/cluster-integration/gitlab-agent/agentk:stable"
        args:
        - --token-file=/config/token
        - --kas-address
        - wss://gitlab.example.com/-/kubernetes-agent/
        volumeMounts:
        - name: token-volume
          mountPath: /config
        - name: ca-pemstore-volume
          mountPath: /etc/ssl/certs/gitlab.example.com.pem
          subPath: gitlab.example.com.pem
      volumes:
      - name: token-volume
        secret:
          secretName: gitlab-agent-token-XXXX
      - name: ca-pemstore-volume
        configMap:
          name: gitlab-ca-pemstore
          items:
          - key: gitlab.example.com.pem
            path: gitlab.example.com.pem
Configure the Agent
The GitLab Agent should now be running (don't get excited, we're only half-way there) and we can take this opportunity to configure the Agent to be allowed to run for other Projects and/or Groups.
There's some notes in https://gitlab.office.soho/help/user/clusters/agent/repository.md and https://gitlab.office.soho/help/user/clusters/agent/ci_cd_tunnel.md.
For this we can edit config.yaml and add something like:
ci_access:
  groups:
  - id: GitLab-Group-Name
for your GitLab-Group-Name.
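If you wanted to share the Agent with individual Projects rather than a whole Group, the same config takes a projects list of full Project paths instead -- for example (the path here is illustrative):

ci_access:
  projects:
  - id: GitLab-Group-Name/idio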
GitLab Runner
We have a GitLab Agent running in Kubernetes, we now need to have the Agent run some GitLab Runners (in Kubernetes, obviously).
Helm
I have had Helm described to me as the package manager for Kubernetes. That's all I've got.
(Which meant the following was a struggle.)
# def install helm
# helm repo add gitlab https://charts.gitlab.io
# helm repo update
How did I know https://charts.gitlab.io exists and is what we want? Too much Googling.
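At least you can confirm the chart exists before committing to anything:

# helm search repo gitlab/gitlab-runner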
Configuration Two-Step
We'll need the GitLab Runner Registration token next:
- GitLab menu items Settings > CI/CD
- expand Runners
Then create a stage 1 config file:
# cat <<EOF > runner-chart-values.yaml
# The GitLab Server URL (with protocol) that you want to register the runner against
# ref: https://docs.gitlab.com/runner/commands/index.html#gitlab-runner-register
#
gitlabUrl: https://gitlab.example.com/

# The registration token for adding new runners to the GitLab server
# Retrieve this value from your GitLab instance
# For more info: https://docs.gitlab.com/ee/ci/runners/index.html
#
runnerRegistrationToken: "your token here"

# For RBAC support:
rbac:
  create: true

# Run all containers with the privileged flag enabled
# This flag allows the docker:dind image to run if you need to run Docker commands
# Read the docs before turning this on:
# https://docs.gitlab.com/runner/executors/kubernetes.html#using-dockerdind
runners:
  privileged: true
EOF
Before we go on, we need to consider the Kubernetes NameSpace that the GitLab Runners are going to operate in. Here, we'll use the suggested gitlab, though perhaps gitlab-ci would be better.
First, use helm to create a stage 2 YAML file:
# helm template --namespace gitlab gitlab-runner \
    -f runner-chart-values.yaml gitlab/gitlab-runner > runner-manifest.yaml
Second, heed the notice:
Edit the runner-manifest.yaml file to include the namespace for every resource. The output of helm template doesn’t include the namespace in the generated resources
Really?
So, edit runner-manifest.yaml to add to the metadata of each Kind:
namespace: "gitlab"
You might want to change the number of replicas -- becoming the number of Runners, of course.
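The replicas count lives in the Deployment spec in runner-manifest.yaml; for, say, three Runners:

spec:
  replicas: 3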
As it turns out, nothing has actually created the NameSpace anyway, so add another section at the top of the YAML:
apiVersion: v1
kind: Namespace
metadata:
  name: gitlab
---
And we now need to replicate the Certificate mapping we did earlier: the volumeMounts and volumes entries, plus creating the ConfigMap in this NameSpace. A sketch follows.
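Something like the following, re-using the earlier recipe but in the gitlab NameSpace -- the mount point mirrors the one we used for the Agent and is an assumption on my part:

k8s-m1# kubectl -n gitlab create configmap gitlab-ca-pemstore \
    --from-file=gitlab.example.com.pem

and, in the Runner Deployment within runner-manifest.yaml:

        volumeMounts:
        - name: ca-pemstore-volume
          mountPath: /etc/ssl/certs/gitlab.example.com.pem
          subPath: gitlab.example.com.pem
      volumes:
      - name: ca-pemstore-volume
        configMap:
          name: gitlab-ca-pemstore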
Finally, we ought to be able to use the YAML:
# kubectl apply -f runner-manifest.yaml
Assuming everything has gone to plan we should now have some GitLab Runners running in Kubernetes.
Private Docker Registry
If you created a private docker registry then you'll need to add the reg-cred-secret to this gitlab NameSpace as well.
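Recreating that secret in the gitlab NameSpace is the usual incantation (the registry name matches my private registry; the credentials are placeholders):

# kubectl -n gitlab create secret docker-registry reg-cred-secret \
    --docker-server=docker-registry:5000 \
    --docker-username=... \
    --docker-password=...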
You'll also want to add the imagePullSecrets value. There appear to be a couple of ways of doing this: https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runnerskubernetes-section. I've gone with editing the YAML (again) and changing the config.template.toml section which manages to embed some TOML in some YAML:
config.template.toml: |
  [[runners]]
    [runners.kubernetes]
      namespace = "gitlab"
      image = "ubuntu:20.04"
      image_pull_secrets = [ "reg-cred-secret" ]
Where I'm defaulting the CI image to ubuntu:20.04 -- which I promptly override in the .gitlab-ci.yml -- and setting image_pull_secrets, the TOML equivalent of the YAML's imagePullSecrets.
Unlock and Enable Projects
GitLab appears to have decided that these new Runners should be added to my historic Runners and are attached to my Idio Project. So, not the Project I put the config.yaml in but another Project.
The other thing it appears to do is lock those Runners to the Project it decided to add them to.
So we need to do two things:
- find the Project with the Runners, edit the Runner and uncheck the locked to project box.
  You might also want to add any appropriate tags that suit your existing CI.
- go to the project you want the Runners to be added to and click the enable for this project button.
  If you don't see an enable button then maybe you've not edited the ci_access section.
GitLab CI
Finally, we can get around to actually doing some CI.
Idio's .gitlab-ci.yml now looks like:
image: docker-registry:5000/idio-ci-image:latest

stages:
  - compile
  - test

compile-regular:
  tags:
    - ubuntu-20-shell
  stage: compile
  script:
    - env | sort
    - make
  artifacts:
    paths:
      - .local

compile-coverage:
  extends:
    - compile-regular
  script:
    - make coverage

test-regular:
  tags:
    - ubuntu-20-shell
  stage: test
  needs:
    - compile-regular
  script:
    - .local/bin/idio --version
    - ./utils/forced-tty-session .local/bin/idio test

test-coverage:
  extends:
    - test-regular
  needs:
    # the coverage test wants the coverage build's artifacts, not the regular build's
    - compile-coverage
  script:
    - utils/forced-tty-session .local/bin/idio test
    - gcovr --xml-pretty --exclude-unreachable-branches --print-summary -o coverage.xml --root ${CI_PROJECT_DIR}
  coverage: /^\s*lines:\s*\d+.\d+\%/
  artifacts:
    name: ${CI_JOB_NAME}-${CI_COMMIT_REF_NAME}-${CI_COMMIT_SHA}
    expire_in: 2 days
    reports:
      cobertura: coverage.xml

include:
  - template: Security/SAST.gitlab-ci.yml

sast:
  tags:
    - docker
  stage: test
where we start off by using our idio-ci-image from our private docker registry.
We then have the compile and test jobs for the Idio C components, which come in the two variants, "regular" and "coverage", all of which are tagged to run on an ubuntu-20-shell Runner, plus the GitLab-suggested sast (Static Application Security Testing) job which runs on a docker Runner.
Troubleshooting
As mentioned before with KAS, if the Runners restart -- perhaps the worker node was rebooted -- they will reregister with GitLab with a new ID which means you need to run through the whole Unlock and Enable Projects palaver.
Of course, if you were to kubectl cordon then kubectl drain a worker node then Pods will be restarted and therefore Runners will be reregistered. Maybe It Is the Way.
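For reference, the node maintenance dance is the usual one (the node name is illustrative):

# kubectl cordon k8s-w1
# kubectl drain k8s-w1 --ignore-daemonsets --delete-emptydir-data
# ...reboot, patch, fettle...
# kubectl uncordon k8s-w1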