Behind the scenes of Gitlab

As of late 2023, Gitlab runs in a Docker container on the host fire. Its data backend lives in the ZFS dataset hierarchy phlogiston/hosted/gitlab. 🎃 Move to wind-2023.

LB3D: push backup on JuGit https://jugit.fz-juelich.de/iek-11/CompFlu/lb3d

Some cluster nodes…

…are configured as runner instances for the Gitlab CI pipelines, via:

  • /apps/local/setup/gitlab-runner/rc-boot.sh is distributed to all nodes, regardless of whether they are Gitlab runners or regular Slurm nodes, via ql-common NFS as hn-1:/etc/qlustar/common/rc.boot/20-gitlab-runner. It contains the logic to restrict the 'gitlab-runner.service' systemd unit to a few hosts.
  • /apps/local/setup/gitlab-runner (on hn-1/home zpool) contains the whole backend setup. Manuel has documented it in /apps/local/setup/gitlab-runner/Readme.md (readable online in its Gitlab repo).
  • When logged into Gitlab with master admin credentials, https://git.iek.fz-juelich.de/admin/runners presents you with instructions and the --registration-token value that you need to copy and paste into the gitlab-runner register command.
  • Pedestrian solution for adding one node at a time to the pipeline cloud:
    1. Slurm preparation: use QluMan to move the node out of the scientific workload partition(s). When in doubt, empty the node beforehand with scontrol update NodeName=sun-36 State=DRAIN Reason="Gitlab Runner".
    2. Make hn-1:/home/gitlab-runner/configs/ writable for nobody
    3. Add node in question to whitelist in hn-1:/etc/qlustar/common/rc.boot/20-gitlab-runner
    4. Copy the registration token from the runner admin page
    5. SSH to node in question, register it, and start/enable the runner service via /etc/qlustar/common/rc.boot/20-gitlab-runner
    6. Bring hn-1:/home/gitlab-runner/configs/ and the new runner config back into a nominal state (permissions/ownership)
    7. Check in the runner admin interface if Gitlab has accepted the new runner
  • Distributed scale-out (across worker hosts) works by default; for per-node SMP parallelisation, we need
    1. the parameter concurrent in the hn-1:/home/gitlab-runner/$RUNNER-config.toml to have a value greater than 1
    2. a sufficiently sized tmpfs on the runners. 1 GB/job appears to be the absolute minimum.
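
The two command lines of the per-node procedure above (draining the node in Slurm, registering the runner) can be sketched as a small dry-run helper that only assembles and prints them for review. The scontrol call and the --registration-token option come from the text. The executor value "shell", the use of the node name as runner description, the instance URL derived from the admin page address, and all function names are assumptions for illustration. The Readme.md mentioned above remains the reference for the actual setup.

```python
import shlex

# Assumption: instance URL derived from the runner admin page address above.
GITLAB_URL = "https://git.iek.fz-juelich.de/"


def drain_command(node: str, reason: str = "Gitlab Runner") -> list[str]:
    """Build the scontrol call from step 1 that drains a node."""
    return ["scontrol", "update", f"NodeName={node}", "State=DRAIN", f"Reason={reason}"]


def register_command(token: str, node: str, executor: str = "shell") -> list[str]:
    """Build a non-interactive 'gitlab-runner register' call (step 5).

    The executor and the description are assumptions, not taken from the text.
    """
    return [
        "gitlab-runner", "register", "--non-interactive",
        "--url", GITLAB_URL,
        "--registration-token", token,
        "--executor", executor,
        "--description", node,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for argv in (drain_command("sun-36"), register_command("<TOKEN>", "sun-36")):
        print(shlex.join(argv))
```

Printing instead of executing keeps the sketch harmless on a production head node; the printed lines can be pasted into a shell once they look right.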

The elusive optimum of the concurrent value: 1 is ridiculously slow (occupies too many nodes). 20 is a good trade-off (with two nodes, <9 minutes per pipeline as of 2023-03). 30 is too much (24 to 70 min with one node as of 2023-05). Keep in mind that the bottleneck is not the number of jobs (MPI processes) per node, but rather the network communication between the runner(s) and the server and/or server-internal overhead. This is also why running pipelines in parallel slows execution down so drastically.
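
As a quick sanity check when changing concurrent, the tmpfs requirement noted above scales linearly with the number of jobs per node. This is a minimal sketch, assuming the 1 GB/job lower bound applies uniformly to every job; the function name is hypothetical.

```python
# Lower bound per job quoted above ("absolute minimum").
MIN_TMPFS_GB_PER_JOB = 1.0


def min_tmpfs_gb(concurrent: int, gb_per_job: float = MIN_TMPFS_GB_PER_JOB) -> float:
    """Return the smallest tmpfs size (in GB) one runner node should offer."""
    if concurrent < 1:
        raise ValueError("concurrent must be at least 1")
    return concurrent * gb_per_job


if __name__ == "__main__":
    for concurrent in (1, 20, 30):
        print(f"concurrent={concurrent}: tmpfs >= {min_tmpfs_gb(concurrent):.0f} GB")
```

With a concurrent value of 20, each runner node would therefore need at least 20 GB of tmpfs.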

Scenario 2023-01-16: Invite several people to several repos (one with read-only, one with read-write access).

  1. Log in to GitLab with a privileged account
  2. Groups → Create group
  3. Navigate to your repo(s) and use “Members → Invite Group” to invite the group with the desired “Max role”
  4. Admin area → Users → New user → Enter user credentials (desired name, e-mail address; optional but recommended: Admin note, to have some context in the long run, and expiration date). Check “External” if desired. Repeat this step for multiple users.
  5. Admin area → Groups → “Add user(s) to the group” with maximum necessary privileges.
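
The same scenario could presumably be scripted against the GitLab REST API instead of clicking through the web interface. The sketch below only prepares the requests and does not send anything. The API base URL is derived from the admin page address above, the mapping of read-only to Reporter (20) and read-write to Developer (30) is an assumption, and the function names are hypothetical. The token must be a personal access token of an admin account.

```python
import json
import urllib.request

# Assumption: REST prefix on the host named in the admin URL above.
API = "https://git.iek.fz-juelich.de/api/v4"
REPORTER, DEVELOPER = 20, 30  # GitLab access levels; mapping to read-only/read-write is assumed


def api_post(token: str, path: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        f"{API}{path}",
        data=json.dumps(payload).encode(),
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
        method="POST",
    )


def create_group(token, name, path):
    """Step 2: create the group."""
    return api_post(token, "/groups", {"name": name, "path": path})


def share_project(token, project_id, group_id, group_access):
    """Step 3: invite the group to a repo with the desired max role."""
    return api_post(token, f"/projects/{project_id}/share",
                    {"group_id": group_id, "group_access": group_access})


def create_user(token, email, username, name, external=False, note=None):
    """Step 4: create a user; reset_password lets the user set a password by e-mail."""
    payload = {"email": email, "username": username, "name": name,
               "reset_password": True, "external": external}
    if note is not None:
        payload["note"] = note  # admin note, for context in the long run
    return api_post(token, "/users", payload)


def add_member(token, group_id, user_id, access_level, expires_at=None):
    """Step 5: add a user to the group, optionally with an expiration date (YYYY-MM-DD)."""
    payload = {"user_id": user_id, "access_level": access_level}
    if expires_at is not None:
        payload["expires_at"] = expires_at
    return api_post(token, f"/groups/{group_id}/members", payload)
```

Each prepared request can be sent with urllib.request.urlopen(request). The numeric IDs of the new group and users come from the JSON responses of the respective creation calls.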

compflu/backstage/gitlab.txt · Last modified: 2024-04-15 19:05 by j.hielscher