Behind the scenes of Gitlab
Gitlab location
As of late 2023, Gitlab runs in a Docker container on the host fire. The data backend is in the ZFS dataset hierarchy phlogiston/hosted/gitlab. 🎃 Move to wind-2023.
LB3D: push backup on JuGit https://jugit.fz-juelich.de/iek-11/CompFlu/lb3d
Runners
…are configured as runner instances for the Gitlab CI pipelines, via:
- /apps/local/setup/gitlab-runner/rc-boot.sh is distributed to all nodes, Gitlab runners and regular Slurm nodes alike, via ql-common NFS as hn-1:/etc/qlustar/common/rc.boot/20-gitlab-runner, but contains the logic to restrict the 'gitlab-runner.service' systemd unit to a few hosts.
- /apps/local/setup/gitlab-runner (on hn-1/home zpool) contains all the backend stuff. Manuel has documented it in /apps/local/setup/gitlab-runner/Readme.md (read online in its Gitlab repo).
- When logged into Gitlab with master admin credentials, https://git.iek.fz-juelich.de/admin/runners presents you with instructions and the --registration-token needed to copy and paste into the gitlab-runner register command.
- Pedestrian's solution to add one node at a time to the pipeline cloud:
  - Slurm preparation: QluMan is your friend to move the node out of the scientific workload partition(s). When in doubt, empty the node beforehand with scontrol update NodeName=sun-36 State=DRAIN Reason="Gitlab Runner".
  - Make hn-1:/home/gitlab-runner/configs/ writable for the user nobody.
  - Add the node in question to the whitelist in hn-1:/etc/qlustar/common/rc.boot/20-gitlab-runner.
  - Copy the registration token from the runner admin page.
  - SSH to the node in question, register it, and start/enable the runner service via /etc/qlustar/common/rc.boot/20-gitlab-runner.
  - Bring hn-1:/home/gitlab-runner/configs/ and the new runner config back into a nominal state (permissions/ownership).
  - Check in the runner admin interface whether Gitlab has accepted the new runner.
- Distributed scale-out (across worker hosts) works by default; for per-node SMP parallelisation, we need
  - the parameter concurrent in hn-1:/home/gitlab-runner/$RUNNER-config.toml to have a value greater than 1
  - a sufficiently sized tmpfs on the runners. 1 GB/job appears to be the absolute minimum.
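To make the two requirements concrete, here is a minimal sketch that reads concurrent from a runner config and derives the smallest tmpfs that still meets the 1 GB/job floor. The function names and the throwaway demo file are made up for illustration. The sketch assumes that concurrent appears as a top-level "concurrent = N" line, as in a stock gitlab-runner config.toml.

```shell
#!/bin/sh
# Sketch only: helper names and the demo file are hypothetical.

# Print the value of the top-level `concurrent = N` line of a config.toml.
concurrent_of() {
    sed -n 's/^[[:space:]]*concurrent[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | head -n 1
}

# Minimum tmpfs size in GB for N parallel jobs.
# $1 = number of jobs, $2 = GB per job (defaults to the 1 GB/job floor).
min_tmpfs_gb() {
    echo $(( $1 * ${2:-1} ))
}

# Demo on a throwaway file, so that nothing on hn-1 is touched.
demo=$(mktemp)
printf 'concurrent = 20\ncheck_interval = 0\n' > "$demo"
n=$(concurrent_of "$demo")
echo "concurrent=$n needs at least $(min_tmpfs_gb "$n") GB of tmpfs"
rm -f "$demo"
```

Since 1 GB/job is only the observed minimum, passing a larger per-job value as the second argument of min_tmpfs_gb is probably the safer choice.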
The elusive optimum of the concurrent value: 1 is ridiculously slow (occupies too many nodes). 20 is a good trade-off (with two nodes, <9 minutes per pipeline as of 2023-03). 30 is too much (24…70 min with one node as of 2023-05). Keep in mind that the bottleneck is not the number of jobs (MPI processes) per node, but rather the network communication between the runner(s) and the server and/or server-internal thingies. This is also why parallel pipelines slow down the execution so ridiculously.
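For reference, the pedestrian's procedure for adding a node (see the list above) can be written down as a dry run that only prints what would be done. This is a sketch, not the actual setup. Using chmod o+w / o-w to open and close the config directory, and driving hn-1 over ssh, are assumptions. How the registration token reaches the boot script is not described here, so that step stays a comment. The node name, the paths and the scontrol call are taken from the list.

```shell
#!/bin/sh
# Dry-run sketch: prints the steps for enrolling one node, executes nothing.
# Assumed (not from the wiki): chmod o+w / o-w as the permission mechanism
# and ssh as the way to reach hn-1 and the node.

CONFIG_DIR=/home/gitlab-runner/configs/                    # lives on hn-1
BOOT_SCRIPT=/etc/qlustar/common/rc.boot/20-gitlab-runner

plan_runner_enrollment() (
    node=$1
    # 1. Drain the node so that no scientific jobs remain on it.
    printf '%s\n' "scontrol update NodeName=$node State=DRAIN Reason=\"Gitlab Runner\""
    # 2. Temporarily open the config directory (assumed mechanism).
    printf '%s\n' "ssh hn-1 chmod o+w $CONFIG_DIR"
    # 3. and 4. Manual steps, printed as reminders only.
    printf '%s\n' "# edit hn-1:$BOOT_SCRIPT and add $node to the whitelist"
    printf '%s\n' "# copy the registration token from the runner admin page"
    # 5. Register and start the runner on the node via the boot script.
    printf '%s\n' "ssh $node $BOOT_SCRIPT"
    # 6. Restore nominal permissions; ownership has to be checked by hand.
    printf '%s\n' "ssh hn-1 chmod o-w $CONFIG_DIR"
    printf '%s\n' "# check ownership of the new runner config in hn-1:$CONFIG_DIR"
)

plan_runner_enrollment sun-36
```

Printing instead of executing keeps the sketch harmless on a production cluster; the output can be reviewed and run line by line.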
Multiple external users into one project
Scenario 2023-01-16: Invite several people to several repos (one with read-only, one with read-write access).
- Log in to GitLab with a privileged account
- Groups → Create group
- Navigate to your repo(s) and use “Members → Invite Group” with the desired “Max role”
- Admin area → Users → New user → Enter user credentials (desired name, e-mail address; optional but recommended: an admin note, to have some context in the long run, and an expiration date). Check “External” if desired. Repeat this step for multiple users.
- Admin area → Groups → “Add user(s) to the group” with maximum necessary privileges.
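For many users, the same clicks can in principle be scripted against the GitLab REST API (v4). The following is a dry-run sketch that only prints the curl calls. It assumes an admin personal access token in $GITLAB_TOKEN. All names, e-mail addresses and numeric IDs are placeholders, and mapping read-only to Reporter (20) and read-write to Developer (30) is one possible choice, not something prescribed above.

```shell
#!/bin/sh
# Dry-run sketch: prints curl calls mirroring the UI steps, sends nothing.
# Placeholders: group/user names, project ID 42, group ID 7, user ID 99.

GITLAB_URL=https://git.iek.fz-juelich.de
API=$GITLAB_URL/api/v4

# Print one authenticated POST request; $1 = endpoint, rest = form fields.
api_post() (
    endpoint=$1; shift
    cmd="curl --request POST --header \"PRIVATE-TOKEN: \$GITLAB_TOKEN\""
    for field in "$@"; do
        cmd="$cmd --data-urlencode \"$field\""
    done
    printf '%s %s\n' "$cmd" "\"$API$endpoint\""
)

# Groups -> Create group
api_post /groups "name=External guests" "path=external-guests"
# Members -> Invite Group: share project 42 with group 7, max role Reporter
api_post /projects/42/share "group_id=7" "group_access=20"
# Admin area -> Users -> New user, flagged as external, with an admin note
api_post /users "name=Jane Doe" "username=jdoe" "email=jane@example.org" \
    "reset_password=true" "external=true" "note=invited 2023-01-16"
# Admin area -> Groups -> add user 99 to group 7 as Developer
api_post /groups/7/members "user_id=99" "access_level=30"
```

The IDs of the new group and user are returned by the respective create calls, so a real script would have to parse those responses rather than hard-code numbers.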