dask cluster manager using wireguard and ansible

leylines

this repo manages a dask cluster, using wireguard to link nodes which may be separated by WAN[^1]. it includes an opinionated mini wireguard manager (server side only; workers use wg-quick) that doubles as an ansible inventory plugin. finally, ansible playbooks run setup and deployment for the dask nodes

how to

(cd leylines-monocypher && pip3 install --user .)
(cd leylines && pip3 install --user .)
mkdir -p ~/.config/leylines
leylines init -n myserver -i 1.2.3.4 -k path/to/ssh-key
leylines add -n worker-0 -k path/to/ssh-key
...
leylines add -n worker-n -k path/to/ssh-key

optionally copy the database to your laptop so you can run ansible locally (there will be some actual API soon but not right now)

start a privileged shell (there is no service for the wireguard stuff yet -- coming soon)

systemd-run -tS --uid $(id -u) --gid $(id -g) -pAmbientCapabilities=CAP_NET_ADMIN

sync wireguard settings

leylines sync

get status

leylines status

get config for a node

leylines get-conf <id>

manually copy that config to /etc/wireguard/leyline-wg.conf on your worker node, then run systemctl enable --now wg-quick@leyline-wg
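for orientation, a worker-side wg-quick config in a star topology usually has one [Interface] section and a single [Peer] pointing at the server. the sketch below renders such a file in python. the render_worker_conf helper, the keys, the 10.0.0.x addresses and port 51820 are all placeholders and assumptions, not taken from leylines, so the real output of leylines get-conf may look different

```python
# hypothetical sketch: NOT leylines' actual template, just the general
# shape of a wg-quick config for a worker whose only peer is the server

def render_worker_conf(private_key, address, server_pubkey, endpoint, allowed_ips):
    """Render a minimal wg-quick config with one peer."""
    lines = [
        "[Interface]",
        f"PrivateKey = {private_key}",
        f"Address = {address}",
        "",
        "[Peer]",
        f"PublicKey = {server_pubkey}",
        f"Endpoint = {endpoint}",
        f"AllowedIPs = {allowed_ips}",
        # keepalive helps a worker behind NAT stay reachable from the server
        "PersistentKeepalive = 25",
    ]
    return "\n".join(lines) + "\n"


conf = render_worker_conf(
    private_key="<worker private key>",
    address="10.0.0.2/24",       # assumed wireguard subnet
    server_pubkey="<server public key>",
    endpoint="1.2.3.4:51820",    # server IP from the init step, assumed port
    allowed_ips="10.0.0.0/24",
)
print(conf)
```

with AllowedIPs set to the whole wireguard subnet, traffic for every other node is routed through the server, which is what makes the layout a star.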

currently the wireguard topology is a star. this doesn't actually work optimally for my config, where some nodes are colocated and should have direct connections to each other and others should go over WAN to reach distant nodes. this will be changed (you may be sensing a pattern with the amount of TODO)
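to make the cost of the star concrete, here is a toy model (not leylines code; the helper names are made up) that counts tunnel hops between nodes. in a star, worker-to-worker traffic always takes two hops through the server, and adding one direct link between colocated workers brings that down to one

```python
# toy model of the topology; node names and helpers are illustrative only

def star_peers(hub, workers):
    """Return {node: set of direct wireguard peers} for a star."""
    peers = {hub: set(workers)}
    for w in workers:
        peers[w] = {hub}
    return peers


def hops(peers, src, dst):
    """Count tunnel hops from src to dst with a breadth-first search."""
    frontier, seen, dist = [src], {src}, 0
    while frontier:
        if dst in frontier:
            return dist
        nxt = []
        for node in frontier:
            for p in peers[node]:
                if p not in seen:
                    seen.add(p)
                    nxt.append(p)
        frontier, dist = nxt, dist + 1
    return None  # unreachable


peers = star_peers("myserver", ["worker-0", "worker-1"])
star_hops = hops(peers, "worker-0", "worker-1")
print(star_hops)  # 2: worker-0 -> myserver -> worker-1

# add a direct link between two colocated workers
peers["worker-0"].add("worker-1")
peers["worker-1"].add("worker-0")
direct_hops = hops(peers, "worker-0", "worker-1")
print(direct_hops)  # 1
```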

run the ansible playbook

cd leylines-ansible
ansible-playbook -i leylines_inv.py playbook-setup.yml

the first run will take a while. it builds python 3.9.5 and installs it, then builds a virtualenv with python dependencies in it, and then installs and starts systemd user services for the scheduler and workers
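the -i leylines_inv.py argument hands ansible a dynamic inventory. assuming it follows ansible's usual inventory script convention (JSON with groups plus a _meta.hostvars map), the output would have roughly the shape built below. the group names, hostnames and IPs are invented for illustration and are not read from the leylines database

```python
import json

# hypothetical data: group names, hostnames and IPs are made up


def build_inventory(server, workers):
    """Build the JSON structure ansible expects from an inventory script."""
    hostvars = {name: {"ansible_host": ip} for name, ip in [server, *workers]}
    return {
        "scheduler": {"hosts": [server[0]]},
        "workers": {"hosts": [name for name, _ in workers]},
        # _meta.hostvars lets ansible skip one --host call per node
        "_meta": {"hostvars": hostvars},
    }


inventory = build_inventory(
    server=("myserver", "10.0.0.1"),
    workers=[("worker-0", "10.0.0.2"), ("worker-1", "10.0.0.3")],
)
print(json.dumps(inventory, indent=2))
```

pointing ansible_host at the wireguard address is one plausible reason to keep the inventory and the wireguard manager in the same tool: the playbook then reaches every node over the tunnel.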

now you can open <your server's wireguard ip>:31336 to view the dask dashboard

use the cluster with

from dask.distributed import Client
client = Client("<your server's wireguard ip>:31337")
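a slightly longer sketch of the same step, assuming the two ports mentioned in this README and a placeholder wireguard IP of 10.0.0.1. the helper names are made up; sum_of_squares only runs if dask is installed and the cluster is reachable, so the import is kept inside the function

```python
SCHEDULER_PORT = 31337  # scheduler port used in the Client snippet
DASHBOARD_PORT = 31336  # dashboard port from the previous step


def scheduler_address(wg_ip):
    """Address to pass to dask.distributed.Client."""
    return f"tcp://{wg_ip}:{SCHEDULER_PORT}"


def dashboard_url(wg_ip):
    """URL of the dask dashboard status page."""
    return f"http://{wg_ip}:{DASHBOARD_PORT}/status"


def sum_of_squares(wg_ip, n=10):
    """Connect to the cluster and compute sum(x*x for x in range(n)) on it."""
    from dask.distributed import Client  # lazy import: requires dask

    with Client(scheduler_address(wg_ip)) as client:
        futures = client.map(lambda x: x * x, range(n))
        total = client.submit(sum, futures)  # futures are resolved on the cluster
        return total.result()


print(scheduler_address("10.0.0.1"))
print(dashboard_url("10.0.0.1"))
```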

time for magic

copy leylines-support/02-dask.py into ~/.ipython/profile_default/startup

this provides 2 new spells: %dask connects to your cluster, and %daskworker splits off a new ipython console on a worker, picking one that has free RAM and is not busy. this is useful for ad-hoc code testing on a real worker
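the exact selection rule lives in 02-dask.py and is not reproduced here. as a rough illustration of "free RAM and not busy", the sketch below prefers workers with the fewest running tasks and breaks ties by free memory. the metrics dict and its field names are hypothetical, not dask's schema

```python
# hypothetical metrics; the field names here are NOT dask's schema
workers = {
    "worker-0": {"memory_used": 6e9, "memory_limit": 8e9, "executing": 0},
    "worker-1": {"memory_used": 1e9, "memory_limit": 8e9, "executing": 3},
    "worker-2": {"memory_used": 2e9, "memory_limit": 8e9, "executing": 0},
}


def pick_worker(workers):
    """Prefer the least busy worker, then the one with the most free RAM."""
    def key(item):
        _, m = item
        free = m["memory_limit"] - m["memory_used"]
        return (m["executing"], -free)

    name, _ = min(workers.items(), key=key)
    return name


chosen = pick_worker(workers)
print(chosen)  # worker-2: idle, and more free RAM than worker-0
```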

%dask also installs three names: client, a reference to the client; tqdmprogress, which can be used in place of distributed.diagnostics.progress for a task monitor using tqdm; and upload, which uploads a file and returns a delayed function that fetches the filename on a worker