dask cluster manager using wireguard and ansible
xenia 29be9a49e8 fix bug in main 2021-06-16 00:28:21 -04:00

leylines

this repo manages a dask cluster whose nodes are linked over wireguard, so they may be separated by WAN[^1]. it includes an opinionated mini wireguard manager (server side only; workers use wg-quick) that doubles as an ansible inventory plugin, plus ansible playbooks that run setup and deployment for the dask nodes

how to

(cd leylines-monocypher && pip3 install --user .)
(cd leylines && pip3 install --user .)
mkdir -p ~/.config/leylines
leylines init -n myserver -i 1.2.3.4 -k path/to/ssh-key
leylines add -n worker-0 -k path/to/ssh-key
...
leylines add -n worker-n -k path/to/ssh-key
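
with many workers, the repeated `leylines add` calls can be generated in a loop. this is a minimal sketch that only uses the `add -n ... -k ...` form shown above; the helper names, the `worker-<i>` naming and the shared ssh key are assumptions for illustration, not part of leylines

```python
import subprocess


def add_commands(count, key_path, prefix="worker"):
    """Build one `leylines add` argv per worker: <prefix>-0 .. <prefix>-(count-1)."""
    return [
        ["leylines", "add", "-n", f"{prefix}-{i}", "-k", key_path]
        for i in range(count)
    ]


def add_workers(count, key_path, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for argv in add_commands(count, key_path):
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first failing `leylines add`
            subprocess.run(argv, check=True)
```

calling `add_workers(4, "path/to/ssh-key")` only prints the four commands; pass `dry_run=False` once the output looks right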

optionally copy the database to your laptop so you can run ansible locally (there will be some actual API soon but not right now)

start a privileged shell (there is no service for the wireguard stuff yet -- coming soon)

systemd-run -tS --uid $(id -u) --gid $(id -g) -pAmbientCapabilities=CAP_NET_ADMIN
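
to check that the shell really got the capability, you can look at the effective capability mask of a process started from it. a small sketch, assuming linux (it reads `/proc/self/status`); the function names are made up for this example, and `CAP_NET_ADMIN` is capability number 12 in the kernel headers

```python
CAP_NET_ADMIN = 12  # capability number from linux/capability.h


def has_capability(cap_eff_hex, cap=CAP_NET_ADMIN):
    """Return True if bit `cap` is set in a hex capability mask."""
    return bool((int(cap_eff_hex, 16) >> cap) & 1)


def effective_caps(status_path="/proc/self/status"):
    """Read the CapEff mask of the current process (Linux only)."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("no CapEff line found")
```

inside the privileged shell, `has_capability(effective_caps())` should be true; in an ordinary unprivileged shell it should be false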

sync wireguard settings

leylines sync

get status

leylines status

get config for a node

leylines get-conf <id>

manually copy that config to /etc/wireguard/leyline-wg.conf on your worker node, then run systemctl enable --now wg-quick@leyline-wg there
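
the exact output of `leylines get-conf` is not reproduced here. to give a rough idea of what kind of file wg-quick reads, the sketch below renders a generic single-peer wg-quick config. the function name, the keepalive value and every key, address and port passed in are placeholders and assumptions, not what leylines actually generates

```python
def render_wg_quick_conf(private_key, address, server_public_key,
                         server_endpoint, allowed_ips):
    """Render a generic wg-quick config with a single peer (the server)."""
    return "\n".join([
        "[Interface]",
        f"PrivateKey = {private_key}",
        f"Address = {address}",
        "",
        "[Peer]",
        f"PublicKey = {server_public_key}",
        f"Endpoint = {server_endpoint}",
        f"AllowedIPs = {allowed_ips}",
        # keepalive helps a worker behind NAT stay reachable (assumed value)
        "PersistentKeepalive = 25",
        "",
    ])
```

a worker in a star topology would have exactly one `[Peer]` section like this, pointing at the server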

currently the wireguard topology is a star. this is not optimal for my setup, where colocated nodes should connect to each other directly and only traffic to distant nodes should go over WAN. this will be changed (you may be sensing a pattern with the amount of TODO)
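
to make the cost of the star concrete, here is a toy model of the peer graph, assuming the server forwards traffic between workers. the node names and both helper functions are hypothetical and have nothing to do with the leylines code itself

```python
from collections import deque


def star_peers(hub, workers):
    """Star topology: the hub peers with every worker, each worker only with the hub."""
    peers = {hub: list(workers)}
    for worker in workers:
        peers[worker] = [hub]
    return peers


def hops(peers, src, dst):
    """Count tunnel hops from src to dst with a breadth-first search."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in peers.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # unreachable
```

in this model two workers are always two hops apart, even when they sit next to each other; adding a direct peer entry between colocated workers brings that down to one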

run the ansible playbook

cd leylines-ansible
ansible-playbook -i leylines_inv.py playbook-setup.yml
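
`-i leylines_inv.py` hands ansible an inventory source that is generated from the leylines database rather than written by hand. how the real script does that is not shown here. as a rough sketch of the general shape, an ansible inventory script answers `--list` with JSON containing groups and a `_meta.hostvars` map; the group name `workers`, the node names and the IPs below are made up

```python
import json


def build_inventory(nodes, group="workers"):
    """Build the structure an ansible inventory script prints for --list."""
    return {
        group: {"hosts": sorted(nodes)},
        "_meta": {
            # per-host variables; ansible_host is the address ansible connects to
            "hostvars": {name: {"ansible_host": ip} for name, ip in nodes.items()}
        },
    }


def inventory_json(nodes):
    """Serialize the inventory the way a script would write it to stdout."""
    return json.dumps(build_inventory(nodes), indent=2)
```

for example, `print(inventory_json({"worker-0": "10.0.0.2"}))` emits a one-host inventory that points ansible at the wireguard address of that node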

the first run will take a while. it builds python 3.9.5 and installs it, then builds a virtualenv with python dependencies in it, and then installs and starts systemd user services for the scheduler and workers

now you can open <your server's wireguard ip>:31336 to view the dask dashboard

use the cluster with

from dask.distributed import Client
client = Client("<your server's wireguard ip>:31337")
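
once connected, work is submitted through the client. a minimal sketch using the standard `Client.map` / `Client.submit` / `Future.result` calls from dask.distributed; `square` and `sum_of_squares` are just example names

```python
def square(x):
    return x * x


def sum_of_squares(client, n):
    """Square 0..n-1 on the workers, then sum the results on the cluster."""
    futures = client.map(square, range(n))  # one task per input
    total = client.submit(sum, futures)     # futures are resolved before sum runs
    return total.result()                   # block until the answer arrives


# usage, with the placeholder address from above:
#   from dask.distributed import Client
#   client = Client("<your server's wireguard ip>:31337")
#   print(sum_of_squares(client, 10))
```

passing the futures straight into `submit` keeps the intermediate results on the cluster, so only the final sum travels back over the wireguard link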