2021-06-15 10:00:44 +00:00

# leylines

this repo manages a [dask](https://dask.org) cluster, using
[wireguard](https://www.wireguard.com/) to link nodes which may be separated by WAN[^1]. it includes
an opinionated mini wireguard manager (server side only; workers use wg-quick) that doubles as an
[ansible](https://www.ansible.com/) inventory plugin. finally, ansible playbooks run setup and
deployment for dask nodes


## how to

```bash
# install both packages from a checkout of this repo
(cd leylines-monocypher && pip3 install --user .)
(cd leylines && pip3 install --user .)
mkdir -p ~/.config/leylines
# register the server, then each worker
leylines init -n myserver -i 1.2.3.4 -k path/to/ssh-key
leylines add -n worker-0 -k path/to/ssh-key
...
leylines add -n worker-n -k path/to/ssh-key
```

optionally, copy the database to your laptop so you can run ansible locally (a real API is planned,
but doesn't exist yet)

start a privileged shell (there is no service for the wireguard stuff yet; coming soon)

```bash
systemd-run -tS --uid "$(id -u)" --gid "$(id -g)" -pAmbientCapabilities=CAP_NET_ADMIN
```

sync wireguard settings

```bash
leylines sync
```

get status

```bash
leylines status
```

get config for a node

```bash
leylines get-conf <id>
```
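
for orientation, here is roughly what a wg-quick config for a worker in a star topology looks like. this is a sketch, not the output of `get-conf`: the section and key names are standard wg-quick ones, but the function name, keys, addresses, port and keepalive value are all made up for illustration

```python
def render_worker_conf(private_key, address, server_pubkey, endpoint, allowed_ips):
    """Render a minimal wg-quick config with a single peer (the server).

    Hypothetical helper; leylines may lay its config out differently.
    """
    lines = [
        "[Interface]",
        f"PrivateKey = {private_key}",
        f"Address = {address}",
        "",
        "[Peer]",
        f"PublicKey = {server_pubkey}",
        f"Endpoint = {endpoint}",
        f"AllowedIPs = {allowed_ips}",
        # keepalive helps workers behind NAT stay reachable (assumed value)
        "PersistentKeepalive = 25",
    ]
    return "\n".join(lines) + "\n"


# placeholder values only; real keys come from the wireguard manager
conf = render_worker_conf(
    private_key="<worker private key>",
    address="10.0.0.2/24",
    server_pubkey="<server public key>",
    endpoint="1.2.3.4:51820",
    allowed_ips="10.0.0.0/24",
)
print(conf)
```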

manually copy the config from `get-conf` to `/etc/wireguard/leyline-wg.conf` on your worker node,
then run `systemctl enable --now wg-quick@leyline-wg`

currently the wireguard topology is a star. this isn't optimal for my setup, where some nodes are
colocated and should connect to each other directly, while others have to go over WAN to reach
distant nodes. this will be changed (you may be sensing a pattern with the amount of TODO)

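
to make the difference concrete, the sketch below counts tunnels for a plain star versus a star plus a full mesh inside each site. this is just one reading of "colocated nodes should have direct connections", not what leylines does or will do; the function names and the site layout are hypothetical

```python
from itertools import combinations


def star_links(server, workers):
    """Star: every worker peers only with the server."""
    return {frozenset((server, w)) for w in workers}


def site_aware_links(server, sites):
    """Star plus a full mesh between the workers of each site.

    `sites` maps a (made-up) site name to the workers colocated there.
    """
    all_workers = [w for members in sites.values() for w in members]
    links = star_links(server, all_workers)
    for members in sites.values():
        links |= {frozenset(pair) for pair in combinations(members, 2)}
    return links


sites = {"rack-a": ["worker-0", "worker-1", "worker-2"], "remote": ["worker-3"]}
star = star_links("myserver", [w for members in sites.values() for w in members])
mesh = site_aware_links("myserver", sites)

# the extra links are exactly the direct tunnels between colocated workers
for link in sorted(sorted(l) for l in mesh - star):
    print(" <-> ".join(link))
```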

run the ansible playbook

```bash
cd leylines-ansible
ansible-playbook -i leylines_inv.py playbook-setup.yml
```
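
if you're wondering what ansible expects from an executable inventory: by convention it runs the script with `--list` and reads JSON from stdout. the sketch below shows only that JSON shape, assuming `leylines_inv.py` follows the convention; the group name `dask_workers`, the node names and the addresses are hypothetical, and the real inventory is not a hard-coded dict

```python
import json

# hypothetical nodes: name -> wireguard address
NODES = {"worker-0": "10.0.0.2", "worker-1": "10.0.0.3"}


def build_inventory(nodes):
    """Build the dict an inventory script would print for `--list`."""
    return {
        # a group with its member hosts (group name is made up)
        "dask_workers": {"hosts": sorted(nodes)},
        # per-host variables; `_meta` saves ansible one `--host` call per node
        "_meta": {
            "hostvars": {name: {"ansible_host": ip} for name, ip in nodes.items()}
        },
    }


inventory = build_inventory(NODES)
print(json.dumps(inventory, indent=2))
```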

the first run will take a while: it builds and installs python 3.9.5, creates a virtualenv with the
python dependencies, and then installs and starts systemd user services for the scheduler and
workers

now you can open `<your server's wireguard ip>:31336` to view the dask dashboard

use the cluster with

```python
from dask.distributed import Client
client = Client("<your server's wireguard ip>:31337")
```
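
a quick way to check that the cluster actually does work is to push a tiny computation through it. this is a minimal sketch: the helper names are made up, the `http://` scheme for the dashboard is an assumption, and the ports are the ones from this README

```python
SCHEDULER_PORT = 31337  # scheduler port from this README
DASHBOARD_PORT = 31336  # dashboard port from this README


def scheduler_address(wg_ip):
    """Address to hand to `Client` (hypothetical helper)."""
    return f"{wg_ip}:{SCHEDULER_PORT}"


def dashboard_url(wg_ip):
    """Dashboard URL, assuming it is served over plain http."""
    return f"http://{wg_ip}:{DASHBOARD_PORT}"


def smoke_test(wg_ip):
    """Square 0..9 on the cluster and sum the results there too."""
    # imported lazily so the helpers above work without dask installed
    from dask.distributed import Client

    with Client(scheduler_address(wg_ip)) as client:
        futures = client.map(lambda x: x * x, range(10))
        return client.submit(sum, futures).result()


# usage, with the server's wireguard ip filled in:
#   smoke_test("<your server's wireguard ip>")
```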

2021-06-16 12:39:03 +00:00

### time for magic

copy `leylines-support/02-dask.py` into `~/.ipython/profile_default/startup`

this provides 2 new spells: `%dask` connects to your cluster, and `%daskworker` splits off a new
ipython console on a worker, chosen for having free RAM and not being busy. this is useful
for ad-hoc code testing on a real worker

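
the selection rule isn't spelled out beyond "free RAM, not busy". one plausible reading is sketched below; the worker-info structure and its keys are made up (this is neither dask's scheduler metadata nor necessarily what `02-dask.py` does)

```python
def pick_worker(workers, max_busy=0):
    """Return the address of the idle worker with the most free memory.

    `workers` maps an address to a dict with hypothetical keys
    `memory_limit`, `memory_used` (bytes) and `executing` (running tasks).
    """
    idle = {a: w for a, w in workers.items() if w["executing"] <= max_busy}
    # if every worker is busy, fall back to considering all of them
    candidates = idle or workers
    return max(
        candidates,
        key=lambda a: candidates[a]["memory_limit"] - candidates[a]["memory_used"],
    )


GiB = 2**30
workers = {
    "tcp://10.0.0.2:40001": {"memory_limit": 16 * GiB, "memory_used": 12 * GiB, "executing": 0},
    "tcp://10.0.0.3:40001": {"memory_limit": 8 * GiB, "memory_used": 1 * GiB, "executing": 0},
    "tcp://10.0.0.4:40001": {"memory_limit": 64 * GiB, "memory_used": 2 * GiB, "executing": 3},
}
# the third worker has the most free memory but is busy, so it is skipped
print(pick_worker(workers))
```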

`%dask` also installs `client`, a reference to the client; `tqdmprogress`, which can be used in
place of `distributed.diagnostics.progress` for a task monitor using `tqdm`; and `upload`, which
uploads a file and returns a delayed function which will fetch the filename on a worker