Starbeamrainbowlabs

Stardust
Blog

Cluster, Part 1: Answers only lead to more questions

At home, I have a Raspberry Pi 3B+ as a home file server. Lately though, I've been noticing that I've been starting to grow out of it (both in terms of compute capacity and storage) - so I've decided to get thinking early about what I can do about it.

I thought of 2 different options pretty quickly:

While both of these options are perfectly viable and would serve my needs well, one of them is distinctly more interesting than the other - that being a cluster. While having a 'proper' server would be much simpler, perhaps slightly more power efficient (though I would need tests to confirm, since ARM - the CPU architecture I'm planning on using for the cluster - is more power efficient), and would put all my system resources on the same box, I like the idea of building a cluster for a number of reasons.

For one, I'll learn new skills setting it up and managing it. So far, I've been mostly managing my servers by hand. When you start to acquire a number of machines though, this quickly becomes unwieldy. I'd like to experiment with a containerisation technology (I'm not sure which one yet) and play around with distributing containers across hosts - and auto-restarting them on a different host if 1 host goes down. If this is decentralised, even better!

For another, having a single larger server is a single point of failure - which would be relatively expensive to replace. If I use lots of small machines instead, then if 1 dies then not only is it cheaper to replace, but it's also not as urgent since the other machines in the cluster can take over while I order a replacement.

Finally, having a cluster is just cool. Do we really need more of a reason than this?

With all this in mind, I've been thinking quite a bit about the architecture of such a cluster. I haven't bought anything yet (and probably won't for a while yet) - because as you may have guessed from the title of this post I've been running into a number of issues that all need researching.

First though let's talk about which machines I'm planning on using. I'm actually considering 2 clusters, to solve 2 different issues: compute and storage. Compute refers to running applications (e.g. Nextcloud etc), and storage refers to a distributed storage mechanism with multiple hosts - each with 1 drive attached - though I'm unsure about the storage cluster at this stage.

For the compute cluster, I'm leaning towards 4 x Raspberry Pi 4 with 4GiB of RAM each. For the storage cluster, I'm considering a number of different boards. 3 identical boards of 1 of the following:

I do seem to remember a board that had USB 3 onboard, which would be useful for connecting to the external drives. Currently the plan is to use a SATA to USB converter connect to internal HDDs (e.g. WD Greens) - but I have yet to find one that doesn't include the power connector or splits the power off into a separate USB cable (more on power later). This would be all be backed up by a Gigabit switch of some description (so the Rock Pi S is not a particularly attractive option, since it would be limited to 100MiB).

I've been using HackerBoards.com to discover different boards which may fit my project - but I'm not particularly satisfied with any of the options here so far. Specifically, I'm after Gigabit Ethernet and USB 3 on the same board if possible.

The next issue is software support. I've been bitten by this before, so I'm being extra cautious this time. I'm after a board that provides good software support, so that I can actually use all the hardware I've paid for.

The other thing relating to software that I'd like if possible is the ability to use a systemd-free operating system. Just like before, when I selected Manjaro OpenRC (which is now called Artix Linux), since I already have a number of systems using systemd I would like to balance it out a bit with some systems that use something else. I don't really mind if this is OpenRC, S6, or RunIt - just that it's something different to broaden my skill set.

Unfortunately, it's been a challenge to locate a distribution of Linux that both has broad support for ARM SoCs and does not use systemd. I suspect that I may have to give up on this, but I'm still holding out hope that there's a distribution out there that can do what I want - even if I have to prepare the system image myself (Alpine Linux looks potentially promising, but at the moment it's a huge challenge to figure out whether a chipset supported or not....). Either way, from my research it looks like having mainline Linux kernel support fro my board of choice is critically important to ensure continued support and updates (both feature and security) in the future.

Lastly, I also have power problems. Specifically, how to power the cluster. The big problem is that the Raspberry Pi 4 requires 3A of power max - instead the usual 2.4A in the 3B+ model. Of course, it won't be using this all the time, but it's apparently important that the ceiling of the power supply is 3A to avoid issues. Problem is, most multi-port chargers can barely provide 2A to connected devices - and I have not yet seen one that would provide 3A to 4+ devices and support additional peripherals such as hard drives and other supporting boards as described above.

To this end, I may end up having to build my own power supply from an old ATX supply that you can find in an old desktop PC. These can generally supply plenty of power (though it's always best to check) - but the problem here is that I'd need to do a ton of research to make sure that I wire it up correctly and safely, to avoid issues there too (I'm scared of blowing a fuse or electrocuting someone etc).

This concludes my first blog post on my cluster plans. It may be a while until the next one, as I have lots more research to do before I can continue. Suggestions and tips are welcomed in the comments below.

Tag Cloud

3d 3d printing account algorithms android announcement architecture archives arduino artificial intelligence artix assembly async audio automation backups bash batch blog bookmarklet booting bug hunting c sharp c++ challenge chrome os cluster code codepen coding conundrums coding conundrums evolved command line compilers compiling compression containerisation css dailyprogrammer data analysis debugging demystification distributed computing documentation downtime electronics email embedded systems encryption es6 features ethics event experiment external first impressions future game github github gist gitlab graphics hardware hardware meetup holiday holidays html html5 html5 canvas infrastructure interfaces internet interoperability io.js jabber jam javascript js bin labs learning library linux lora low level lua maintenance manjaro network networking nibriboard node.js operating systems own your code pepperminty wiki performance phd photos php pixelbot portable privacy problem solving programming problems project projects prolog protocol protocols pseudo 3d python reddit redis reference releases resource review rust searching secrets security series list server software sorting source code control statistics storage svg talks technical terminal textures thoughts three thing game three.js tool tutorial twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 xmpp xslt

Archive

Art by Mythdael