It’s been a little while since I last shared details of what I’m doing, and in part that’s been because I’ve been a very busy little bee.
I have entire categories of stuff to share with you though, and details on each category at that, but that’s a little bit too much information for a single blog post.
Firstly, we’re putting Guam through its paces. Not only is it getting more traffic (in actual production-like environments), it’s getting more sessions, more clients, more corner-cases, and overall, more real-world scenarios. More on this later.
Secondly, we’ve been lacking what you would call a factual implementation of what is otherwise known as a “reference architecture”. We’ve been operating such a reference architecture before, but have had to discard it altogether. Based on this previous iteration, I now have another such reference architecture implementation, but not yet so much to the tune of running all of it in full production. More on this later, too.
Thirdly, process matters. It consists of some barriers to cross again and again, but those barriers are in place for a reason. You can only afford to have so many effective “DevOps engineers” (more precisely, ninjas) on any one payroll, and risks need to be mitigated, and operations need to be delegated (to anything not 3rd-level engineering and beyond). I’m expecting to give out fewer privileges to fewer source code management repositories, and be more strict about what gets accepted into them. More on this, as you have probably already figured out, later.
Fourthly, SPF, DKIM and DMARC are workarounds, DANE piggy-backs on the false premises of DNSSEC, and none of the former strictly require the latter either. There are so many attack vectors and exploit paths to all of that, I don’t know where to begin. Regretfully, we’ll have to settle for what “the web” thinks is “safe”. More importantly, apparently so does everyone else, including you if you choose to run your own. This is, may be, or will become, quite the biggest pain in the ass in the foreseeable future, and may or may not result in an entirely distinct major at some respectable university allowed to incur life-long crippling debt upon its attendants.
Fifthly, Storage Pods under GlusterFS are not what they may seem to be when you first look at what they might be on their respective tins. Currently, we’re strongly suspecting the architecture of GlusterFS is one root cause, but we’re also strongly suspecting that any device-mapper based caching is rather completely ineffective (for our use-case). Current plans include utilizing GlusterFS tiering as opposed to device-mapper based caching, but foremost on our minds is a full migration (read: upgrade from where we are now) to Ceph. Because, you know, Ceph. More on this, much more in fact, later on — let’s just say that 130 out of 400 spent in I/O wait for an otherwise idle environment is just … not cool. Again, we shall put more on this later.
Sixthly, ensuring one’s systems render the latest and greatest, always, continuously, all the time, in some sort of stable fashion, is a fallacy that requires you to be more than capable to samurai your way out of virtually any situation. Let it be understood that “any situation” is always “one too many situations” at some point, and continuously requiring all of your employees to behave or seppuku is just not going to work in anyone’s favour.
Finally, what we need is more focus and more fun. I will add that in combination, I shall put more focus on more fun. None of us are going anywhere unless we enjoy what we’re doing.