I kill every VPS I touch

Somewhere and somehow, there are a handful of sysadmins who have never completely broken a VPS. They might even manage to maintain, update, and optimize their VPSs on a regular basis. They keep them going indefinitely. These people are coveted by industry, make bank, and generally keep everything we love about the internet going.

I am not that sysadmin. There’s a good reason the SSD Nodes engineers don’t let me near any of the important buttons. Or any of the buttons for that matter.

A good sysadmin does not break every VPS they touch. So, here's a short "survival guide" for terrible sysadmins like me. How can we learn from our mistakes? How can we implement bad-sysadmin-friendly tools to halt our bad habits? How can we rid ourselves of this curse?

I’ve used some s****y passwords

You step away for just a moment—maybe you even ask a stranger to hold your spot for you—but when you return, someone has invaded your turf.

Not a great feeling.

Once, I accidentally deployed a new VPS, using an older variant of my standard Ansible playbook, with password as the password. I hadn't noticed because the password is hashed inside the playbook. I logged in via password (not SSH key + passphrase), installed Docker, and moved on. The next time I logged in, something felt off. I ran docker ps -a and found a cryptocurrency miner running on my VPS.

The only natural response was to immediately terminate my connection, log into the administrative area, and reinstall the operating system. I am in no way qualified to mitigate the damage, cut out the intruder, and protect the system from future attacks. Even though the VPS was as yet unused, I still burned time and once again showcased my sysadmin idiocy.

How can you prevent this?

  1. Use SSH keys and passphrases, instead of just passwords, while also disabling password-based SSH logins.
  2. Pair that passphrase with a manager like Bitwarden to keep you from having to remember it.
  3. Or, choose an SSH passphrase and user passwords that are both complex but easy enough to remember.
  4. Mostly, don’t choose password or anything you’d find on one of those most commonly used passwords lists.
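The first two fixes boil down to a few lines of OpenSSH server configuration. A minimal sketch, assuming your public key is already in ~/.ssh/authorized_keys (verify a key-based login in a second session before closing your current one):

```
# /etc/ssh/sshd_config — the relevant lines only
PubkeyAuthentication yes
PasswordAuthentication no
ChallengeResponseAuthentication no
PermitRootLogin prohibit-password
```

After editing, reload sshd (e.g. systemctl reload sshd) while keeping your existing session open, and confirm a fresh key-based login works before you log out.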

I’ve locked myself out via iptables

You type in a seemingly harmless iptables rule and find yourself unable to do anything else. You kill the session, maybe close down the terminal itself, and try again. No dice. You’ve just locked yourself out—one of the classic beginner sysadmin mistakes.

Given that most firewall setups are done very early in a VPS’ life, you shouldn’t have lost out on too much time at this point. Still, the only real solution is to reinstall the OS and try again.

And if you’re like me, you’ve locked yourself out, reinstalled, and promptly locked yourself out again. Time for some alternatives.

How to stop losing the keys

  1. Use a tool like iptables-apply, which forces you to confirm that the rules work. If you don’t confirm (because you’re locked out), they revert.
  2. Set a “failsafe” on a timer. The at command is great for this. Something simple, like echo 'service iptables stop' | at now + 1min will stop the iptables service after a minute. If you locked yourself out, grab a cup of coffee, log back in, and try again.
  3. Check with your VPS provider if they offer an out-of-band console for lock-out situations. They can be a saving grace in a desperate case like this.
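The timed failsafe from option 2 looks roughly like this. On a real VPS, at does the scheduling; the sketch below swaps in a plain background timer and a harmless echo (instead of touching the firewall) so the idea runs anywhere without root:

```shell
# Timed failsafe sketch. The real thing would be:
#   echo 'service iptables stop' | at now + 2min
# Here, a background timer plus an echo stands in for the firewall reset.
( sleep 120 && echo "failsafe fired: firewall rules would be reverted now" ) &
FAILSAFE_PID=$!

# ...apply your risky iptables rules, then try logging in from a NEW session...

# Still able to get in? Cancel the failsafe before it fires:
kill "$FAILSAFE_PID" 2>/dev/null && echo "failsafe cancelled"
```

With at, you cancel the pending job instead: find its number with atq and remove it with atrm.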

I’ve made critically dumb mistakes multiple times in a row

As in the many iptables missteps.

As in trying to connect PHP to my Nginx web server.

As in blindly trying to perform a major upgrade without thinking about the potential consequences.

As in trying to make SSH more secure, only to accidentally make it secured from myself.

As in trying to install just one more service on top of a dozen others I’ve finally managed to cobble together.

These are the worst ways to kill a VPS. You've already put in some real time, had a series of successful installs/deployments, and then hit yourself with that stomach-dropping feeling yet again. Unfortunately, there are also hundreds of ways to manage this, and only one way out: reinstall.

How I’ve fixed this

  1. Write documentation for your personally complex processes. Installation procedures, quirky configurations, gaps in your expertise—having a written walkthrough, in your own words, goes a long way.
  2. Use infrastructure-as-code, like Ansible, to standardize how you work. At least you'll know the steps that got you into your current hole.
  3. Always insert timed failsafes, if possible.
  4. Back up configuration files before you get all touchy-feely.
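Number 4 can be as simple as a timestamped copy before every edit. A sketch — the path here is a hypothetical stand-in (the touch just lets the example run anywhere); point CONFIG at whatever you're about to touch, e.g. /etc/nginx/nginx.conf:

```shell
#!/bin/sh
# Timestamped config backup sketch. CONFIG is a stand-in path, not a real
# system file; in practice you'd set it to the config you're editing.
CONFIG="${CONFIG:-./nginx.conf.example}"
touch "$CONFIG"   # stand-in file so the sketch runs anywhere

BACKUP="$CONFIG.$(date +%Y%m%d-%H%M%S).bak"
cp -a "$CONFIG" "$BACKUP"   # -a preserves permissions and timestamps
echo "backed up to $BACKUP"
```

If (when) the edit goes sideways, restoring is one cp in the other direction.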

What works locally breaks remotely

Years ago I built an awful Node.js web app for subscribing to and listening to podcasts. It's quite embarrassing to reminisce over, but I honestly felt I was on the leading edge of the podcast revolution. Now, a few years and one PocketCasts acquisition later, I wish I had stuck with it.

But that’s a whole different regret.

I had the perfect setup on my local machine: the precise working versions of ExpressJS and other dependencies, npm with the correct permissions, a MongoDB database without any unnecessary cruft.

The time finally came to spin up a VPS and deploy the web app. I installed all the dependencies, crossed my fingers, and typed in node app.js &. I was met with enough error verbiage to last me three Page Up hits. It felt a little like this:

[Image: "Final patch to production"]

Cobbling together the environment was so complicated that I even added the following note to the GitHub repository:

The challenge is figuring out how to get it running, because I don’t particularly feel like writing up the installation step-by-step. Have fun hacking!

In the end, with a lot of headaches, I got the web app running and acquired myself roughly a half-dozen users. One of them was my sister, if that says anything about how successful the whole venture was.

Ways to sync up local and remote

  1. Use Docker or something similar, like LXC or even Kubernetes. These tools will help you launch consistent environments both near and far.
  2. Rely more on static-file deployments, like Hugo over WordPress. Reduce your reliance on dependencies and build tools like gulp.
  3. Use Ansible, as I suggested before.
  4. Use some CI/CD system. I’m still not advanced enough for these though, so take that with a grain of salt.
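For option 1, even a minimal Dockerfile pins down the environment that worked locally. A sketch for a small Node.js/Express app like the one above — the Node version and filenames are assumptions, not what I actually ran:

```dockerfile
# Hypothetical Dockerfile for a small Express app
FROM node:18-alpine          # pin the runtime you tested against
WORKDIR /app
COPY package*.json ./
RUN npm ci                   # install exactly what package-lock.json specifies
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]
```

Then docker build -t myapp . followed by docker run -p 3000:3000 myapp behaves the same on your laptop and on the VPS, which is the whole point.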

And now I put way too much faith in Docker

Docker has eased my development and deployment processes so much that I now rely on it for everything, such as self-hosting other open source web apps on my VPS.

I’ve forgotten how to deploy a LEMP stack on my own. I don’t understand the process of running multiple services on a single VPS any other way. Configuring an Nginx reverse proxy on my own? No way.

By easing certain inadequacies of ours, we reveal others.

How to break the habit

I have no idea. Honestly.

In the end, I assume the worst

The gold-star sysadmins might already do all this, but for people like me, there's still a lot to learn. In reality, quite a few VPSs will still meet untimely, accidental ends.

As long as we bad sysadmins continue our VPS-breaking ways with a consistent desire to learn from our mistakes, we'll keep improving. Failure might be our only way forward.

It’s not magic. It’s just a blinking cursor on a distant server. And, despite what you might think, your VPS doesn’t mind if you have to start from scratch. Again.