Seems pretty similar to https://github.com/radioactive-labs/chrono_forge which is what I found when I typed in "rails durable execution patterns" into Google. Have you seen this and if so, how do you think it compares?
I'm not trying to take a shot at the OP, but I keep seeing posts labeled "Production-Grade" that still look more like pet systems than cattle. I'm struggling to understand how something like this can be reproduced consistently across environments. How would you package this inside a Git repo? Can it be managed through GitOps? And if we're calling something production-grade, high availability should be a baseline requirement since it's table stakes for modern production applications.
What I'd really love is a middle ground between k8s and Docker Swarm that gives operators and developers what they need while still providing an escape hatch to k8s when required. k8s is immensely powerful but often feels like overkill for teams that just need simple orchestration, predictable deployments, and basic resiliency. On the other hand, Swarm is easy to use but doesn't offer the extensibility, ecosystem, or long-term viability that many organizations now expect. It feels like there's a missing layer in between: something lightweight enough to operate without a dedicated platform team, but structured enough to support best practices such as declarative config, GitOps workflows, and repeatable environments.
As I write this, I'm realizing that part of the issue is the increasing complexity of our services. Every team wants a clean, Unix-like architecture made up of small components that each do one job really well. Philosophically that sounds great, but in practice it leads to a huge amount of integration work. Each "small tool" comes with its own configuration, lifecycle, upgrade path, and operational concerns. When you stack enough of those together, the end result is a system that is actually more complex than the monoliths we moved away from. A simple deployment quickly becomes a tower of YAML, sidecars, controllers, and operators. So even when we're just trying to run a few services reliably, the cumulative complexity of the ecosystem pushes us toward heavyweight solutions like k8s, even if the problem doesn't truly require it.
I have not used quadlets in a "real" production environment, but deploying systemd services is very easy to automate with something like Ansible.
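For example (the unit and file names here are made up), a play can just drop the quadlet file in place and poke systemd:

    - name: Install the app quadlet
      ansible.builtin.copy:
        src: files/myapp.container        # hypothetical quadlet file
        dest: /etc/containers/systemd/myapp.container
        mode: "0644"

    - name: Reload systemd so the quadlet generator runs
      ansible.builtin.systemd:
        daemon_reload: true

    - name: Start the generated service
      ansible.builtin.systemd:
        name: myapp.service
        state: started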
But I don't see this as a replacement for k8s as a platform for generic applications; it's more for deploying a specific set of containers to a fleet of servers with less overhead and complexity.
I have never denied that Helm is a mistake people refuse to stop using. I think of Helm much the same way I think of Ansible: it's only nice when you consume packages written by others.
Ansible is a procedural mess. It's like Helm had a baby with a very bad procedural language. It works, but it's such a mess to work with. Half of the time it breaks because you haven't thought about some if statement that only applies to a single node, or some bs like that.
Compared to Docker Swarm and/or k8s manifests (I guess even Helm, if you're not the one developing charts), Ansible is a complete mess. You're better off managing things with Puppet or Salt, as that gives you an actual declarative mechanism (i.e. desired state, like K8s manifests).
> Ansible is a complete mess. You're better off managing things with Puppet or Salt, as that gives you an actual declarative mechanism
We thought this, too, when choosing Salt over Ansible, but that was a complete disaster.
Ansible is definitely designed to operate at a lower abstraction level, but modules that behave like desired state declarations actually work very well. And creating your own modules turned out to be at least an order of magnitude easier than in Salt.
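To be concrete about what I mean by desired state: tasks like these just declare the end result (package installed, service running and enabled) and Ansible works out whether anything needs to change; the package/service name is only an example.

    - name: Ensure nginx is installed
      ansible.builtin.package:
        name: nginx
        state: present

    - name: Ensure nginx is running and enabled at boot
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true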
We do use Ansible to manage containers via podman-systemd, though we're slightly hampered by Ubuntu not shipping podman 5. It's... fine?
Our mixed Windows, Linux VM and Linux bare metal deployment scenario is likely fairly niche, but Ansible is really the only tenable solution.
> I'm struggling to understand how something like this can be reproduced consistently across environments. How would you package this inside a Git repo? Can it be managed through GitOps?
I manage my podman containers the way the article describes using NixOS. I have a tmpfs root that gets blown away on every reboot. Deploys happen automatically when I push a commit.
There are many ways to do that. Start with a simple repo and spin up a VM instance from the cloud provider of your choice. Then integrate the commands from this article into a cloud-init configuration. Hope you get the idea.
> I'm struggling to understand how something like this can be reproduced consistently across environments. How would you package this inside a Git repo?
Very easily. At the end of the day, quadlets (which are just systemd services) are plain text files. You can use something like cloud-init to define all these quadlets and enable them in a single YAML file and do a completely unattended install. I do something similar to cloud-init using Flatcar Linux.
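Rough sketch of the cloud-init version, with a placeholder app name and image; cloud-init writes the quadlet and then starts the generated service:

    #cloud-config
    write_files:
      - path: /etc/containers/systemd/myapp.container   # quadlet location for rootful podman
        permissions: "0644"
        content: |
          [Container]
          Image=docker.io/library/nginx:1.27
          PublishPort=8080:80

          [Install]
          WantedBy=multi-user.target

    runcmd:
      - systemctl daemon-reload
      - systemctl start myapp.service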
I've been looking at this for gzipped files as well. There is a Rust crate that looks interesting: https://docs.rs/indexed_deflate/latest/indexed_deflate/. My goal is to be able to index MySQL dump files by table boundaries.
While not specific to the 12factor question: with any of these agents and solutions, how is LLM Ops being handled? Also, what's the testing strategy, and how do I make sure that I don't cause regressions?
I try not to take a hard stance on any tool or framework - the idea is to take control of the building blocks, and you can still bring most of the cool LLM ops / LLM observability techniques to bear.
I could see one of the twelve factors being around observability beyond just "what's the context" - that may be a good thing to incorporate for version 1.1.
"Manage identity centrally" is probably referring to using an identity management system like Okta, Microsoft Identity, or hosting your own IdP, and using strong hardware 2FA. You don't want people manually creating their own accounts for everything, or shared accounts where everyone knows the password (or it's kept in a shared spreadsheet that the entire company has access to).
At this point most startups would just use Google; since they're almost certainly using Google as their email provider, and "company email" is a de facto root-of-trust even if you don't intend it to be, there isn't really a whole lot of thought that needs to go into it. It helps that they have the best 2FA stack of any mainstream cloud service.
Nice, I'm working in the same space as you (not open source, personal project). We landed on the same solution: encoding the commands in Golang and distributing them via SSH.
I'm somewhat surprised not to see this more often. I'm guessing supporting multiple Linux versions could get unwieldy; I focused on Ubuntu as my target.
Differences that I see:
* I modeled mine on top of docker-plugins (these get installed during the bootstrapping process)
Your solution looks much simpler than mine. I started off modeling mine on the fly.io CLI, which is much more verbose Go code. I'll likely continue to use mine, but for any future VPS I'll have to give this a try.
Hahah, seems like we went down the same rabbit hole. I also considered `docker-rollout` but decided to write my own script. Heavily inspired by the docker-rollout source code, btw.
Just curious, why did you decide to go with docker plugins?
This is going to be very unhelpful for most, but I use nixpkgs and end up applying some build tweaks to make sure all GPU capabilities are properly supported.
That said, I know there is a newer version 6 that Ubuntu/Debian users should be able to get by adding the xpra apt sources, and it might all just work out of the box. I should check.
Do you have any sources on GitHub moving away from Rails? This is the first I've heard of it, and my google-fu has returned zero results on the topic. Just last year they had a blog post about Building GitHub with Ruby and Rails[0], so your remark caught me off guard.