Most tools, frameworks and articles in IT, SaaS in particular, are about spinning up things. It is what people find exciting.
Work a few years in Ops and you learn that spinning up things is not a big part of your work. It's maintenance, such as deleting stuff.
Unfortunately this process is the hardest, and there's very little to help you do it right. Many tools, frameworks and vendors don't even have proper support for it.
Some even recommend 'rinse and repeat' instead of adjusting what you have - and this method is not great if you value uptime, nor if you have state that you want to preserve, such as customer data :-)
Deleting stuff, shutting services down, turning off servers - those are hard tasks in IT.
My acid test for provisioning automation products is asking: Can it rename deployed resources?
Practically none can, even in market segments where this is highly relevant. For example: user identity and access management products. Women get married and change their name all the time!
The next level up is the ability to rename a container such as an organisational unit or a security group.
Then, products that can rearrange a hierarchy to accommodate a merger, split, or a new layer of management. This obviously needs to preserve the data. “Immutable infrastructure” where everything is recreated from scratch and the original is dropped is cheating.
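To make the rename and rearrange test concrete, here is a minimal sketch in Python, assuming a design where every resource is keyed by an immutable ID and its name and parent are just mutable attributes. The `Node` and `Directory` names are hypothetical and do not describe any particular product.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    id: int                      # immutable identity; everything else may change
    name: str
    parent_id: Optional[int] = None
    data: dict = field(default_factory=dict)


class Directory:
    """Toy hierarchy where rename and move never recreate a node."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.nodes = {}

    def create(self, name, parent_id=None, **data):
        node = Node(next(self._ids), name, parent_id, dict(data))
        self.nodes[node.id] = node
        return node.id

    def rename(self, node_id, new_name):
        # Only the label changes; the ID, children and data stay put.
        self.nodes[node_id].name = new_name

    def move(self, node_id, new_parent_id):
        # Walk up from the new parent to refuse moves that would create a cycle.
        cursor = new_parent_id
        while cursor is not None:
            if cursor == node_id:
                raise ValueError("cannot move a node under itself")
            cursor = self.nodes[cursor].parent_id
        self.nodes[node_id].parent_id = new_parent_id

    def path(self, node_id):
        # The path is derived from the current names, never stored.
        parts = []
        cursor = node_id
        while cursor is not None:
            node = self.nodes[cursor]
            parts.append(node.name)
            cursor = node.parent_id
        return "/".join(reversed(parts))
```

With this shape, renaming a user or slotting a whole organisational unit under a new parent is a single attribute update, and the attached data is never dropped. Tools that key resources by name or path have to delete and recreate instead.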
I’ve only ever seen one such provisioning tool; the rest don’t even begin to approach this level of capability.
Everyone should consider VictoriaMetrics anyway. It scales better performance-wise, and they broke out components to improve scalability (vmagent, vmalert, etc.) when Prometheus was just one huge process that did all the things. The two projects work closely together, and even gave a good joint talk about the differences.
I love Prometheus because the (OpenMetrics) data protocol is so darn simple and easy to grok. You can do things like take an arbitrary data source, pipe it through awk and curl, and get it into Prometheus metrics via remote write. You can also easily write your own /metrics endpoint in your favorite language.
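As a rough illustration of how little a hand-rolled /metrics endpoint needs, here is a sketch using only the Python standard library. The counter names, the `COUNTERS` dict and the port are made up for the example; a real service would more likely use an official client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters; a real app would update these as it works.
COUNTERS = {"jobs_processed_total": 42, "jobs_failed_total": 3}


def render_metrics(counters):
    """Render counters in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(counters.items()):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(COUNTERS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    # Blocks forever; point a scraper (or curl) at http://localhost:8000/metrics.
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and then fetching /metrics with curl should return one `# TYPE` line and one sample line per counter, which is all a scraper needs.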
VictoriaMetrics sweetens the deal by offering a solution to long-term storage and more flexible service architecture without leaving the simple and highly interoperable Prometheus ecosystem.
Not only that, when we switched to VictoriaMetrics we were able to halve the total RAM of the virtual machines hosting our monitoring, and I think storage is more efficient too.
When I'm explaining KV to a junior dev, I do it something like this:
Memcached got big during a confluence of two problems that have both been solved now. First, we had a bunch of big, semi-stateful web applications built in programming languages with GC, and server memory got bigger than you could reliably collect without long pauses. Pushing some of your data into a separate memory space raises the ceiling considerably.
Secondly, during this ten-year window, ethernet cards were half an order of magnitude faster than hard drives. Putting stuff on another machine could be faster than sending it to swap, a memory mapped file, or some more sophisticated data store (like a database).
We don't have to struggle with these now, and half the time we avoid the first one altogether. KV caches still have plenty of uses, but you are way better off working to cache inbound requests instead of outbound requests. That lets you move a bunch of caching to the edge of your network, or to the user agent.
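One way to cache inbound requests is to let HTTP do the work: send Cache-Control and ETag headers so a CDN or the browser can reuse the response and revalidate cheaply. The sketch below is framework-agnostic Python; the `respond` helper and the 300-second max-age are assumptions picked for illustration.

```python
import hashlib


def respond(body: bytes, if_none_match=None, max_age=300):
    """Return (status, headers, payload) for a cacheable GET response."""
    # A strong ETag derived from the content; it changes whenever the body does.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {
        "ETag": etag,
        # "public" lets shared caches (edge/CDN) store it, not just the browser.
        "Cache-Control": f"public, max-age={max_age}",
    }
    if if_none_match == etag:
        # The client's copy is still current: skip sending the body again.
        return 304, headers, b""
    return 200, headers, body
```

While the response is fresh, the edge or the user agent answers without touching your servers at all. After that, a revalidation costs a 304 with no body instead of a full render.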