"Mainframe" usually meaning you have legacy code that requires something that only runs in that environment. Like the TPF OS that Visa, Sabre, and many airlines use. Or z/OS things like CICS, Adabas, IDMS, IMS, VSAM, JCL, heavily mainframe flavored COBOL, 360 Assembler, etc.
That is, the only driving requirement left for IBM mainframes is your own software that depends on system software that only exists on IBM mainframes.
Same reason some people still use other old environments like HP MPE, IBM AS/400, Tandem Nonstop, and so on. They have found, thus far, that the cost of keeping it running is less than the cost of rewriting it. Or they have been unable to rewrite it for reasons other than cost.
Edit: Separately, there are still some technical advantages. z/OS mainframes have a level of "within the rack" redundancy that you won't find in commodity servers. Or, with TPF, it's difficult to engineer a replacement that scales as well and remains as reliable... it's a very battle-tested distributed K/V store that deals with heavy write contention.
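To make the write-contention point concrete, here's a toy sketch in C of the optimistic, per-record locking pattern that kind of store has to get right under heavy contention. Everything here (the record layout, the version-counter scheme) is invented for illustration and says nothing about TPF's actual internals.

```c
/* Toy illustration of write-contention handling in a record-oriented store:
 * optimistic locking with a per-record version counter.
 * This is NOT how TPF works internally; it only sketches the problem space. */
#include <stdatomic.h>
#include <stdio.h>

struct record {
    atomic_uint version;   /* bumped on every successful update */
    long        balance;   /* the payload being contended over  */
};

/* Try to apply an update; fail (so the caller can re-read and retry)
 * if another writer got in between the read and the write. */
int try_update(struct record *r, unsigned seen_version, long new_balance)
{
    unsigned expected = seen_version;
    /* Claim the record by advancing the version we originally read. */
    if (!atomic_compare_exchange_strong(&r->version, &expected, seen_version + 1))
        return 0;              /* somebody else won; caller retries */
    r->balance = new_balance;  /* we own this version, safe to write */
    return 1;
}

int main(void)
{
    struct record r = { .version = 0, .balance = 100 };
    unsigned v = atomic_load(&r.version);
    if (try_update(&r, v, 150))
        printf("updated to %ld (version %u)\n", r.balance, atomic_load(&r.version));
    return 0;
}
```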
Really hard to replicate Tandem Nonstop's feature set if that was what you actually needed.
Of course part of the reason it faded away (indeed, was gradually fading away even when I worked for Tandem) is that almost nobody does need that sort of resilience.
360 Assembler was the second (and last) assembly I learned (after 6502¹). Gotta love an architecture that requires you to maintain your own call stack for subroutines. Or that has CPU-level instructions to move integers to and from EBCDIC strings.
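For anyone who never touched it, here's what "integer to EBCDIC" means, sketched in C rather than assembler: EBCDIC digits '0' through '9' are the bytes 0xF0 through 0xF9, and on S/360 a CVD/UNPK (or ED) sequence does roughly the work of this loop in a couple of instructions. The function below is just my plain-C approximation of the end result, not a model of the hardware.

```c
/* Plain-C sketch of converting a binary integer to EBCDIC digit bytes. */
#include <stdio.h>

/* Write n as EBCDIC digit bytes into buf (fixed width, zero padded). */
void to_ebcdic(unsigned n, unsigned char *buf, int width)
{
    for (int i = width - 1; i >= 0; i--) {
        buf[i] = 0xF0 | (n % 10);   /* 0xF0 + decimal digit */
        n /= 10;
    }
}

int main(void)
{
    unsigned char out[8];
    to_ebcdic(1965, out, 8);
    for (int i = 0; i < 8; i++)
        printf("%02X ", out[i]);    /* prints F0 F0 F0 F0 F1 F9 F6 F5 */
    putchar('\n');
    return 0;
}
```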
---
1. Well, I guess second-and-a-half. I played a little with Z-80 on my Spectravideo computer, but mostly that was writing a disassembler in MSX BASIC to reverse-engineer how the system worked after it was abandoned by its manufacturer. The disassembler was never finished because I didn't properly handle the multibyte opcodes.
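In hindsight the fix would have been small. The Z-80 uses the bytes 0xCB, 0xED, 0xDD and 0xFD as prefixes, so the real opcode (and the instruction length) depends on what follows. Here's a minimal sketch in C of that length logic, with the full opcode tables hand-waved; the prefix values themselves are real.

```c
/* Minimal sketch of the part my teenage disassembler got wrong:
 * Z-80 prefix bytes change where the real opcode lives and how long
 * the instruction is. */
#include <stdio.h>

/* Return how many bytes the prefix + opcode portion occupies.
 * (Operand bytes still have to be added from the opcode tables.) */
int prefix_length(const unsigned char *code)
{
    switch (code[0]) {
    case 0xCB:                          /* bit/rotate group: CB xx */
    case 0xED:                          /* extended group:   ED xx */
        return 2;
    case 0xDD:                          /* IX group */
    case 0xFD:                          /* IY group */
        /* DD CB d xx / FD CB d xx: displacement comes *before* the opcode */
        return code[1] == 0xCB ? 4 : 2;
    default:
        return 1;                       /* unprefixed single-byte opcode */
    }
}

int main(void)
{
    unsigned char ex1[] = { 0xCB, 0x47 };             /* BIT 0,A      */
    unsigned char ex2[] = { 0xDD, 0xCB, 0x05, 0x46 }; /* BIT 0,(IX+5) */
    printf("%d %d\n", prefix_length(ex1), prefix_length(ex2)); /* 2 4 */
    return 0;
}
```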
IBM has amazing support software for mainframes. Their fix tool, SMP/E, for example, makes individual tapes for each customer to ensure that fixes go on correctly. All the dependencies are resolved for that individual customer by IBM before the tape ships. That's one-on-one service that makes sure your machine doesn't go down from a patch.
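As a rough illustration of the "dependencies resolved before anything ships" idea (and nothing more - SMP/E's real RECEIVE/APPLY machinery, holds, and so on are far more involved), here's a toy ordering pass in C with made-up fix names:

```c
/* Toy sketch: order a set of fixes so each one lands after its prerequisites.
 * Names and the dependency data below are invented for illustration only. */
#include <stdio.h>

#define NFIX 4

const char *name[NFIX] = { "UJ001", "UJ002", "UJ003", "UJ004" };
/* prereq[i][j] != 0 means fix i requires fix j to be applied first. */
int prereq[NFIX][NFIX] = {
    {0,0,0,0},   /* UJ001: no prerequisites        */
    {1,0,0,0},   /* UJ002 requires UJ001           */
    {1,1,0,0},   /* UJ003 requires UJ001 and UJ002 */
    {0,0,1,0},   /* UJ004 requires UJ003           */
};

int main(void)
{
    int applied[NFIX] = {0};
    /* Repeatedly apply any fix whose prerequisites are all satisfied. */
    for (int done = 0; done < NFIX; ) {
        for (int i = 0; i < NFIX; i++) {
            if (applied[i]) continue;
            int ready = 1;
            for (int j = 0; j < NFIX; j++)
                if (prereq[i][j] && !applied[j]) ready = 0;
            if (ready) {
                printf("apply %s\n", name[i]);
                applied[i] = 1;
                done++;
            }
        }
    }
    return 0;
}
```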
In a high availability environment (like banking or airline reservations, as mentioned), mainframes never go down, even during upgrades. When physical machines need replacement, the entire system is run in parallel on a second machine.
These boxes have both incredible I/O hardware (there's never been anything like IBM channel I/O) and the software to keep everything humming. On a typical day when I worked there, we'd have 1,000 devs using the same box (and that was a small installation) running multiple operating systems, virtualization layers, etc., with zero hiccups.
IBM also has amazing software for getting stuff done. Documentation, for example, is universal and available worldwide from any terminal. When you want a printed manual, the system figures out which big, fast printer is near you and offers to print it down the hall. Every IBM employee has access to all the company's resources from every terminal, and it all just works.
Yes, PCs have caught up and emulate many of these features, but they definitely don't have the robustness or ease of use that System/370 (z/OS) did.
The redundancy part of mainframes is hardware. You can live-swap a CPU, for example.
That's an aside though. The reason people keep buying them is software lock-in, yes. There's a reason Amazon likes to push their proprietary services... they are the new mainframe.
I read that all mainframe components, even CPUs, are redundant and hot-swappable, and that instructions are executed on two separate CPUs to detect faults and correct them on the fly. That would make a lot of sense if your application requires high availability and assurance but isn't designed for it. I haven't heard of any standard server hardware that can give you HA or that kind of assurance from a single machine, probably because you would not build an application dependent on that these days. It's probably cheaper to do in software.
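A crude software analogy of that execute-twice-and-compare idea, just to convey the shape of it (real mainframe CPUs do this in hardware, per instruction, transparently to software; the injected "fault" below is purely for demonstration):

```c
/* Run the same computation twice and only accept the result when both runs
 * agree, retrying otherwise.  The simulated transient fault exists only so
 * the retry path actually fires once. */
#include <stdio.h>

static int faults_left = 1;          /* inject one simulated transient fault */

long add_interest(long balance_cents)
{
    long result = balance_cents + balance_cents * 2 / 100;
    if (faults_left > 0) {           /* pretend a bit flipped this one time */
        faults_left--;
        result ^= 0x40;
    }
    return result;
}

/* Execute twice, compare, and retry until the two runs agree. */
long run_checked(long (*fn)(long), long arg)
{
    for (;;) {
        long a = fn(arg);
        long b = fn(arg);
        if (a == b)
            return a;
        fprintf(stderr, "mismatch detected, retrying\n");
    }
}

int main(void)
{
    printf("%ld\n", run_checked(add_interest, 100000));
    return 0;
}
```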
I remember hearing a story (probably apocryphal) about a mainframe so redundant that, when it had to be physically dismantled and moved from one datacentre to another across town, it remained up the entire time.
Case in point, vSphere Fault Tolerance is a software approach that runs your workload in a VM where the CPU instructions are mirrored to another VM on another physical server to deliver redundancy against physical server loss.
I wonder how many CPU cycles we burn running the many layers of abstraction we have built in the distributed computing world... does this performance penalty exist in the mainframe world?
A lot of it runs on dedicated hardware - some of it used to be dedicated silicon, but, IIRC, a lot of that functionality is being brought into the CPs (the "normal" CPUs), which get custom microcode when tasked with support functions. Not even booting up these beasts is a simple thing.
This architecture (CPs, zAAPs, IFLs, IOs, etc.) is compelling to me, along with the minimal OS and DB layers and a development setup that feels only just capable, albeit at a cost to the development effort. It feels efficient from a compute perspective. Today in distributed land we have libraries built on libraries built on OS services and API layers, plus other stuff we don't need, all to do some basic date math, for example. How many cycles did we use to do that? Maybe it was easier during the one-off development process, but was it worth it?
Mainframes have hardware designed for very high I/O speeds. They have no real advantage in CPU speed, but they can keep the CPU fed with data far better than x86 servers, which is what you want if you are reading millions of records off disk, doing some fairly trivial calculation like adding interest, and writing them back.
These days, if you can parallelise those calculations, the price benefits of commodity servers are worth the added software complexity.
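That workload is the classic batch shape: stream fixed-length records through, do a trivial calculation, write them back out. A bare-bones sketch in C (record layout and file names invented), mostly to show how little CPU work there is relative to the I/O:

```c
/* Sequential batch pass: read fixed-length account records, add interest,
 * write them back out.  The bottleneck is moving the data, not computing. */
#include <stdio.h>

struct account {               /* invented fixed-length record layout */
    char id[16];
    long balance_cents;
};

int main(void)
{
    FILE *in  = fopen("accounts.dat", "rb");
    FILE *out = fopen("accounts.new", "wb");
    if (!in || !out) { perror("open"); return 1; }

    struct account rec;
    while (fread(&rec, sizeof rec, 1, in) == 1) {
        rec.balance_cents += rec.balance_cents * 2 / 100;  /* add 2% interest */
        fwrite(&rec, sizeof rec, 1, out);
    }

    fclose(in);
    fclose(out);
    return 0;
}
```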
A z15 core runs at 5.2 GHz (IIRC) and has shared access to 960 MB of L4 cache for every group of 4 sockets of 10 cores each (Linux workloads can do SMT2 on each core). They emphasize single-thread speed because they measured diminishing returns when adding more cores to an LPAR and figured it was pointless to play a numbers game.
> These days if you can parallelise those calculations
Yes, but not all workloads are amenable to that - some will want to keep a consistent in-memory representation of the working data, with all cores working on the same data. If you can scale out, great. If you can't, this is the very top of the line. If you need to scale up from a z15, I suggest you wait for z16 availability ;-)