I may disappoint you, but IBM PC-compatible computers replaced devices of that class long ago. All that remains are the terminal emulators in some operating systems. There have been many attempts to expand the functionality of these emulators, but most features beyond the capabilities of the VT100 have not caught on (UTF-8 support being the exception). I do not believe anything will change in the foreseeable future.
Hashing is so fast that you can hand-wave it away as zero cost relative to the time taken to read such a large amount of data. Also, you only have to do it once for the whole input, which means that it's O(n) time where 'n' is the gigabytes of passwords you have.
Sorting is going to need about O(n log n) time even if it's entirely in memory; if it has to spool to disk storage, it'll take much longer than the hashing step.
PS: I just realised that 2 billion passwords is not actually that much data -- only 40 GB of hashes -- that's well within the range of what's "easy" to sort in-memory by simply creating an array of hashes that size and calling a standard library sort function.
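To make that concrete, here's a minimal hash-then-sort sketch in TypeScript (Node.js). The file names are made up and SHA-1 is just an assumption; at the 40 GB scale you'd want fixed-width binary buffers and a parallel or external sort rather than JS strings, but the shape is the same: one O(n) hashing pass, one O(n log n) sort.

```typescript
// Minimal sketch: hash a newline-delimited password list and sort the hashes.
// Assumes SHA-1 (20-byte) hashes and an input that fits in memory.
import { createHash } from "node:crypto";
import { readFileSync, writeFileSync } from "node:fs";

const passwords = readFileSync("passwords.txt", "utf8").split("\n");

// O(n): one hash per password.
const hashes = passwords
  .filter((p) => p.length > 0)
  .map((p) => createHash("sha1").update(p).digest("hex"));

// O(n log n): standard library sort on the fixed-width hex strings.
hashes.sort();

writeFileSync("hashes-sorted.txt", hashes.join("\n"));
```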
What other algorithms have you used?
I'm really interested in big data streams.
I would like to hear not only successful solutions, but also failed ones.
Have you tried using Bloom filters?
Is it possible to merge shards using a min-heap?
Algorithm choice depends on what you're optimising for. The discussion a few years ago was dozens of small web servers handling a large volume of password change traffic (10K/sec!) needing a cheap centralised service for verifying against "known bad" passwords. On a cloud hosting platform, the optimal solution is a blob store of sorted binary hashes with a small (~1 MB) precomputed index stored in-memory in the web server that lets you query the main store with 1 or at most 2 reads. This is an optimal "one round trip" request, and the per-server overhead at the front end is negligible.
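Roughly, the lookup looks like this (a sketch, not production code): I'm assuming 20-byte SHA-1 hashes concatenated in sorted order into a single blob, an in-memory index of byte offsets keyed by the first 16 bits of each hash (about half a megabyte; a wider prefix gets you closer to the ~1 MB figure), and a plain HTTP range request standing in for the S3 / Azure Blob SDK call.

```typescript
// Sketch of the "1 or at most 2 reads" lookup path.
const HASH_LEN = 20;

async function isKnownBadPassword(
  sha1: Uint8Array,     // 20-byte hash of the candidate password
  index: Float64Array,  // 65_537 precomputed byte offsets (blob > 4 GB, so not uint32)
  blobUrl: string,      // hypothetical URL of the sorted-hash blob
): Promise<boolean> {
  const prefix = (sha1[0] << 8) | sha1[1];
  const start = index[prefix];
  const end = index[prefix + 1]; // exclusive
  if (end <= start) return false; // no hashes share this prefix

  // One ranged read fetches every hash sharing the 16-bit prefix (a few KB).
  const res = await fetch(blobUrl, {
    headers: { Range: `bytes=${start}-${end - 1}` },
  });
  const slice = new Uint8Array(await res.arrayBuffer());

  // The slice is tiny, so a linear scan is fine; a binary search also works.
  for (let off = 0; off < slice.length; off += HASH_LEN) {
    let match = true;
    for (let i = 0; i < HASH_LEN; i++) {
      if (slice[off + i] !== sha1[i]) { match = false; break; }
    }
    if (match) return true;
  }
  return false;
}
```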
However, that approach assumes a slowly changing list where you don't mind a delay of maybe a few hours to merge in a new list. Large password leaks are infrequent, so this works fine for this use case.
A two-layer approach with a small frequently updated hash set on top of the large infrequently built sorted list is more generally applicable to a wider range of problems.
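Something like this, where `queryBlob` stands in for the sorted-blob lookup above and all the names are hypothetical:

```typescript
// Two-layer lookup sketch: a small in-memory set of recently added hashes
// (refreshed often) in front of the large sorted blob (rebuilt rarely).
async function isCompromised(
  hashHex: string,
  recent: Set<string>,                               // small, frequently updated layer
  queryBlob: (hashHex: string) => Promise<boolean>,  // big, infrequently rebuilt layer
): Promise<boolean> {
  if (recent.has(hashHex)) return true; // fast path for newly merged leaks
  return queryBlob(hashHex);            // otherwise fall back to the sorted list
}
```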
Bloom filters are probabilistic, and aren't much faster to create than a sorted list. They also require 'k' random reads to test, where k is a small constant such as 3 to 5. If the filter is large (gigabytes), then you can't efficiently load it into many front-end web servers. If you query it over the network as a blob, you need 3-5x the I/O operations. You can wrap it in an API service that holds it in-memory, but that's still much more complex than simply storing and querying a blob in S3 or Azure Storage directly.
"Clever" algorithms like min-heaps or whatever are likely not worth the trouble. My decade-old PC can sort 2 billion hashes in about 30 seconds using the Rust "Rayon" crate. There are cloud VMs available for about $2/hr that have about a terabyte of RAM that could sort any reasonable sized list (10s of billions) in a few minutes at most.
The original article mentions a week of 80 vCores of SQL Hyperscale, which is about $6,000 at pay-as-you-go rates!
Sure, developer time is expensive, blah blah blah, but waiting a week ain't cheap either, and these days an AI can bang out the code for a password file hashing and sorting utility in a minute or two.
Can you explain what kinds of tasks can be solved with the flow approach? I used Node-RED in some automation processes, but I managed to solve only simple problems that are easy to solve without AI. How do you program in this style? What does the flow style look like for, say, a program that finds the shortest solution to Sokoban or the 15 puzzle?
PS: Usually, when I need to make all the photos in a folder black and white, I use ImageMagick.
Great question — and I completely agree with your experience. Most flow-based tools like Node-RED or n8n are great for simple, event-driven tasks — but once you try to model anything non-trivial, like search algorithms or puzzle solving, they tend to fall short.
OOMOL takes a different approach.
Each node is a fully programmable unit — you write full Python or Node.js code, import pip/npm libraries, manage state, and do whatever logic you want. The flow just connects these pieces in a modular way.
So when it comes to problems like solving Sokoban or the 15 puzzle, it’s not about drawing a visual BFS graph — instead, you might structure it like:
- One node to define the state structure
- One node to generate next states
- One node to manage a queue (e.g. in memory or Redis)
- One node to evaluate whether the goal is reached
- And inside the node code: your BFS or A* logic
In this sense, OOMOL doesn’t force you to express logic visually — the *code is the logic*, and the *flow is how you organize and compose that logic across tasks*. Think of it as “wiring together programmable building blocks” rather than trying to drag-and-drop logic trees.
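For example, the search logic inside one of those nodes could be nothing more exotic than plain BFS over 15-puzzle states (Node.js/TypeScript here). This is just a sketch of the node body; the OOMOL-specific input/output wiring and any queue-in-Redis plumbing are omitted.

```typescript
// Plain BFS over 15-puzzle states (0 = the blank tile), row-major 4x4 board.
type State = number[];

const GOAL = [...Array(15).keys()].map((i) => i + 1).concat(0).join(",");

// Generate every state reachable by sliding one tile into the blank.
function neighbours(state: State): State[] {
  const blank = state.indexOf(0);
  const row = Math.floor(blank / 4);
  const col = blank % 4;
  const moves: number[] = [];
  if (row > 0) moves.push(blank - 4);
  if (row < 3) moves.push(blank + 4);
  if (col > 0) moves.push(blank - 1);
  if (col < 3) moves.push(blank + 1);
  return moves.map((m) => {
    const next = state.slice();
    [next[blank], next[m]] = [next[m], next[blank]];
    return next;
  });
}

// Returns the number of moves in the shortest solution, or -1 if none is found.
function solve(start: State): number {
  const seen = new Set<string>([start.join(",")]);
  let frontier: State[] = [start];
  for (let depth = 0; frontier.length > 0; depth++) {
    const next: State[] = [];
    for (const s of frontier) {
      if (s.join(",") === GOAL) return depth;
      for (const n of neighbours(s)) {
        const key = n.join(",");
        if (!seen.has(key)) {
          seen.add(key);
          next.push(n);
        }
      }
    }
    frontier = next;
  }
  return -1;
}

// solve([1,2,3,4,5,6,7,8,9,10,11,12,13,14,0,15]) === 1 (blank one slide from the corner)
```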
That said, you raise an important point: *for many simple tasks — like batch processing images using ImageMagick — scripts or shell commands are 100% the right tool.* OOMOL isn't trying to replace that.
Where OOMOL helps is when:
- You have multiple steps to coordinate (e.g. conditionally processing based on metadata)
- You need to integrate with APIs, databases, or cloud storage
- You want to track, retry, or debug failed tasks
- You’re composing reusable automation pipelines that grow in complexity over time
In short:
> *Not every task needs a workflow engine — but once you start composing real logic across systems, code + flow gives you both power and structure.*
We’re still early and figuring out how to best serve developers like you — really appreciate you pushing on this distinction. Would love to hear what types of things you’ve tried to automate, and where tools like Node-RED started to fall short.
Yes, and it's infuriating if software draws the conclusion that, simply because I happen to be in region x (or even worse, using language x), I must also want date presentation, units of temperature and distance etc. to be according to customs in x.
At least macOS/iOS at this point mostly allow customizing many of these, but some native apps and date picker widgets still don't respect my preferences, driving me nuts every time I have to schedule a reminder or meeting.
WebSocket solves a very different problem. It may be partially relevant to organizing two-way communication, but it has nothing to do with data complexity. Moreover, WebSockets are not particularly good at transmitting binary data.
If you are using SSE and SW and you need to transfer some binary data from client to server or from server to client, the easiest solution is to use the Fetch API.
`fetch()` handles binary data perfectly well without transformations or additional protocols.
If the data in SW is large enough that you need to display upload progress to the user, you will probably be better served by `XMLHttpRequest`.
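Roughly, the two paths look like this (the `/upload` endpoint is made up): `fetch()` takes a raw `Uint8Array` body directly, while `XMLHttpRequest` is the one that exposes upload progress events.

```typescript
// Binary upload with fetch(): no framing, no extra protocol.
async function uploadWithFetch(bytes: Uint8Array): Promise<void> {
  await fetch("/upload", {
    method: "POST",
    headers: { "Content-Type": "application/octet-stream" },
    body: bytes,
  });
}

// XMLHttpRequest when you need to report upload progress, which fetch()
// does not currently expose.
function uploadWithProgress(
  bytes: Uint8Array,
  onProgress: (fraction: number) => void,
): void {
  const xhr = new XMLHttpRequest();
  xhr.open("POST", "/upload");
  xhr.upload.onprogress = (e) => {
    if (e.lengthComputable) onProgress(e.loaded / e.total);
  };
  xhr.send(bytes);
}
```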
Espressif implies that the WiFi hardware uses ADC2 for something. It sounds like a hardware limitation; a firmware issue would have been patched a long time ago.
https://commons.wikimedia.org/wiki/File:DEC_VT100_terminal.j...