I met regexes when I was 13, I think. I spent a little time reading the Java API docs on the language's regex implementation and played with a couple of regex testing websites during an introductory programming class at that age. I've used them for the rest of my life without any difficulty. Strict (formal) regexes are extremely simple, and even when using crazy implementations that allow all kinds of backreferences and conditionals, 99.999% of regexes in the wild are extremely simple as well. And that's true in the example from TFA! There's nothing tricky or cryptic about this regex.
That said, what this regex wanted to be was obviously just a list. AWS should offer simpler abstractions (like lists) where they make sense.
> That said, what this regex wanted to be was obviously just a list. AWS should offer simpler abstractions (like lists) where they make sense.
Agreed. I would understand if there were some obvious advantage here, but there doesn't really seem to be any dimension in which a regex beats a list. It's (1) harder to implement, (2) harder to review, (3) much harder to test comprehensively, and (4) harder for users to use (correctly/safely).
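To make the comparison concrete, here's a rough sketch; the values and function names are made up for illustration, not the actual AWS config from TFA:

```python
import re

# Hypothetical allowed values -- not the actual strings from the setting in TFA.
ALLOWED = ["us-east-1", "us-west-2", "eu-central-1"]

# Regex version: easy to get subtly wrong (forgotten anchors, an unescaped
# metacharacter, a stray "|" that makes the empty string match, ...).
PATTERN = re.compile(r"^(us-east-1|us-west-2|eu-central-1)$")

def allowed_by_regex(value: str) -> bool:
    return PATTERN.match(value) is not None

# List version: trivially correct, trivially reviewable, trivially testable.
def allowed_by_list(value: str) -> bool:
    return value in ALLOWED

assert allowed_by_regex("us-east-1") and allowed_by_list("us-east-1")
assert not allowed_by_regex("us-east-10") and not allowed_by_list("us-east-10")
```

The regex version isn't hard, but every failure mode it has (missing anchors, an unescaped dot, a stray `|`) simply doesn't exist in the list version.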
This is too hot a take. Regular expressions are used in some cases where they shouldn't be, yes, but there's also been a ton of code that used other string operations and had bugs, due to complexity or edge cases, which would have been easier to avoid with a regex. You should know both tools and when they're appropriate.
From an educational perspective, regular expressions are also a great way to teach about state machines, computational complexity, formal languages, and grammars in a way that has direct applications to tools that are long-lived and ubiquitous in industry.
It's also this context that reveals how much simpler strict regular expressions are than general purpose programming languages like Python or JavaScript. That simplicity is also part of what makes regexes so ubiquitous: due to its lower computational complexity, regex parsing is really fast and doesn't take much memory.
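To make that concrete: a strict regular expression compiles down to a finite state machine that scans the input once, in time linear in the input and with constant memory. Here's a toy hand-rolled sketch for the language a(b|c)*; the state names and transition table are my own, not how any real engine is implemented:

```python
# A tiny hand-built DFA equivalent to the regular expression a(b|c)* --
# a toy illustration of why matching a *strict* regex is linear time and
# constant space, not a model of any real engine's internals.
DFA = {
    ("start", "a"): "accept",
    ("accept", "b"): "accept",
    ("accept", "c"): "accept",
}

def matches(s: str) -> bool:
    state = "start"
    for ch in s:
        state = DFA.get((state, ch))
        if state is None:          # no transition: reject immediately
            return False
    return state == "accept"       # accept only if we end in an accepting state

assert matches("a") and matches("abcb")
assert not matches("") and not matches("ba") and not matches("abd")
```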
When I say regexes are simple, I'm not really talking about compactness. I mean low complexity in a computational sense! As someone who rather likes regex, I think it would be totally fair for a team to rule out all uses of PCRE2 that go beyond the scope of regular languages. Those uses of regex may be compact, but they're no longer simple.
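Here's the kind of thing I mean, with toy patterns of my own (nothing from TFA): a backreference takes you outside regular languages, and a backtracking engine makes you pay for that expressive power:

```python
import re

# Backreferences step outside regular languages entirely: this toy pattern
# matches exactly the strings of the form "ww" (some word repeated twice),
# which no finite state machine can recognize.
doubled = re.compile(r"^(\w+)\1$")
assert doubled.match("abcabc") is not None
assert doubled.match("abcabd") is None

# And backtracking engines pay for that power: a pattern like (a+)+$ can
# take time exponential in the input length on non-matching input -- the
# classic "catastrophic backtracking" / ReDoS case.
# (Left commented out so the snippet stays fast to run.)
# re.match(r"(a+)+$", "a" * 40 + "b")
```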
I'm also someone who is sensitive to readability-centered critiques of terse languages. Awk, sed, and even Bash parameter expansion can efficiently do precise transformations, too. But sometimes they should be avoided in favor of solutions that are more verbose, more explicit, and involve less special syntax. (Note also that Bash, awk, and sed are also all much more complex than regex!)
Regex is not used for parsing HTML or C++ code. So it is not good for complex tasks.
What is the claim? That it is compact for simple cases. Well Brainfuck is a compact programming language but I don't see it in production. Why?
Because the whole point of programming is that multiple eyeballs of different competence are looking at the same code. It has to be as legible as possible.
> Regex is not used for parsing HTML or C++ code. So it is not good for complex tasks.
Again, this is too binary a way of thinking. There are string-matching operations which are not parsing source code, and regular expressions can be a concise choice there. I've had cases where someone wrote multiple pages of convoluted logic trying to validate things, and the regular expression was not only much easier to read but also correct, because somewhere around the third else-if block they'd missed a detail.
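A made-up illustration of the shape I mean (not the actual code I reviewed), validating something like a dotted version string:

```python
import re

# Hypothetical example: validate a dotted version string like "1.12.3".
# The hand-rolled version below is the shape I keep running into in reviews.
def is_version_handrolled(s: str) -> bool:
    parts = s.split(".")
    if len(parts) != 3:
        return False
    for part in parts:
        if not part:
            return False
        for ch in part:
            # subtle: str.isdigit() also accepts characters like "²",
            # which the regex's \d would reject -- the kind of detail
            # that gets missed in the third else-if block
            if not ch.isdigit():
                return False
    return True

# The regex states the whole shape in one line a reviewer can check at a glance.
VERSION = re.compile(r"^\d+\.\d+\.\d+$")

def is_version_regex(s: str) -> bool:
    return VERSION.match(s) is not None

assert is_version_handrolled("1.12.3") and is_version_regex("1.12.3")
assert not is_version_handrolled("1..3") and not is_version_regex("1..3")
```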
What is the quality-first, high uptime alternative to GitHub? My employer uses both GitHub and GitLab, and while I think GitLab is better, its quality also frankly sucks. It's riddled with bugs that have just been marinating on the issue tracker for years, and the most common "fix" for gnarly bugs in the CI platform is "revise the documentation to reflect the existing (broken) behavior".
*Stupid question*: What is so hard about self-hosting one's own repo? I get that it must be difficult for a mega corporation, but for companies like us -- hundreds of repos, only 20 of them regularly used, and relatively light concurrent read/write, since our largest team is fewer than 20 people -- it doesn't seem like it would be a huge issue even if everyone were reading and writing at once.
Even for a bigger company with, say, 5x the developers (we have about 100+ SWEs and maybe 10-20 people with other titles who use GitHub), is it really a big deal to self-host their own repos? External applications are definitely on another level, because you could easily have hundreds of concurrent visits.
> What is so hard about self hosting one's own repo?
Maybe nothing! I was genuinely asking. I still don't know what Actually Good™ forges are out there these days, generally suitable for corporate use in place of the likes of GitHub or GitLab. Forgejo? Something not based on Git?
I guess self-hosting GitHub is the easiest second step for companies that already use GitHub? It does have a lot of niceties built around git, which is pretty crude on its own.
It's amazing: even before we had ChatGPT, GitLab was building endless half-baked slop in its pursuit of ever more "enterprise checkboxes". Now they've slowed right down, no doubt collapsing under the escalating maintenance weight of all the nonsense that was created, like canaries in the vibe-coding mines telling us of impending doom.
Now you go to their blog, there's a banner at the top announcing "GitLab Agentic AI whatever is GA (GENERAL AVAILABILITY)", and when you try to click it, it's literally a fucking 404 Not Found. That's the level of their stability and quality. Try it for yourself:
Years ago, this book provided me with a useful introduction to the history of immigration to the United States and various crackdowns (vigilante and official) against it.
It's not a difficult read, but its authors are leftists and the language may sometimes be difficult for readers with sensitivities related to the goodness of Democrats or Republicans or whatever.
(I think maybe I'll re-read it today as well; it's been a long time.)
> filtered out at some stage (an early one being college intro CS classes)
Most schools' CS departments have shifted away from letting introductory CS courses perform this function; they go out of their way to court students who have no motivation for, or interest in, computer science fundamentals. Hiring rates for computer science majors are good, so anything that ups enrollment numbers makes the school look better on average.
That's why intro courses (which were often already paced painfully slowly for anyone with talent or interest, even without any prior experience) are being split into more gradual sequences, why Python has gradually replaced Scheme virtually everywhere in schools (access to libraries subordinating fundamental understanding even in academia), why the major's math requirements have been relaxed, and so on.
Undergraduate computer science classrooms are increasingly full of mercenaries who not only don't give a shit about computer science, but lack basic curiosity about computation.
From my dated experience in a CS-adjacent major, I'm torn between "that's bad, people need to care about the craft" versus "that's good, CS was a bit too ivory-tower/theory focused".
As someone who ended up getting two bachelor's degrees so that I could somewhat deeply explore diverse subjects, I think schools would do well to have strong, distinct programs in:
- computer science
- computer engineering
- software engineering
- mathematics
- some kind(s) of interdisciplinary programs that interweave computing with fine arts, liberal arts, or business, e.g.:
  - digital humanities
  - information science
  - idk what other disciplines
and to generously cross-list courses taught in one department but highly relevant in another, for use as electives in adjacent minors and majors.
IIRC, when I was in school, my university only had programs in "computer science", "electrical and computer engineering", "management information systems", "mathematics", and an experimental interdisciplinary thing they called "information science, technology, and the arts". Since then, they've created a "software engineering" major, which I imagine may have alleviated some of the misalignment I saw in my computer science classes.
I loved the great range of theory classes available to me, and they were my favorite electives. If there had been more (e.g., in programming language design, type theory, or functional programming), I definitely would have taken them. But if we'd had a software engineering program, I likely would have tried to minor in that as well!
To me, it's an old-school liberal art (like geometry and arithmetic) that specialists typically pursue as a formal science (that is, a science of logical structure rather than experimentation, like mathematics or Chomskyan grammar). The engineering elements that I see as vital to computer science per se are not really software engineering in the broadest sense, but mostly about fundamentals of computing that are taught in most computer science programs already (compilers, operating systems, binary operations, basic organization of CPUs, mainframes, etc.).
My computer science program technically had only one course on software engineering per se, and I think schools should really offer more than that. In fact, I think that's not enough even within a "computer science" program. But I think the most beneficial way to provide courses of broader interest is with "clear but porous" boundaries between the various members of this cluster of related disciplines, rather than revising core computer science curricula to court students who aren't really interested in computer science per se.
> Without exception, every technical question I've ever asked an LLM that I know the answer to, has been substantially wrong in some fashion.
The other problem that I tend to hit is a tradeoff between wrongness and slowness. The fastest variants of the SOTA models are so frequently and so severely wrong that I don't find them useful for search. But the bigger, slower ones that spend more time "thinking" take so long to yield their (admittedly better) results that it's often faster for me to just do some web searching myself.
They tend to be more useful the first time I'm approaching a subject, or before I've familiarized myself with the documentation of some API or language or whatever. After I've taken some time to orient myself (even by just following the links they've given me a few times), it becomes faster for me to just search by myself.
There's a superficial relationship because XFCE is often configured with a dock-like taskbar, but GNUstep is a GNU clone of Cocoa and the window manager from NeXTSTEP. It tries to mimic early macOS a bit more deeply.
> I think there is a section of programmer who actually do like the actual typing of letters, numbers and special characters into a computer.
I don't think this is really it for many people (maybe any); after all, you can do all of that when writing a text message rather than a piece of code.
But it inches closer to what I think is the "right answer" for this type of software developer. There are aspects of software development that are very much like other forms of writing (e.g., prose or poetry).
Like other writing, writing code can constitute self-expression in an inherently satisfying way, and it can also offer the satisfaction of finding "the perfect phrase". LLMs more or less eliminate both sources of pleasure, either by eliminating the act of writing itself (that is, choosing and refining the words) or through their bland, generic, tasteless style.
There are other ways that LLMs can disconnect the people using them from what is joyful about writing code, not least of all because LLMs can be used in a lot of different ways. (Using them as search tools or otherwise consulting them rather than having them commit code to simply be either accepted/rejected "solves" the specific problems I just mentioned, for instance.)
There is something magical about speaking motion into existence, which is part of what has made programming feel special to me, ever since I was a kid. In a way, prompting an LLM to generate working code preserves that and I can imagine how, for some, it even seems to magnify the magic. But there is also a sense of essential mastery involved in the wonderful way code brings ideas to life. That mastery involves not just "understanding" things in the cursory way involved in visually scanning someone else's code and thinking "looks good to me", but intimately knowing how the words and abstractions and effects all "line up" and relate to each other (and hopefully also with the project's requirements). That feeling of mastery is itself one of the joys of writing code.
Without that mastery, you also lose one of the second-order joys of writing code that many here have already mentioned in these comments: flow. Delegation means fumbling in a way that working in your own context just doesn't. :-\
Bringing random hardware from vendors who never intended to support an OS is a weird criterion to judge an OS's "readiness" by, and one that no one seems to apply to macOS or Windows.