IPA makes these conversations less ambiguous. The point is that parts of the South are more likely to use an "ah" sound rather than an "oh" sound in certain places. The BBC's example (supposing it's in good faith) is lacking because it drops the second half of the diphthong following that morphed vowel.
Attempting to write out something close to what I'm imagining they're trying to get across in plain English:
hell-ah-ooh
It's obviously not universal across the South, but you'll rarely see it outside of the South, so "might suggest you're from..." is probably accurate.
This is good. It covers the two easiest and most dominant methods people use. It even touches on my main complaint about the one they seem to recommend.
That said:
- Constrained generation yields a different distribution from the one a raw LLM would produce, and that can be pathologically bad. My go-to example is LLMs preferring to insert ellipses in long, structured objects. Constrained decoding then forces closing quotes or whatever else the schema demands to recover from that truncation, so you get output that parses and matches the schema but whose contents are still invalid. Resampling instead tends to repeat until the LLM fully generates the data in question, always yielding a valid result which also adheres to the schema. It can get much worse than that.
- The unconstrained "method" has a few possible implementations. Increasing context length by complaining about schema errors is almost always worse from an end-quality perspective than just retrying till the schema passes. Effective context windows are precious, and current models bias heavily toward the earlier data fed into them. In a low-error regime you might get away with a "try it again" turn in a single chat, but in a high-error regime you'll get better results at a lower cost by literally re-sending the same prompt till the model doesn't cause errors (a minimal sketch of that loop follows below).
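To make the resampling loop concrete, here's roughly what I mean in Python. `llm_complete` is a stand-in for whatever completion call you're using, and the schema is a toy example (validated with the jsonschema package):

```python
import json
import jsonschema  # pip install jsonschema

SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "tags"],
}

def generate_validated(prompt: str, llm_complete, max_attempts: int = 5) -> dict:
    """Re-send the *same* prompt until the output parses and passes the schema."""
    for _ in range(max_attempts):
        raw = llm_complete(prompt)  # hypothetical text-in, text-out LLM call
        try:
            obj = json.loads(raw)
            jsonschema.validate(obj, SCHEMA)
            return obj  # valid JSON *and* schema-conformant
        except (json.JSONDecodeError, jsonschema.ValidationError):
            continue  # discard and resample; do NOT feed the error back
    raise RuntimeError("model never produced schema-valid output")
```

The key detail is that nothing from a failed attempt leaks into the next one: each retry is a fresh sample, rather than a longer context biased by its own error messages.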
> Increasing context length by complaining about schema errors is almost always worse from an end quality perspective than just retrying till the schema passes.
Another way to do this is to use a hybrid approach. You perform unconstrained generation first, and then constrained generation on the failures.
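Sketched in Python, with hypothetical `llm_complete` / `llm_complete_constrained` calls standing in for the two decoding modes:

```python
def generate_hybrid(prompt: str, llm_complete, llm_complete_constrained, validate):
    """Unconstrained generation first; constrained decoding only on failure.

    `validate` parses and schema-checks the raw text, raising ValueError
    when it fails.
    """
    raw = llm_complete(prompt)  # plain, unconstrained sampling
    try:
        return validate(raw)
    except ValueError:
        # Only the failing minority pays the distribution-shifting
        # cost of grammar-constrained decoding.
        return validate(llm_complete_constrained(prompt))
```

The bet is that most outputs validate on the free-form pass, so the constrained decoder only ever touches the tail.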
There's no difference in the output distribution between always doing constrained generation and only doing it on the failures though. What's the advantage?
4) It's easy to get too excited about the tech and ignore its failure modes when describing your experiences later
I use AI a lot. With your own control plane (as opposed to a generic Claude Code or whatever) you can fully automate a lot more things. It's still fundamentally incapable of doing tons of tasks at any acceptable quality level, though, and I strongly suspect all of (2,3,4) are driving the disconnect you're seeing.
Take the two things I've been working on this morning as an example.
One was a one-off query. I told it the databases it should consider, a few relevant files, roughly how that part of the business works, and asked it to come back when it finished. When it was done I had it patch up the output format. It two-shot (with a lot of helpful context) something that would have taken me an hour or more.
Another is more R&D-heavy. It pointed me to a new subroutine I needed (it couldn't implement it correctly though) and is otherwise largely useless. It's actively harmful to have it try to do any of the work.
It's possible that (1) matters more than you suspect too. AI has certain coding patterns it likes to use a lot which won't work in my codebase, and it can't one-shot the things I want. It can, however, follow a generic step-by-step guide: generate those better ideas, translate worse ideas into things that will be close enough to what I need, identify where it messed up, and refactor into something suitable, especially if you take care to keep context usage low and whatnot. A lot of people seem to be able to get away with CLAUDE.md or whatever, but I like having more granular control over what the thing is going to be doing.
This seems easy enough to solve. Every time the football oligarchy catches too many IPs in their dragnet, you can accidentally drag all the legitimate football exit nodes into your DNS blacklist. The only way to be sure DNS doesn't work for pirated football is to ensure it doesn't work for any football.
Just last weekend I developed a faster Reed-Solomon encoder. I'm looking forward to my jail time when somebody uses it to cheaply and reliably persist bootlegged Disney assets, just because I had the gall to optimize some GF(256) math.
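(For the curious, the "GF(256) math" is just arithmetic in the finite field GF(2^8). A toy Python version of the core multiply looks like the below; the actual speedups come from replacing this loop with log/antilog tables or SIMD.)

```python
def gf256_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply in GF(2^8) via shift-and-xor.

    0x11D (x^8 + x^4 + x^3 + x^2 + 1) is the reduction polynomial commonly
    used by Reed-Solomon codes; addition in this field is plain XOR.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a    # add (XOR) the current shifted copy of a
        b >>= 1
        a <<= 1
        if a & 0x100:      # degree reached 8: reduce mod the polynomial
            a ^= poly
    return result
```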
That is not what I said. It is about signalling risks to developers, not criminalising them. And in terms of encoders, I would say it relates more to digital 'form' than 'content' anyways, the container of a creative work vs the 'creative' (created) work itself.
While both can be misused, to me the latter category seems to afford a far larger set of tertiary/unintended uses.
To be fair, those macronutrient guidelines were established not because of any special properties of those macros (give or take the nitrogen load from protein) but because, when applied as a population-level intervention, they encourage sufficient fiber, magnesium, potassium, etc. You can have 50% of your calories be from fats and still live a long, healthy life, and you can do so as a population (see, e.g., Crete and some other Mediterranean sub-regions in the early/mid-1900s). You can have a much higher protein intake and have beneficial outcomes too.
Your point about the sources mattering isn't tangential; it's the entire point. The reason the AMDR exists is to encourage good sources. A diet of 65% white sugar and 25% butter isn't exactly what it had in mind though, and it's those sources you want to scrutinize more heavily.
Even for red meat though, when you control for cohort effects, income, and whatnot, and examine just plain red meat without added nitrites or anything, the effect size and study power diminish to almost nothing. It's probably real, but it's not something I'm especially concerned about (I still don't eat much red meat, but that's for unrelated reasons).
To put the issue in perspective: take the 18% increased relative risk of colorectal cancer from red meat as gospel (ignoring my assertion that it's more important to avoid hot dogs than lean steaks), or, hell, double it to 36%. The increased risk of death from the intervention of adding a significant portion of red meat to your diet is still only about half as impactful as the intervention of adding driving to your daily activities.
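(Back-of-the-envelope, to show why a big relative number can still be a small absolute one. The baselines below are illustrative placeholders, not sourced figures; substitute real ones before drawing conclusions.)

```python
# Illustrative placeholders only -- the point is the shape of the arithmetic:
# a large *relative* increase on a small baseline is a small *absolute* risk.
baseline_crc_death_risk = 0.02       # assumed lifetime risk of dying from colorectal cancer
relative_increase = 0.36             # the doubled 18% figure from above
added_absolute_risk = baseline_crc_death_risk * relative_increase

driving_lifetime_death_risk = 0.014  # assumed lifetime risk of dying in a car crash

print(f"added absolute risk: {added_absolute_risk:.2%}")  # 0.72%
print(f"vs taking up driving: {added_absolute_risk / driving_lifetime_death_risk:.2f}x")  # ~0.51x
```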
The new guidelines seem to be better than just recommending more steaks anyway. They're not perfect, but I've seen worse health advice.
Well, there are two factors that go into the recommendations. As you mentioned, one is adequate micronutrients. The other is chronic disease risk reduction. The recommendation to keep saturated fat at 5-10% of total calories falls into the latter category. The risk from meat and dairy is not just cancer but also saturated fat.
I would agree that with proper knowledge and planning, it's possible to reduce carbs and increase protein/unsaturated fats while maintaining adequate fiber and micronutrients. But in practice, I think it's much more common to see people taking low-carb diet recommendations as a license to eat a pound or more of meat per day, drink gallons of milk per week, and completely ignore fiber intake, which is objectively not healthy.
On the one hand, that obviously filters out many qualified candidates.
On the other, you only have so much time in the day. It'd take me 3-6 months to give phone screens to every resume that comes in the door for any one engineering role, 8x that for a full 4-hour interview. I have to filter through them somehow if it's my job to hire several people in a month.
You'll obviously start with things that are less controversial: Half of resumes are bot-spam in obvious ways [0]. Half of the remainder can easily be tossed in the circular filing bin by not having anything at all in their resume even remotely related to the core job functions [1].
You're still left with a lot of resumes, more than you're able to phone screen. What do you choose to screen on?
- "Good" schools? I personally see far too much variance in performance to want to use this as a filter, not to mention that you'd be competing even more than normal on salary with FAANG.
- Good grades? This is a better indicator IME for early-career roles, but it's still a fairly weak signal, and you also punish people who had to take time off as a caretaker or who started before they were mature enough or whatever.
- Highest degree attained? I don't know what selection bias causes this since I know a ton of extremely capable PhDs, but if anything I'd just use this to filter out PhDs at the resume screening stage given how many perform poorly in the interviews and then at work if we choose to hire them.
- Gender? Age? ... I know this happens, but please stop.
If there's a strong GitHub profile or something then you can easily pass a person forward to a screen, but it's not fair to just toss the rest of the resumes. They have a list of jobs, skills, and accomplishments, and it's your job to use those as best as possible to figure out if they're likely to come out on top after a round of interviews.
I don't have any comment on rails in particular, but for a low-level ML role there are absolutely skills I don't want to see emphasized too heavily -- not because they're bad, but because there exists some large class of people who have learned those skills and nothing else, and they dominate the candidate pool. I used to give those resumes a chance, and I can't accept 100:1 odds anymore on the phone screen turning into a full interview and hopefully an offer. It's not fair to the candidates, and I don't have time for it either.
And that's ... bad, right? I have some things I do to make it better in some ways (worse in others, but on average trying to save people time and not reject too many qualified candidates) -- pass resumes on to a (brief) written screen instead of outright rejecting them if I think they might have a chance, always give people a phone screen if they write back that I've made a mistake, revisit those filtering rules I've built up from time to time and offer phone screens anyway, etc. Hiring still sucks on both sides of the fence though.
[0] One of my favorites is when their "experience" includes some hyper-specific task copy-pasted from the job description (which exists not as a skills requirement but as a description of what their future day-to-day looks like), claimed from before we pioneered the tech in question, at several FAANG companies, using languages and tools those companies don't use and which didn't exist during their FAANG tenure. Maybe they just used an LLM incorrectly to touch up their resume, but when the only evidence that I should interview you is a pack of bald-faced lies, I'm not going to give the benefit of the doubt.
[1] And I'm not even talking about requiring specific languages or frameworks, or even having interacted with a database for a database-adjacent role; those sorts of restrictions can often be too overbearing. I mean just the basics of "I need you to do complicated math and program some things that won't wake me up at night," with nothing on the resume suggesting they've ever done either at any level of proficiency (and no forward or cover letter explaining why the resume looks bare-bones and why they deserve a shot anyway).
IIRC it was released with iOS 6, around the iPhone 5 launch (2012). It was so abysmal for a road trip I took that I went to Android and haven't looked back (they also removed Google Maps for a bit, and the web version wasn't suitable). It wouldn't have been top of mind for me in 2016, but I wouldn't have been surprised at somebody telling me Apple Maps sucked.