> In this case though, I'll admit that it would be a negative signal if they never tried it even once and refuse to do so
I was arguing elsewhere that this is a bad question, but you've provided another reason. If a candidate tells you they haven't tried using AI (without saying why), that offers no signal about how well they'll do if you hire them, yet you construe it as a negative signal. If you want to know whether they would be willing to use AI as part of the job, ask that question instead!
> The most impressive thing that stuck with me is that humans are incredibly efficient, from an energy perspective, in anything we do, compared to machines.
Humans are efficient, but not across the board. Trivial counterexample: walking is incredibly energy-inefficient compared with a bicycle or other wheeled conveyance, whose primary dissipater is rolling resistance.
We're still pretty efficient despite not having wheel-shaped limbs. Running the way humans do works pretty well. So well, in fact, that we can chase a lot of animals longer than they can outrun us.
There might be more efficient ways to move but we are pretty well equipped by evolution.
It's not strange at all: I was responding to a specific, incorrect claim. I even quoted the wrong claim in my earlier comment, and I'll repeat it again, with added emphasis:
>>> humans are incredibly efficient, from an energy perspective, in anything we do, compared to machines
I simply provided contrary evidence to a well-defined, falsifiable claim. How is that strange?
Yes, but walking and moving on wheels are apples and oranges. It would be a relevant comparison if a robot with a movement mechanism based on two feet were more efficient than a human.
> On the other hand, even if Apple's AI were 6 - 9 months or a generation behind,
Do you mean Google's AI with Apple wrappers? Apple's in-house AI is further behind Google's, and very far from the frontier according to your ranking. IMO, Google is on the frontier: I recall Altman calling for an OpenAI all-hands-on-deck when Gemini was released because of how good it was compared to ChatGPT. I also suspect Google has the lowest operating expenses due to scale, experience, and luck/planning (TPUs). There will come a time when AI investments slow down and the cost of revenue becomes more important.
> ...the research progresses before the inevitable nationalization of the frontier.
Hacker News has been telling me America beats China at "innovation" because of the "freedoms" - especially free enterprise. I wonder how a nationalized frontier lab would perform... and how the non-citizen researchers would feel about working for the US government that doesn't trust them to use frontier models.
Model effectiveness has improved across model sizes. You really should try the latest flash variants more. They have become my default for most tasks except for gnarly high-level planning.
Right - the idea that "bigger model = better" might have been true a year ago, but the flash models are extremely effective right now. You simply use them for the tasks they are ideally suited for.
"Capability per parameter" is rising, but parameter count remains an advantage. And small models remain bad, because "good" is a rapidly moving target.
A 2026 4B beats 2024 4B, but both are far behind the contemporary frontier. Which makes them bad. There is no such thing as "too much capability" - a "good" model is whatever the current frontier is.
In 2024, a "good" model is one that can be trusted to write an 800-line script. In 2026, it's a model that can be trusted to do gnarly high-level planning and execution both. In 2028, it's going to be something like a model you can point at an extremely involved task, abandon, and have it report back with a "done" in 3 weeks.
> A 2026 4B beats 2024 4B, but both are far behind the contemporary frontier.
The thing about engineering is you don't just use the biggest bolt on the market on every bridge.
> In 2024, a "good" model is one that can be trusted to write an 800-line script. In 2026, it's a model that can be trusted to do gnarly high-level planning and execution both
This sounds a lot like having a single diamond-head hammer as the only tool in your toolbox. As suggested by the name, flash models are fast - sometimes I want to write the equivalent of fifty 800-line scripts. There is such a thing as good enough.
Good enough? That's a lie people tell each other because they lack imagination.
"It's good enough" was said about GPT-4, o1, o3, Opus 4 and more. Guess what happened? Newer models released, people updated their expectations of what LLMs can do, usage got more aggressive, and somehow, GPT-4 went from "good enough" to "obsolete trash".
If you have no imagination, then at least substitute your pattern recognition for it.
The world is hungry for capabilities. There are piles upon piles of tasks that aren't done by LLMs simply because LLMs aren't good enough to do them.
The thing a frontier model gives you is "you don't have to babysit a model to get it to do X", and that X gets more and more impressive release to release.
I wish you had addressed at least one of my arguments in good faith before jumping to insults and countering a strawman argument I didn't make - I never claimed there will be no use for more capable models.
You do your AI-maximalism, and I'll stick to making trade-offs based on the needs of each piece of work.
You call the response "cagey and evasive", but that is for an objectively bad interview question, one rung below "How many years experience do you have prompting Anthropic Opus? We are an Opus shop." People are not locked into their current way of using AI, and it is trivial to adapt how one works with AI to an employer's requirements. It's a question that deserves an idealized non-answer.
Remember the context - this is while solving a whiteboard problem. It's bad in the same way that asking a candidate what their birthstone is would be bad - because any answer offers little or no signal about the odds of the candidate's success at the company.
I'm curious to know why you think asking about AI usage is a good interview question.
If you're going to have people use AI regularly, it's worth asking so that you can get a sense of their interest/willingness, experience level, and training needs. That said, more specific questions are typically more revealing. Personally I'm fond of "If you could give Claude only 1 instruction, what would it be?"
> These six-figure reports are produced by underpaid kids in their twenties working 18 hours a day.
That's accurate, for the first draft. Similar to big legal firms - subsequent versions are signed off and passed up (and, if revisions are requested, down) the hierarchy, each stratum with its own billing rate(s).
Which makes me wonder when the hallucinations got added.
It can't have been at any of the big 4, because partners aren't skipping 4+ org-chart layers to look at draft documents written by early-career associates. I have no experience with body shops - if that's where you were.
> So, are you saying that the employees were exploited in some way?
Google ads "extracted" value from traditional advertising in newspapers and magazines, so the "exploitation" (or efficiency gains, if you're charitable) came at the expense of employees at other organizations worldwide.
> You can't make a solid opinion on things you never try after all.
What do you think about using heroin?