Awesome to see this. Like a few others here, I hand-rolled (well, Codex-rolled) something similar that works great for me. I keep going back and forth on open-sourcing it, but my hunch is people won't really adopt these kinds of things anyway.
Everyone ends up with a workflow shaped really tightly around how they work, and it's gotten so cheap to just build and evolve your own as the models and harnesses change that picking up someone else's stops making much sense.
I think we can consider this among the positive consequences of LLMs. Building software is cheaper, and you no longer have to adapt your company processes to the tools available on the market. If you don’t find what you’re looking for, you can build it and actually see if there’s market interest in it.
Aren't we losing something there too, though? I always respected a company with a product that had "things figured out" and pushed their product in conjunction with a way of working that was well researched and proven to work.
I'm not convinced companies always need software tailored to their workflows; many could benefit from adopting well-worn workflows instead.
> I'm not convinced companies always need software tailored to their workflows; many could benefit from adopting well-worn workflows instead.
I’m dubious, because for an established company the question is whether the software adapts to the org or the org adapts to the software. It’s a lot harder to change the workflow of a whole company than to buy software that enables your current workflow. There are months of retraining, figuring out where compliance goes in the new workflow, things that get done wrong along the way because it’s new, and so on.
You need a pretty big efficiency win to offset the dead weight of time spent just changing workflows.
That makes sense when things are mostly stable and it makes little sense for most teams to work outside the norm.
Currently, though, we are in a world where things change every week: model capabilities, harnesses, pricing, etc. Forcing a norm won't work, because there is no such norm.
I am fully convinced companies actually lose money because they have a bunch of employees who waste time “bending reality”, thinking they need a custom workflow because “they are so specialized”.
Ian Rush said it best: "It's best being a striker. Miss five, score the winner, you're a hero. The goalkeeper plays a blinder, lets one in, and he's a villain."
Every place I've worked rewards the firefighter over the person who made sure nothing ever caught fire. And the worst part is the math is obvious to everyone except the people who set the incentives.
How would you set the incentives, though? Almost by definition, it's hard to reward things that aren't visible.
Note that there is also the flip side of the coin: people who spend all their time worrying about things that never happen. So it's not like you can just reward a defensive attitude; things are more complicated than that.
The equivalent of 'days since last injury' bonuses is the first method that comes to mind, until you consider that this would mean people would be more likely to hide things going wrong.
So a lot ends up relying on executive culture, and an executive who will walk the line and get their info from people at the bottom is like a unicorn. That won't scale, but it does work if you do have such an executive. Naturally they would need a basic understanding of how supply is created in their firm.
Yet there is something. Toyota Hiluxes and Honda Super Cubs got popular due to maintenance ease. AK-47s. Miele vacuums. Older Thinkpads.
What measures would make the human equivalent visible?
One basic example is not counting bugs as points in your ticket tracker. At my last job I had coworkers whose velocity was almost double everyone else’s but it was because they kept deploying and then fixing their own bugs.
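To make that concrete, here's a minimal Python sketch of the idea. The `Ticket` shape, the names and the point values are all made up for illustration and don't come from any real tracker; the only point is that bug-fix tickets are left out of the per-person velocity total.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    assignee: str
    points: int
    kind: str  # hypothetical field: "feature" or "bug"


def velocity(tickets, count_bugs=False):
    """Sum story points per assignee, skipping bug tickets unless count_bugs is True."""
    totals = {}
    for t in tickets:
        if t.kind == "bug" and not count_bugs:
            continue
        totals[t.assignee] = totals.get(t.assignee, 0) + t.points
    return totals


# Made-up sprint data: bob ships one feature, then fixes his own bugs three times.
tickets = [
    Ticket("alice", 5, "feature"),
    Ticket("alice", 3, "feature"),
    Ticket("bob", 5, "feature"),
    Ticket("bob", 3, "bug"),
    Ticket("bob", 3, "bug"),
    Ticket("bob", 4, "bug"),
]

naive = velocity(tickets, count_bugs=True)  # bugs earn points
adjusted = velocity(tickets)                # bugs earn nothing
print(naive, adjusted)
```

With this invented data, bob looks almost twice as fast as alice when bugs count (15 vs 8), and slower once they don't (5 vs 8).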
I don't have an answer, and you are mostly correct. I received some advice based on this that made sense, which was to pick the roles in your career that naturally make it easier: sales, PM, dev, etc., and not support, DevOps, escalation management, CSM, etc.
I don't think this comparison really works. The firefighter would be a goalie or a defender, and like you said, in sports they are less appreciated/compensated for a simple reason: usually they don’t bring in views. There are exceptions ofc, like Pippen or Seaman.
Wondering if enterprises have a modified version of CC that doesn't have to optimize to stop the bleeding on fixed-cost subscription plans.
The article really does not align with the current sentiment. Everyone with a choice has mostly moved on to codex (ofc in this world all it takes is a model update/harness update to turn things around).
CC is great at a lot of things, but it repeatedly misses reading crucial parts of the codebase, hallucinates about the work that was done, and has a bunch of other issues.
The influencer economy trades on hype, on frenzy, and ultimately, eyeballs. The more the better.
They want you to feel like you’re missing out. They want you to switch. Being boring is far more productive. Pin your versions. Stick to stable releases and avoid the nightlies.
The significant noise around the Opus 4.6 to 4.7 transition caused some to interpret it as signal. Setting aside a few genuine bugs, the talk of perceived quality falling dramatically was just that: noise. Influencers doing influencing turned it into “signal”. The reality was that if you had strong planning and spec-driven development, the impact ranged from manageable to non-existent.
The vast majority of the people I know and work with have not switched off CC or their Max sub.
I have a choice and have not moved to Codex (100/mo personal + my employer pays for a subscription). I try Codex here and there and it seems to go off the rails every time. I have had some good experiences with Codex, but generally, when I try to get something big accomplished, it doesn't work out.
But I may not have paid enough to get the full real experience with Codex.
I use Codex at home for 20 bucks a month, and the limits are very high relative to the price. Maybe the gravy train ends soon for these, and then it's probably on to Chinese models via OpenRouter.
At work it's CC or sometimes Codex. Personally I don't see much difference at all, and most normies will notice none. The cultists have their opinions.
What bleeding? Anthropic wants as much of that "bleeding" as possible. The interaction data gathered from genuine human CC subscription usage of their models goes directly into their RL training. It's invaluable, and they are more than happy to lose money on the inference to get it. That data is what xAI was recently willing to pay $10b to Cursor to get.
They want you to use Claude Code. They hate other UI surfaces like OpenCode etc. purely because they lose control over that data, so they're subsidizing the inference without getting what they actually want: the data. (They still get some of it of course, but it's much less ergonomic for them. Those tools often abstract away the subagent calls, for example.) OpenCode can collect that data themselves, so by allowing subscription use there, Anthropic sees itself as subsidizing another org getting that data. Hard no.
And tools like OpenClaw are useless because they're mechanical and don't represent actual users interacting with the service - again, subsidizing but not getting the reward.
It's all very simple once you understand their motivations.
I think it's a good rule of thumb that if you find yourself saying everyone prefers this model or that model, you're in a bubble. I've made this mistake before: I used to go around saying everyone knew Claude was the only model for serious professional use, but I was wrong.
I always assume that people making those comments on HN are trying to convince others to switch to their model. Surely no one actually believes their friend circle is a representative sample of the hundreds of millions of people that use these LLMs?
Btw the guy in charge of that stuff for Anthropic is the same guy who said GPT-2 was too dangerous to release, Jack Clark. LMAO. That model could barely string a sentence together.
You must be using a different CC. Or what they’re writing here is correct, and it’s all due to the CLAUDE.md file that I only occasionally yell at Claude.
Hmm, please share more. I have had the Max CC sub since it came out. I religiously follow all of Boris/Cat's advice but still struggle with it. Meanwhile a really badly written AGENTS.md will still get the work done.
I find that most “techniques” are basically user hallucinations. Simple plan-write-refactor loops and trivial CLAUDE/AGENTS.md, generated by the harness itself, work nicely. Maaaaaaaaaybe write a skill or two, but usually it’s better to just write a script.
Would be nice to see if this number dipped from before. International students typically end up paying out-of-state tuition, which is a huge source of income for the universities.
This is not true for PhD programs in top-ranked institutions. It may have been true 20+ years ago, but today it is very difficult to buy your way into a graduate program.
That is much less true of grad programs in technical fields. At the undergrad level, international students are indeed more likely to pay full-boat--or at least larger boat--than US applicants.
Docker can be small too. In this example I was able to compile a full server (Rust binary) and package it in a Docker image (scratch base), and the total was < 5MB.
Great question, @Tsarp - Skills and tools work great together. What we've found is that agents generally need both to achieve great results. We're actually not trying to replace skills, but to give them new superpowers.
Are there any examples you've run into where skills were missing tools (or data) that they needed for a specific task?
Hmm, hoping this isn't a generic LLM-generated response.
Skills have the scripts folder and you can precisely describe when and when not to use a script. This can end up directly wrapping API(s), CLIs, generic scripts or even other MCP servers.
CC and codex both have the skill creator and you can have them build the skill for you.
Haven't run into any scenarios where skills were missing tools. 1-2 iterations and it's usually taken care of quite quickly.
Hey, fair enough. (100% human here, btw.) I think I misread your original question to be asking "why do we need a service (whether accessed via API/SDK/MCP/etc.) vs. just having skills (markdown + scripts)".
If you are already leveraging scripts and APIs in your skills, then you understand the distinction. I'll attempt to re-answer your question with what is hopefully a better understanding now:
I think Airbyte Agents helps your agent by giving access to data across any and all of the systems it may need to get data from, or write data to. While you could hit the service APIs directly (via REST/CLI/etc.), in practice we find that not all use cases are amenable to this. Airbyte Agents does have REST APIs as well as SDKs and of course the MCP interface - so it's not really about MCP tools specifically, more about how you can access the data. The Airbyte Agents interface also reduces the number of creds that the agent needs to handle, giving a single portal (with logging and audit capabilities) for all the actions your agent is taking.
Sorry for the red herring of skills-v-tools. Let me know if you have any additional questions!