Hacker News | qudat's comments

Preload is the answer to speed. Basically, download the client's DB on init and then have cache invalidation strategies. I built starfx to perform the data sync aspect of this paradigm: https://starfx.bower.sh/learn#data-loading-strategy-stale-wh...
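To make the preload-then-invalidate idea concrete, here is a minimal sketch of a stale-while-revalidate cache. It is not starfx's API: the class `StaleWhileRevalidateCache`, its methods, and the `fetch` callback are all hypothetical names made up for illustration, and the refresh runs inline rather than in the background to keep the sketch single-threaded.

```python
import time
from typing import Any, Callable, Dict, Iterable, Tuple


class StaleWhileRevalidateCache:
    """Hypothetical sketch: preload on init, serve stale data, refresh after."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        max_age: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # assumed loader, e.g. an HTTP call
        self._max_age = max_age      # seconds before an entry counts as stale
        self._clock = clock          # injectable so staleness can be tested
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def preload(self, keys: Iterable[str]) -> None:
        # "Download the client's DB on init": fetch everything up front.
        for key in keys:
            self._store(key)

    def _store(self, key: str) -> Any:
        value = self._fetch(key)
        self._entries[key] = (value, self._clock())
        return value

    def get(self, key: str) -> Any:
        entry = self._entries.get(key)
        if entry is None:
            # Cache miss: nothing to serve, so fetch before returning.
            return self._store(key)
        value, fetched_at = entry
        if self._clock() - fetched_at > self._max_age:
            # Stale: refresh the entry, but still return the old value.
            # A real client would run this refresh in the background.
            self._store(key)
        return value

    def invalidate(self, key: str) -> None:
        # Explicit invalidation, e.g. after a local write.
        self._entries.pop(key, None)
```

The caller always gets an answer immediately from the preloaded data, and staleness only costs one extra fetch whose result shows up on the next read.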

I really like this idea and have been experimenting with it for a week or so.

I think there’s an opportunity to use an AST diff system for code forges where you don’t present the user with line diffs in the UI — or at least not as the first diff the user sees.
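A rough sketch of the difference, using Python's standard `ast` module as a stand-in: it diffs a text dump of the syntax tree, which is much cruder than a real tree-diff algorithm and is not how any particular forge or tool does it. The helper names `structurally_equal` and `ast_diff` are made up for this example.

```python
import ast
import difflib
from typing import List


def ast_lines(source: str) -> List[str]:
    # Parse the source and dump the tree one node field per line.
    # Comments, whitespace, and redundant parentheses are not part of the AST.
    return ast.dump(ast.parse(source), indent=2).splitlines()


def structurally_equal(old: str, new: str) -> bool:
    # True when the two sources parse to the same tree, even if the text differs.
    return ast.dump(ast.parse(old)) == ast.dump(ast.parse(new))


def ast_diff(old: str, new: str) -> List[str]:
    # Diff the dumped trees and keep only the added/removed node lines.
    diff = difflib.unified_diff(ast_lines(old), ast_lines(new), lineterm="", n=0)
    return [line for line in diff if not line.startswith(("---", "+++", "@@"))]


old = "def add(a, b):\n    return a + b\n"
reformatted = "def add(a,\n        b):\n    # sum the inputs\n    return (a + b)\n"
changed = "def add(a, b):\n    return a - b\n"

# A line diff would flag every line of `reformatted`; the tree is unchanged.
print(structurally_equal(old, reformatted))
# Only the operator node differs between `old` and `changed`.
print("\n".join(ast_diff(old, changed)))
```

A pure reformat shows up as no change at all, while the one-character edit from `+` to `-` is reported as a swapped operator node instead of a changed line.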

I firmly believe code review should happen in your editor.


Really glad you've been using it, and yeah, that's exactly the direction I've been thinking about. The line diff as the default view in code forges has always felt like an accident of history: it was definitely easy to compute, but it's not what's actually useful for understanding what changed.

I'm still thinking this through, but I was arguing this position to colleagues, to some shock: LLMs are a race to the bottom, and frontier labs will not be able to afford to work on coding-specific models (or coding features at all) in the very near future.

27B is already really good at coding-specific tasks. Fundamentally, there is little innovation on the core architecture: LLMs are all designed essentially the same, with minor differences in how they are trained. They are all feed-forward multi-headed attention models; it doesn't matter if it's a 4B model or a 1T model, that's just scale.

Further, the frontier models cannot afford to innovate: they have to scale as quickly as possible to "beat out" their competition. The frontier models fundamentally will not create the next "attention is all you need" monumental jump in AI.

Frontier companies are stuck on scale with zero capacity to innovate. You cannot point capitalism at "basic science research" and expect any ROI. This is a known reality. Innovation is much more indirect and a "random walk" style of knowledge acquisition.

Finally, these LLMs are quite literally designed with a human in the loop, and we do not give ourselves enough credit for how well we ourselves tool-call. We are doing a lot of heavy lifting to make these models useful, and you cannot simply remove us from the equation without also removing us from the training pipeline.


There has never been more incentive to innovate in human history than today, and you think the best labs won't innovate? That is crazy. It's not like just anyone can do AI research. Of course there will be new architectures. We just discovered the steam engine, and the combustion engine is coming.


I built https://zmx.sh to make it easier to interact with your terminal sessions programmatically. 1 window = 1 session which might feel like a negative but it makes programmatic access easy and agents can use it just by pointing it at the zmx help command. Basically, an agent just needs 2 commands (run and write) for full control and the commands are synchronous so you don’t need to do any polling.


Nice, zmx is a very nice approach. I did not want to take risks and change the expected behavior of the renowned tmux. My main goals were 1) to rewrite tmux the modern way, more elegantly (async, Rust), and give it Windows support, and 2) to give it an SDK. I think 1 window = 1 session is the right way, but it's a risky market move I didn't want to tackle for adoption. Maybe on a breaking v2!


And a labeling action which requires `pull_request_target`: https://github.com/actions/labeler#create-workflow

These types of features are not worth it and need to be removed from the marketplace.


Yep. I’m approaching the same problem from a different angle: writing code fast means you aren’t being thoughtful about the features you’re building. I started realizing that after I had kids and spent more time thinking about code than writing it, and it really improved the quality of my work: https://bower.sh/thinking-slow-writing-fast


Nah. These agents are getting easier and easier to run locally. Have you tried Qwen 3.6 27b? It’s insane what it can do for its size. Like, you can 100% vibe-code small projects if you manage context properly.

These models are a race to the bottom just like compute.


I don’t think it matters. Local models becoming better has not stopped demand for SOTA models.


My guess is it won’t be worth it to focus specifically on coding models once local small models work just as well or within range. That will naturally close the gap even more.


I don’t understand: just use an agent to find all memory leaks and segfaults. I don’t get the argument if you are gonna vibe code anyway.

With unlimited tokens, make it a lint rule or auto-formatter.


LLMs are a force multiplier, not magic. They benefit from good tooling.


Literally the model “mythos” is being marketed towards finding this exact type of bug used for exploitation. I really don’t understand the argument: are agents not good at finding memory management issues? What’s the gap?


We take a slightly different approach for https://pico.sh -- no automatic subscription, but we charge for an entire year. It's great for us because each sub is a year and if someone truly isn't using our services then it'll quietly drift into the background for the end-user.


The entire concept that we need to cater CLIs to agents at all should tell us how far away they are from being “junior devs” or “an intern” and I reject the premise.

A lack of structured output has never been a blocker for agents to work; that’s a traditional coding problem.

“Write good help text and error messages” is just good design, which is self-evident.


Not really... I never understand the inclination to be reductive. The patterns emerging can be fairly novel.

