Hacker News | jjallen's comments

This is definitely Barbra Streisanding right now. I had never heard of OpenCode. But I sure have now! Will have to check it out. Doubt I’ll end up immediately canceling Claude Code Max, but we’ll see.

I don’t know if the Streisand Effect is relevant here since Anthropic will block any other uses of their private APIs, not just OpenCode. The private Claude Code API was never advertised nor sold as a general purpose API for use with any tool.

OpenCode is an interesting tool, but if this is your first time hearing of it you should probably be aware of their recent unauthenticated RCE issues and the slow response they’ve had in fixing them: https://news.ycombinator.com/item?id=46581095 They say they’re going to do better in the future, but it’s currently on my list of projects to keep isolated until their security situation improves.


Imo I don't trust ANY of these tools to run in non-isolated environments.

All of these tools are either

- created by companies powered by VC money that never face consequences for mishandling your data

- community vibecoded with questionable security practices

These tools also need a substantial amount of access to be useful, so they are really hard to secure even if you try. Constantly prompting for approval leads to alert fatigue and eventually a mistake that leads to exfiltration.

I suggest just sticking to LXC or a VM. Desktop (including Linux) userland security is just bad in general. I try to keep most random code I download for one-off tasks in containers.
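For anyone who wants a concrete starting point, here is a minimal sketch of that approach (not from this thread) driving the Incus CLI from Python; the container name, image, and project path are placeholders:

    import subprocess

    NAME = "agent-sandbox"        # placeholder container name
    IMAGE = "images:debian/12"    # any image from the public images: remote

    def incus(*args):
        # Thin wrapper around the incus CLI; raises if the command fails.
        subprocess.run(["incus", *args], check=True)

    incus("launch", IMAGE, NAME)  # fresh, unprivileged container

    # Expose only the one project directory, nothing else from the host.
    incus("config", "device", "add", NAME, "proj", "disk",
          "source=/home/me/project", "path=/work")

    try:
        # Run whatever agent or random downloaded code you like in there.
        incus("exec", NAME, "--", "bash", "-lc", "cd /work && ls")
    finally:
        incus("delete", "--force", NAME)   # throw the environment away

Swapping `launch` for `launch --vm` gives a full VM instead of a container, which is the trade-off discussed a couple of comments below.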


I'm trying to put together an exe.dev-like self hosted solution using Incus/LXC. Early days but works as a proof of concept:

https://github.com/jgbrwn/shelley-lxc


Incus is great for this use case, I did something similar. I volume mount specific stuff into the guests and let OpenCode loose with all tools enabled.

I used OpenCode to vibe code the shell script I use to manage it.

I actually use VMs rather than LXC, which makes it easier to run e.g. docker.


Very cool. I think docker also runs fine inside of LXC, but haven't experimented too much with that specifically yet.

I might go back and give it a try! It would certainly save some ram.

I immediately reached for VMs because I just didn't want any question about the full level of isolation, but the cool thing about incus is that it should be easy to switch between them.


A coding agent is just a massive RCE; what do you think happens when Claude gets prompt injected? That said, I'm not defending leaving an RCE unfixed.

Absolutely all coding agents should be run in sandboxed containers, 24/7. If you do otherwise, please don't cry when you're pwned.


OpenCode is kind of a security disaster though: https://news.ycombinator.com/item?id=46581095. To be clear, I know all software has bugs, including security bugs. But that wasn't an obscure vulnerability, that was "our entire dev team fundamentally has no fucking clue what they're doing, and our security reporting and triage process is nonexistent". No way am I entrusting production code and secrets to that.

So is Claude. They nuked everyone's Claude app a few days ago by pushing a shoddy changelog that crashed the app during init. The team literally doesn't understand how to implement try...catch. The thing was clearly vibe coded into existence.

Last week Claude Code (CC) had a bug that completely broke the Claude Code app because of a change in the CC changelog markdown file.

Claude Code’s creator has also said that CC is 100% AI generated these days.
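Not how the actual Claude Code bug looked internally, just a generic sketch of the pattern being complained about: parsing an optional resource such as a changelog during init should not be able to take the whole app down.

    from pathlib import Path

    def load_changelog(path="CHANGELOG.md"):
        # The changelog is purely informational, so any failure here is
        # logged and swallowed instead of crashing startup.
        try:
            text = Path(path).read_text(encoding="utf-8")
            # Naive "parse": one entry per "## " heading.
            return [line for line in text.splitlines() if line.startswith("## ")]
        except Exception as exc:
            print(f"warning: could not load changelog: {exc}")
            return []

    entries = load_changelog()  # app startup continues either way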


Agreed. This is definitely free PR for OpenCode. I didn't try it myself until I heard the kerfuffle around Anthropic enforcing their ToS. It definitely has a much nicer UX than claude-code, so I might give the GPT subscription a shot sometime, given that it's officially supported w/ 3rd party harnesses, and GPT 5.2 doesn't appear to be that far behind Opus (based on what other people say).

Very cool. I was thinking about working on this myself after moving into a house with these 4 months ago, only to all of a sudden have to replace them for no good reason.


Exactly. If it is used a certain way by enough people, that is also an accepted definition. Dictionaries lag actual speech and language I suppose.


> If it is used a certain way by enough people, that is also an accepted definition.

This mentality seems to be prevalent in the USA. In Germany, by contrast, many people see this topic differently: just because a lot of people use a certain word/term wrong does not make it right.


And it annoys me endlessly. People can't let go of the genitive, even if it's dead in loads of dialects.

If people knew how many words were just "made up" in the last couple centuries to match the vocabulary of Latin or French... they'd lose their mind


+1 for cudarc. I've been using it for a couple of years now and it has worked great. I'm using it for financial markets backtesting.


Because of public family trees potentially linking a genome to a family, no DNA is fully anonymous these days.


The DNA itself is not "anonymous", but I would do it without giving my real name, address, etc. They could know who the DNA is related to, but not gain more information than that.

Even better would be to swap identity with someone else who wants to get sequenced...


They would be able to pinpoint your identity (e.g. "this person is the son of both X and Y, and we know who X and Y are").


And what would that gain them? "X and Y had a son"?


They know who X and Y are, and also know the identity of their son (you), so that gains them your unique DNA sequence, identified as yours specifically.


Yeah, I think you're missing the whole point of the "anonymously" part. :-)


How do you plan to do it anonymously, considering what you now know?:

1. There are already multiple databases containing both your parents, you, and a linkage between you and them indicating parentage. So, prior knowledge: Alice and Bob are parents of Charlie.

2. If Charlie's parents have taken a DNA test, there already exists a database linking their DNA to their name. So, prior knowledge: Alice's DNA belongs to Alice, Bob's DNA belongs to Bob.

3. If Charlie takes a DNA test totally anonymously and perfectly untraceably, it will still show up as the child of Alice and Bob's DNA. So, knowledge now includes: Charlie's (anonymous) DNA is the son of Alice and Bob's DNA.

4. From these pieces of information, it is trivial to de-anonymize Charlie's DNA, linking it to Charlie's identity: the only person it could belong to is the son of Alice and Bob, and the son of Alice and Bob is already known from point 1.
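A toy sketch of that final join, with made-up names standing in for the databases in points 1-3, just to show how mechanical the de-anonymization step is:

    # Point 1: genealogy sites already link child -> (parent, parent).
    family_tree = {"Charlie": ("Alice", "Bob")}

    # Point 2: earlier tests already link DNA sample ids to real names.
    identified_dna = {"sample_001": "Alice", "sample_002": "Bob"}

    # Point 3: the new "anonymous" sample genetically matches as a child
    # of those two known samples.
    anonymous_sample_parents = ("sample_001", "sample_002")

    # Point 4: the join that de-anonymizes the sample.
    parent_names = {identified_dna[s] for s in anonymous_sample_parents}
    for child, parents in family_tree.items():
        if set(parents) == parent_names:
            print(f"The 'anonymous' sample almost certainly belongs to {child}")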


Ah, I see what you're saying!

I think in my case I'm just not that concerned by the hypothetical because my parents haven't done sequencing/genetic screening and also aren't likely to. I guess the main question is how far out in my family tree I have to think about that. (Also has implications for my descendants, I suppose...)


Clearly not.


Are they going to reinvest these funds into education so our country can fill these roles, or just waste it on weapons and unwinnable wars?

I would be totally fine with this if it was the former, but I would bet that it won't be...


I have not and don’t run an adblocker fwiw.


Just out here raw-dogging the internet...


Try new and improved Bongo Buddy(tm)!


And do you notice high CPU usage or stuttering?


Very cool. I learned a lot as a non-dermatologist, but someone with a sister who has had melanoma at a very young age.

I went from 50% to 85% very quickly. And that’s because most of them are skin cancer and that was easy to learn.

So my only advice would be to make closer to 50% actually skin cancer.

Although maybe you want to focus on the bad ones and get people to learn those more.

This detection was way harder than I thought it would be. Makes me want to go to a dermatologist.


Thanks, this is a good point - I think a 50:50 balance of cancer versus harmless lesions would be better and will change this in a future version.

Of course, in reality the vast majority of skin lesions and moles are harmless, and the challenge is identifying those that are not. I think that even a short period of focused training like this can help the average person to identify a concerning lesion.



> So my only advice would be to make closer to 50% actually skin cancer.

If I were to code this for "real training" of a dermatologist, I'd make it closer to the "real world" rate. I'd imagine that, for a dermatologist, probably just 1 out of 100 (or something like that) skin lesions that people think might be cancerous actually are.

With the current dataset, there are just too many cancerous images. This makes it kind of easy to just flag something as "cancerous" and still retain a good "score". But the point is moot: if, as a dermatologist, you send _too many_ people without cancer for further exams, then you're negating the usefulness of what you're doing.


It needs a specific scoring system where each false positive causes a small score drop, but a false negative causes a huge one. At the same time, like you said, positives would be much rarer. It should be easy to ask an LLM to vibe code that so it simulates the real world and its consequences.
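A rough sketch of what that scoring rule could look like; the penalty weights and the base rate are made-up numbers, not anything from the actual site:

    import random

    FALSE_POSITIVE_PENALTY = 1    # unnecessary referral: small cost
    FALSE_NEGATIVE_PENALTY = 10   # missed cancer: large cost
    CORRECT_REWARD = 1
    CANCER_BASE_RATE = 0.05       # positives are rare, like the real world

    def score(guessed_cancer: bool, is_cancer: bool) -> int:
        if guessed_cancer == is_cancer:
            return CORRECT_REWARD
        if is_cancer:                      # called it harmless, it was cancer
            return -FALSE_NEGATIVE_PENALTY
        return -FALSE_POSITIVE_PENALTY     # called it cancer, it was harmless

    def draw_case() -> bool:
        # Whether the next lesion shown is actually cancerous.
        return random.random() < CANCER_BASE_RATE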


Thought about this some more. I think you want to start at 100% or high so people actually learn what needs to be learned: what malignant skin conditions actually look like.

And then once they have learned, you get progressively harder and harder. Basically, the closer to 50% you are, the harder it will be to get a score higher than chance.
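One way to read that as code (purely illustrative, with arbitrary numbers): start with almost all malignant examples so people see what they look like, then walk the mix toward 50/50 as the learner's recent accuracy improves.

    def cancer_fraction(recent_accuracy: float) -> float:
        # Near-100% positives for beginners, approaching 50/50 (where
        # guessing no longer helps) as accuracy climbs toward 95%.
        easiest, hardest = 0.9, 0.5
        t = max(0.0, min(1.0, (recent_accuracy - 0.5) / 0.45))
        return easiest + (hardest - easiest) * t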


I found the first dozen to be mostly cancer and then the next dozen were mostly non-cancer. (Not sure if it's randomized.) (Also, I'm really bad at identifying cancerous vs non-cancerous skin lesions.)


It is randomized so probably just bad luck! FWIW I get a high score and another skin cancer doctor who commented also gets a high score so it is possible to make the diagnosis in most cases on the basis of these images.


I have gone from using Claude Code all day long since the day it was launched to only using the separate Claude app. In my mind that is a nice balance of using it, but not too much, not too fast.

There is the temptation to just let these things run in our codebases, which I think for some projects is totally fine. For most websites I think this would usually be fine, for two reasons: 1) these models have been trained on more websites than probably anything else, and 2) if a div/text is off by a little bit then usually there will be no huge problems.

But if you're building something that is mission critical, you have to go super slowly, which again is hard to do because these agents tempt you to go super fast. That is sort of the allure of them: to be able to write software super fast.

But as we all know, in some programs you cannot have a single char wrong or the whole program may not work or have value. At least that is how the one I am working on is.

I found that I lost the mental map of the codebase I am working on. Claude Code had done too much too fast.

This morning I found a function it had written to validate futures/stocks/FUT-OPT/STK-OPT symbols where the validation was super basic and terrible. We had implemented some very strong actual symbol data validation a week or two ago, but that wasn't fully implemented everywhere. So now I need to go back and do this.

Anyways, I think having it find where certain code is written and suggest various ways to solve problems would be helpful for sure. But the separate GUI apps can do that for us.

So for now I am going to keep just using the separate LLM apps. I will also save lots of money in the meantime (which I would gladly spend for a higher quality Claude Code ish setup).


The reality is that you can't have AI do too much for you or else you completely lose track of what is happening. I find it useful to let it do small stupid things and use it for brainstorming.

I don't like it to do complete PR's that span multiple files.


I don't think the "complete PR spanning multiple files" is an issue actually.

I think the issue is if you don't yourself understand what it's doing. If all you do is tell it what the outcome should be from a user's perspective, check that that's what it does, and then just merge, then you have a problem.

But if you just use it to be faster at getting the code you would've liked to write yourself, or make it write the code you'd have written if you had bothered to do that boring thing you know needs to be done but never bothered to do, then it's actually a great tool.

I think in that case it's like IDE based refactorings enabled by well typed languages. Way back in the day, there were refactorings that were a royal pain in the butt to do in our Perl code base. I did a lot of them but they weren't fun. Very simple renames or function extractions that help code readability just aren't done if you have to do them manually. If you can tell an IDE to do a rename and you're guaranteed that nothing breaks, it's simply a no brainer. Anyone not doing it is simply a bad developer if you ask me.

There's a lot of copy and paste coding going on in "business software". And that's fine. I engage in that too, all the time. You have a blueprint of how to do something in your code base. You just need to do something similar "over there". So you know where to find the thing to copy and paste and then adjust. The AI can do it for you even faster especially if you already know what to tell it to copy. And in some cases all you need to know is that there's something to copy and not from where exactly and it'll be able to copy it very nicely for you.

And the resulting PR that does span multiple files is totally fine. You just came up with it faster than you ever could've. Personally I skipped all the "Copilot being a better autocomplete" days and went straight into agentic workflows - with Claude Code to be specific. Using it from within IntelliJ in a monorepo that I know a lot about already. It's really awesome actually.

The funny thing is that at least in my experience, the people that are slower than you doing any of this manually are not gonna be good at this with AI either. You're still gonna be better and faster at using this new tool than they were at using the previously available tools.


> You just need to do something similar "over there". So you know where to find the thing to copy and paste and then adjust. The AI can do it for you even faster especially if you already know what to tell it to copy. And in some cases all you need to know is that there's something to copy and not from where exactly and it'll be able to copy it very nicely for you.

The issue with this approach is the mental load of verifying that it did the thing you asked for correctly, and that it did not mess up something like a condition expression.

My belief is that most developers don't interact with their code as more than characters on the screen. Their editing process is clicking, selecting, and moving character by character, which makes their whole experience painful for anything that involves a bit of refactoring.

When you exploit things like search-based navigation (project or file based), indexing (LSP or IDE intellisense), compiler/linter/test-runner reports (going directly to the line mentioned), semantic navigation and manipulation (keyboard based), and knowledge of a few extra tools (git, curl, jq, ...), you'll have a far more pleasant experience with coding. Editing is effortless in that case. You think about a solution and it's done.

Coding is literally the most enjoyable part of the job for me. What's not enjoyable is the many WTFs when dealing with low quality code and having to coax specifications from teammates.


> The issue with this approach is the mental load of verifying that it did the thing you asked for correctly, and that it did not mess up something like a condition expression.

That's fair to an extent and what I've commented on before as well: AI can make the enjoyable part of coding "go away" and replace it with the menial and unfun part: Code review.

The "trick" would be to make it more like a pair programming session than code review.

> Which makes their whole experience painful for anything that involves a bit of refactoring.

Also agreed! So many times when pairing with others it's like that. It's very painful to see other people debug in many cases. Or write code / interact with their tooling. But then there are also some where it's a ray of light. People that know their tooling just as well as you do or maybe even better and you learn a thing or two.

I love it when I come out of a pairing session and I've learned something that I can incorporate.

And it pains me when I've used something, maybe even specifically called out how I do something, for the n-th time with someone and they still don't catch on. And it doesn't matter if they don't pick it up by themselves or whether it's something that's one of their improvements to work on because we literally talked about it in the last seven 1:1s or something. Some people "just don't get it" unfortunately. Some people really just aren't cut out to be devs. AI or not.

> Editing is effortless in that case. You think about a solution and it's done.

Yes but ;) As in, agreed on effective tool use being awesome but unfortunately more rare than I would like. But there are other people "like you and me" out there. Sometimes we have the fortune to work with them. It's such a delight! I love working with them. I love just working with someone that's on the same level and we can pair on an equal level and get shit done. It's rare though.

It's not just done though. It's still work in many cases and some of that really can be improved with this new tool: AI. Just like we were able to replace a 30 minute Perl refactoring done manually with a few seconds IDE refactoring in Kotlin (or whatever language floats your boat/happens to be used where you are)

> low quality code and having to coax specifications from teammates.

I'm not sure I understand this part to be honest. I don't usually coax specifications from teammates. I coax them from Product people or customers and while it's not really the most fun sometimes, personally, I do find joy in the fact that I am delivering something that helps the customer. I enjoy fixing a bug both because I like the hunt for the root cause (something AI really isn't great at doing by itself from my experience yet - but I do enjoy working with it) and because I like it when I can deliver the fix to the customer fast. Customer reported a bug this morning and by the end of the day they have a fix. That's just awesome. Cloud FTW. Gone are the days of getting assigned a bug someone triaged 6 months ago and it will go out with a release 3 months from now, ensuring the customer gets the fix installed a year plus from when they reported it (coz of course their admins don't install a new release the day it comes out, right?)


> I don't usually coax specifications from teammates. I coax them from Product people or customers and while it's not really the most fun sometimes, personally, I do find joy in the fact that I am delivering something that helps the customer.

It's when you're dependent on a service and there's no documentation. Even if you can read the code (and if you can't, you probably should learn), it's better to ask the person that worked on it (instead of making too many assumptions). And that's when the coaxing comes into play.


Fully agreed.

In my view, effective coding agent use boils down to being good at writing briefs as you would for any ticket. The better formatting, detail, and context you can provide BOTH on an outcome level and a technical architecture level, the better your results are.

To put it another way: if before LLMs came along you were someone who (purposely or otherwise) became good at writing documentation and briefing tickets for your team, I think there's a decent chance you're going further with these agentic tools than others who just shove an idea into them and hope for the best.


Losing the mental map is the number one issue for me. I wonder if there could be a way to keep track of it, even at a high level. Keeping the ability to dig in is crucial.


Spend time reviewing outputs like a tech lead does when managing multiple developers. That's the upgrade you just got in your career; you are now bound by how many "team members" you can manage at a single time. I'm grateful to live in such a time.


The code is the mental map. Orchestra conductors read and follow the sheet music as well. They don't let random people come in and mess with it. Neither do film directors with their scripts and their plans.


> I have gone from using Claude Code all day long since the day it was launched to only using the separate Claude app. In my mind that is a nice balance of using it, but not too much, not too fast.

I’m on a similar journey - I never used it all day long but definitely a lot during a brief honeymoon period and now I’m back to using it very sparingly but I put questions to the Claude app all the time

For me the sweet spot for Claude Code is when I have a very clear and well documented thing to set up that I really don’t want to do for the umpteenth time - like webhook signature verification - just paste the docs and let it rip - or setting up the most basic CRUD forms for an admin dashboard - ezpz
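For context, that task is usually a few lines around a generic HMAC-SHA256 check like the sketch below; the header format and secret handling vary by provider, so this is illustrative rather than any specific provider's scheme.

    import hashlib
    import hmac

    def verify_webhook(secret: bytes, body: bytes, signature_hex: str) -> bool:
        # Recompute the HMAC over the raw request body and compare it to the
        # hex digest the provider sent, using a constant-time comparison.
        expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, signature_hex)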

But otherwise I’ve gone back to mostly writing everything by hand


You need to spend more time in Plan mode. Ask it to make diagrams or pseudocode of whats and hows, iterate on that and then Accept Edits.

