Hacker News | throwaway2027's comments

The same people that hyped up Claude will also hype up better alternatives or speak out against it; it seems more like you're being disingenuous here.

Gemini and Codex already scored higher on benchmarks than Opus 4.6, and they recently added a $100 tier with 2x usage limits. That's their answer, and it seems people have caught on.

> that's their answer and it seems people have caught on.

There's nothing to catch on to. OpenAI have been shouting "come to us!! We are 10x cheaper than Anthropic, you can use any harness" and people don't come in droves. Because the product is noticeably worse.


> and people don't come in droves. Because the product is noticeably worse.

As of Oct 2025, it appears that openai's market share is roughly 17x that of anthropic: 60% vs 3.5% [1].

As of April 2026, openai has 900 million weekly users [2] while anthropic has 300 million monthly users [1].

As of March 2026, openai app downloads were 2.2 million per day, while anthropic app downloads were 340,000. openai mobile users were 248 million per day, while anthropic mobile users were 9.4 million. In Feb 2026, chatgpt had 5.4 billion web visits, while claude had 290 million web visits. [3]

It seems to me that openai operates at a much higher scale than anthropic. Since you used droves as a proxy for product quality, by that standard anthropic has a far inferior product. :)

[1] https://sqmagazine.co.uk/claude-vs-chatgpt-statistics/

[2] https://www.pbs.org/newshour/nation/openai-focuses-on-busine...

[3] https://www.forbes.com/sites/conormurray/2026/03/06/claude-s...


Even using Mythos as a comparison, on their own benchmarks, when it isn't available for most people to use. What a joke.

True but I guess their primary customers are businesses not individual devs. Maybe Mythos is more affordable for them

The only way it’s more affordable is if anthropic burns cash to keep their corporate clients.

You're better off subscribing to Codex for April and May of 2026.



Already priced in.

The author also noted:

> yes this has to buy below 0.73 long term, the bot has a configurable ceiling set at 0.65 and checks for new markets buying closer to .5

https://x.com/sterlingcrispin/status/2043685362812461436
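The quoted rules (buy below a configurable ceiling, prefer new markets still priced near 0.5) can be sketched as a simple filter. The function and parameter names here are hypothetical illustrations, not the bot's actual code:

```python
# Hypothetical sketch of the quoted bot rules: only buy below a
# configurable price ceiling, and for newly listed markets only buy
# while the price is still close to the 0.50 launch anchor.
def should_buy(price: float, is_new_market: bool,
               ceiling: float = 0.65, new_market_band: float = 0.15) -> bool:
    if price >= ceiling:            # never pay at or above the ceiling
        return False
    if is_new_market:               # new markets: require price near 0.50
        return abs(price - 0.50) <= new_market_band
    return True                     # established markets: ceiling alone decides

# Examples:
# should_buy(0.60, is_new_market=True)  -> True
# should_buy(0.70, is_new_market=False) -> False
```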


For this question I'm working on https://polygains.com

What other question would you like backtested? This one is fairly easy


For every bucket of probability, what is the chance it resolves correctly?

For example, for markets that are between 60 and 70, is it the case that around 65% of them resolve to yes?

I guess you want to take the price at a fixed time before the market resolves, so focus on sports.
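The calibration check described above can be sketched roughly like this, assuming you already have a (price, resolved_yes) pair per market; the data shape is an assumption, not anything from Polymarket's actual API:

```python
# Hypothetical calibration check: bucket markets by price into deciles
# and compare each bucket's average price to its empirical hit rate.
def calibration(samples):
    """samples: list of (price, resolved_yes) with price in [0, 1]."""
    buckets = {}
    for price, resolved_yes in samples:
        b = min(int(price * 10), 9)            # decile index 0..9
        buckets.setdefault(b, []).append((price, resolved_yes))
    report = {}
    for b in sorted(buckets):
        group = buckets[b]
        avg_price = sum(p for p, _ in group) / len(group)
        hit_rate = sum(r for _, r in group) / len(group)
        report[(b / 10, (b + 1) / 10)] = (round(avg_price, 3),
                                          round(hit_rate, 3), len(group))
    return report
```

For a well-calibrated market, the 0.60-0.70 bucket should show a hit rate near its average price, i.e. around 0.65.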


What happens if you flood the market with a bunch of implausible bets like "sun won't rise tomorrow"? Sure, you might try to filter that out with some sort of "seasoning" period (ie. don't buy new markets), but then that means more time for arbitrageurs to correctly price the market, depriving you of any price advantage you might have had.

There are quite a few of those, my favourite: https://polymarket.com/event/will-jesus-christ-return-before...

If you put all your money on no, you get 4% if you win, and if Jesus comes back and you lose, money won't matter.


>you get 4% if you win,

This locks up your money in the meantime, right? If so, considering the fed funds rate is 3.64% (and you can probably get higher rates on stablecoins), a huge chunk of those "winnings" is going to be eaten up by the opportunity cost of the money.
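As a rough sanity check on that, here is the arithmetic; the 0.96 entry price and the 1-year lock-up are assumptions for illustration, not the market's actual terms:

```python
# Rough opportunity-cost check for buying "No" at ~0.96 (about 4% gross).
stake = 0.96          # assumed price paid per share of "No"
payout = 1.00         # payout if "No" resolves
risk_free = 0.0364    # fed funds rate cited in the thread
years = 1.0           # assumed lock-up period (hypothetical)

gross_return = payout / stake - 1                # ~4.17% if held to resolution
risk_free_growth = (1 + risk_free) ** years - 1  # what T-bills would pay
excess = gross_return - risk_free_growth         # actual edge over risk-free

print(f"gross {gross_return:.2%}, risk-free {risk_free_growth:.2%}, "
      f"excess {excess:.2%}")
```

Under these assumptions the edge over the risk-free rate is well under 1%, which is the commenter's point.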


You forget that Polymarket is just a casino, and the house always wins.

For example, recent events show that any bet can be selectively disputed for arbitrary reasons ("we found insiders", "we found this immoral/illegal", etc.).

And as for perpetual events: there is not a single week without a hack (https://www.web3isgoinggreat.com/)


jane street is always hiring!

> What happens if you flood the market with a bunch of implausible bets like "sun won't rise tomorrow"?

Who does "you" refer to in this sentence? Polymarket itself?

I'm pretty sure if Polymarket itself decides it wants to screw you, you're gonna lose no matter what your strategy is.


That presumes that there are people selling into new markets at 0.5 without thinking about the actual odds.

Of course, once you add a condition like that, the probabilities change....

Except that the mere existence of a market posing the question for people to consider probably activates the availability heuristic[1], causing people to overestimate the likelihood.

[1] https://philopedia.org/topics/availability-heuristic/


"Nothing ever happens"

Benjamin, huh?

Thanks!

I don't want a nudge. I want a clear RED WARNING with "You've gone away from your computer a bit too long and chatted too much at the coffee machine. You're better off starting a new context!"

I don’t want a scary red message chastising me for not being responsive enough!

I often leave CC hanging (or even suspended) and use /resume a lot. I’m okay with that having some negative effect on my token limits.

Product design is hard. They can’t please us all. I don’t envy the team considering these trade-offs.


Is it that hard though? This kinda smacks of no research on users prior to rolling stuff out.

Ack, it is currently blue but we can make it red

I think after the TTL expires the session should be autocompacted, and the user should be given a choice: continue with the compacted version, or be hit with the full read cost of continuing with their large but expired context. At the moment users are blind to what is going on.

Why is nobody even asking why that should be an issue? No other text editor shits the bed that way. The whole point of the computer is that it patiently waits for my input.

let me put it this way: not your RAM, not your cache, not waiting patiently for your input.

Good thing they're not charging for it, then.

Good thing they didn't silently, quietly change cache from 1 hour to 5 minutes, right?

forget the warning, just compact like someone suggested in the ticket. Who would opt for a massive cache miss?

Some claim that some of the recent smaller local models are as good as last year's Sonnet 4.5, and that the bigger high-end models can be almost as good as Claude, Gemini and Codex today, but others say they're benchmaxed and not representative.

To try things out you can use llama.cpp with Vulkan or even CPU and a small model like Gemma 4 26B-A4B, Gemma 4 31B, Qwen 3.5 35B-A3B or Qwen 3.5 27B. Some of the smaller quants fit within 16GB of GPU memory. The default people usually go with now is Q4_K_XL, a 4-bit quant with a decent balance of quality and size.

https://huggingface.co/unsloth/gemma-4-26B-A4B-it-GGUF

https://huggingface.co/unsloth/gemma-4-31B-it-GGUF

https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF

https://huggingface.co/unsloth/Qwen3.5-27B-GGUF

Get a second hand 3090/4090 or buy a new Intel Arc Pro B70. Use MoE models and offload to RAM for best bang for your buck. For speed try to find a model that fits entirely within VRAM. If you want to use multiple GPUs you might want to switch to vLLM or something else.
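To make the llama.cpp route concrete, a minimal invocation might look like this. The repo names come from the links above; the :Q4_K_XL quant tag and context size are example choices, and you should check `llama-server --help` on your build for the exact flags:

```shell
# Download a quant straight from Hugging Face and serve it locally.
llama-server -hf unsloth/gemma-4-26B-A4B-it-GGUF:Q4_K_XL -ngl 99 -c 16384

# -ngl 99   offloads all layers to the GPU (if they fit in VRAM)
# -c 16384  sets the context window; raise it if you have spare memory

# For a MoE model that does not fit in VRAM, keep the expert weights
# in system RAM and the rest on the GPU:
llama-server -hf unsloth/Qwen3.5-35B-A3B-GGUF:Q4_K_XL -ngl 99 --cpu-moe
```

The server then exposes an OpenAI-compatible API on localhost, which most coding harnesses can point at.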

You can try any of the following models:

High-end: GLM 5.1, MiniMax 2.7

Medium: Gemma 4, Qwen 3.5

https://unsloth.ai/docs/models/minimax-m27

https://unsloth.ai/docs/models/glm-5.1

https://unsloth.ai/docs/models/gemma-4

https://unsloth.ai/docs/models/qwen3.5

https://github.com/ggml-org/llama.cpp


Thank you, I'll look into it. For someone who is used to just working with second hand thinkpads, this stuff gets expensive fast!
