Hacker News | throwaway2027's comments

The same people that hyped up Claude will also hype up better alternatives or speak out against it; it seems more like you're being disingenuous here.

Gemini and Codex already scored higher on benchmarks than Opus 4.6, and they recently added a $100 tier with 2x usage limits. That's their answer, and it seems people have caught on.

> that's their answer and it seems people have caught on.

There's nothing to catch on to. OpenAI have been shouting "come to us!! We are 10x cheaper than Anthropic, you can use any harness" and people don't come in droves. Because the product is noticeably worse.


> and people don't come in droves. Because the product is noticeably worse.

As of Oct 2025, it appears that openai's market share is roughly 17x that of anthropic: 60% vs 3.5% [1].

As of April 2026, openai has 900 million weekly users [2] while anthropic has 300 million monthly users [1].

As of March 2026, openai app downloads were 2.2 million per day, while anthropic app downloads were 340,000. openai mobile users were 248 million per day, while anthropic mobile users were 9.4 million. In Feb 2026, chatgpt had 5.4 billion web visits, while claude had 290 million web visits. [3]

It seems to me that openai operates at a much higher scale than anthropic. Since you used droves as a proxy for product quality, by that standard anthropic has a far inferior product. :)

[1] https://sqmagazine.co.uk/claude-vs-chatgpt-statistics/

[2] https://www.pbs.org/newshour/nation/openai-focuses-on-busine...

[3] https://www.forbes.com/sites/conormurray/2026/03/06/claude-s...


Even using Mythos as a comparison, on their own benchmarks, when it isn't available for most people to use. What a joke.

True but I guess their primary customers are businesses not individual devs. Maybe Mythos is more affordable for them

The only way it’s more affordable is if anthropic burns cash to keep their corporate clients.

You're better off subscribing to Codex for April and May of 2026.



Already priced in.

The author also noted:

> yes this has to buy below 0.73 long term, the bot has a configurable ceiling set at 0.65 and checks for new markets buying closer to .5

https://x.com/sterlingcrispin/status/2043685362812461436
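The quoted rules (buy below a configurable ceiling, prefer new markets still priced near 0.5) can be sketched as a simple filter. The function and parameter names here are hypothetical illustrations, not the bot's actual code:

```python
# Hypothetical sketch of the quoted bot rules: only buy below a
# configurable price ceiling, and for newly listed markets only buy
# while the price is still close to the 0.50 launch anchor.
def should_buy(price: float, is_new_market: bool,
               ceiling: float = 0.65, new_market_band: float = 0.15) -> bool:
    if price >= ceiling:            # never pay at or above the ceiling
        return False
    if is_new_market:               # new markets: require price near 0.50
        return abs(price - 0.50) <= new_market_band
    return True                     # established markets: ceiling alone decides

# Examples:
# should_buy(0.60, is_new_market=True)  -> True
# should_buy(0.70, is_new_market=False) -> False
```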


For this question I'm working on https://polygains.com

What other question would you like backtested? This one is fairly easy


For every bucket of probability, what is the chance it resolves correctly?

For example, for markets that are between 60 and 70, is it the case that around 65% of them resolve to yes?

I guess you want to take the price at a fixed time before the market resolves, so focus on sports.
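The calibration check described above can be sketched roughly like this, assuming you already have a (price, resolved_yes) pair per market; the data shape is an assumption, not anything from Polymarket's actual API:

```python
# Hypothetical calibration check: bucket markets by price into deciles
# and compare each bucket's average price to its empirical hit rate.
def calibration(samples):
    """samples: list of (price, resolved_yes) with price in [0, 1]."""
    buckets = {}
    for price, resolved_yes in samples:
        b = min(int(price * 10), 9)            # decile index 0..9
        buckets.setdefault(b, []).append((price, resolved_yes))
    report = {}
    for b in sorted(buckets):
        group = buckets[b]
        avg_price = sum(p for p, _ in group) / len(group)
        hit_rate = sum(r for _, r in group) / len(group)
        report[(b / 10, (b + 1) / 10)] = (round(avg_price, 3),
                                          round(hit_rate, 3), len(group))
    return report
```

For a well-calibrated market, the 0.60-0.70 bucket should show a hit rate near its average price, i.e. around 0.65.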


What happens if you flood the market with a bunch of implausible bets like "sun won't rise tomorrow"? Sure, you might try to filter that out with some sort of "seasoning" period (ie. don't buy new markets), but then that means more time for arbitrageurs to correctly price the market, depriving you of any price advantage you might have had.

There are quite a few of those, my favourite: https://polymarket.com/event/will-jesus-christ-return-before...

If you put all your money on no, you get 4% if you win, and if Jesus comes back and you lose, money won't matter.


>you get 4% if you win,

This locks up your money in the meantime, right? If so, considering the fed funds rate is 3.64% (and you can probably get higher rates on stablecoins), a huge chunk of those "winnings" is going to be eaten up by the opportunity cost of the money.
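As a rough sanity check on that, here is the arithmetic; the 0.96 entry price and the 1-year lock-up are assumptions for illustration, not the market's actual terms:

```python
# Rough opportunity-cost check for buying "No" at ~0.96 (about 4% gross).
stake = 0.96          # assumed price paid per share of "No"
payout = 1.00         # payout if "No" resolves
risk_free = 0.0364    # fed funds rate cited in the thread
years = 1.0           # assumed lock-up period (hypothetical)

gross_return = payout / stake - 1                # ~4.17% if held to resolution
risk_free_growth = (1 + risk_free) ** years - 1  # what T-bills would pay
excess = gross_return - risk_free_growth         # actual edge over risk-free

print(f"gross {gross_return:.2%}, risk-free {risk_free_growth:.2%}, "
      f"excess {excess:.2%}")
```

Under these assumptions the edge over the risk-free rate is well under 1%, which is the commenter's point.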


You forget that Polymarket is just a casino, and the house always wins.

For example, recent events show that any bet can be selectively disputed for arbitrary reasons ("we found insiders", "we found this immoral/illegal", etc.).

And as for perpetual events: there is not a single week without a hack (https://www.web3isgoinggreat.com/)


jane street is always hiring!

> What happens if you flood the market with a bunch of implausible bets like "sun won't rise tomorrow"?

Who does "you" refer to in this sentence? Polymarket itself?

I'm pretty sure if Polymarket itself decides it wants to screw you, you're gonna lose no matter what your strategy is.


That presumes that there are people selling into new markets at 0.5 without thinking about the actual odds.

Of course, once you add a condition like that, the probabilities change....

Except that the mere existence of a market posing the question for people to consider probably activates the availability heuristic[1], causing people to overestimate the likelihood.

[1] https://philopedia.org/topics/availability-heuristic/


"Nothing ever happens"

Benjamin, huh?

Thanks!

I don't want a nudge. I want a clear RED WARNING with "You've gone away from your computer a bit too long and chatted too much at the coffee machine. You're better off starting a new context!"

I don’t want a scary red message chastising me for not being responsive enough!

I often leave CC hanging (or even suspended) and use /resume a lot. I’m okay with that having some negative effect on my token limits.

Product design is hard. They can’t please us all. I don’t envy the team considering these trade-offs.


Is it that hard though? This kinda smacks of no research on users prior to rolling stuff out.

Ack, it is currently blue but we can make it red

I think after the TTL expires the session should be autocompacted, and the user should be given a choice: continue with the compacted version, or be hit with the full read cost of continuing with their large but expired context. At the moment users are blind to what is going on.

Why is nobody even asking why that should be an issue? No other text editor shits the bed that way. The whole point of the computer is that it patiently waits for my input.

let me put it this way: not your RAM, not your cache, not waiting patiently for your input.

Good thing they're not charging for it, then.

Good thing they didn't silently, quietly change cache from 1 hour to 5 minutes, right?

forget the warning, just compact like someone suggested in the ticket. Who would opt for a massive cache miss?

Some claim that some of the recent smaller local models are as good as last year's Sonnet 4.5, and that the bigger high-end models can be almost as good as Claude, Gemini and Codex today, but others say they're benchmaxed and not representative.

To try things out you can use llama.cpp with Vulkan or even CPU and a small model like Gemma 4 26B-A4B, Gemma 4 31B, Qwen 3.5 35B-A3B or Qwen 3.5 27B. Some of the smaller quants fit within 16GB of GPU memory. The default people usually go with now is Q4_K_XL, a 4-bit quant with a decent balance of quality and size.

https://huggingface.co/unsloth/gemma-4-26B-A4B-it-GGUF

https://huggingface.co/unsloth/gemma-4-31B-it-GGUF

https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF

https://huggingface.co/unsloth/Qwen3.5-27B-GGUF

Get a second hand 3090/4090 or buy a new Intel Arc Pro B70. Use MoE models and offload to RAM for best bang for your buck. For speed try to find a model that fits entirely within VRAM. If you want to use multiple GPUs you might want to switch to vLLM or something else.
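To make the llama.cpp route concrete, a minimal invocation might look like this. The repo names come from the links above; the :Q4_K_XL quant tag and context size are example choices, and you should check `llama-server --help` on your build for the exact flags:

```shell
# Download a quant straight from Hugging Face and serve it locally.
llama-server -hf unsloth/gemma-4-26B-A4B-it-GGUF:Q4_K_XL -ngl 99 -c 16384

# -ngl 99   offloads all layers to the GPU (if they fit in VRAM)
# -c 16384  sets the context window; raise it if you have spare memory

# For a MoE model that does not fit in VRAM, keep the expert weights
# in system RAM and the rest on the GPU:
llama-server -hf unsloth/Qwen3.5-35B-A3B-GGUF:Q4_K_XL -ngl 99 --cpu-moe
```

The server then exposes an OpenAI-compatible API on localhost, which most coding harnesses can point at.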

You can try any of the following models:

High-end: GLM 5.1, MiniMax 2.7

Medium: Gemma 4, Qwen 3.5

https://unsloth.ai/docs/models/minimax-m27

https://unsloth.ai/docs/models/glm-5.1

https://unsloth.ai/docs/models/gemma-4

https://unsloth.ai/docs/models/qwen3.5

https://github.com/ggml-org/llama.cpp


Thank you, I'll look into it. For someone who is used to just working with second hand thinkpads, this stuff gets expensive fast!
