Hacker News | new | past | comments | ask | show | jobs | submit | tehsauce's comments | login

I was excited to try it out, so I downloaded the repo and ran the build. However, there were 100+ compilation errors. So I checked the commit history on GitHub and saw that all recent commits, going back at least several pages, had failed CI. It was not clear which commit I should pick to get the semi-working version advertised.

I started looking at the Cargo.toml to at least get an idea of how the project was constructed. I saw there that, rather than being built from scratch as the post seemed to imply, almost every core component was simply pulled in from an open source library: the quickjs engine, wgpu graphics, winit windowing & input, egui for UI, HTML parsing, the list goes on. On Twitter their CEO explicitly stated that it uses a "custom js vm", which seemed particularly misleading / untrue to me.

Integrating all of these existing components is still super impressive for these models to do autonomously, so I'm at a loss about how to feel when they do something impressive but then feel the need to misrepresent so much. I guess I just have a lot less respect and trust for Cursor's leadership, but maybe a little relief knowing that soon I may just generate my own custom Cursor!


wgpu for rendering, winit for windowing, Servo's CSS engine, Taffy for layout: that sounds eerily similar to our existing open source Rust browser, Blitz.

https://github.com/dioxuslabs/blitz

Maybe we ended up in the training data!


I follow Dioxus, and particularly blitz / #native on your Discord, and I noticed the exact same thing. There was a comment in a readme in Cursor's browser repo they linked mentioning Taffy, and I thought: hang on, this is definitely not from scratch as they advertise. People really do believe everything they read on Twitter.

Great work, by the way; blitz seems to be coming along nicely, and I see you even created a proto-browser yourselves, which is pretty cool and actually functional, unlike Cursor's.


You are doing it wrong.

Take a screenshot and take it to your manager / investor and make a presentation “Imagine what is now possible for our business”.

Get promoted / exit, move to other pastures and let them figure it out.


Of 63,295 workflow runs, apparently only 1,426 (about 2%) have been successful.

It's hard to avoid the impression that this is an unverified pile of slop that may have actually never worked.

The CI process certainly hasn't succeeded for the vast majority of commits.

Baffling, really.


You should see the code. It's true slop. The organization makes no sense.

Thanks for the feedback. There were some build errors, which have now been resolved; the failing CI test was not a standard check and has now been updated. Let me know if you run into any further issues.

> On twitter their CEO explicitly stated that it uses a "custom js vm" which seemed particularly misleading / untrue to me.

The JS engine is a custom JS VM being developed in vendor/ecma-rs as part of the browser; it's a copy of my personal JS parser project, vendored to make it easier to commit to.

I agree that for some core engine components, it should not be simply pulling in dependencies. I've begun the process of removing many of these and co-developing them within the repo alongside the browser. A reasonable goal for "from scratch" may be "if other major browsers use a dependency, it's fine to do so too". For example: OpenSSL, libpng, HarfBuzz, Skia. The current project can be moved further in this direction, although I think using libraries for general infrastructure that most software uses (e.g. windowing) can be compatible with that goal.

I'd push back on the idea that all the agents did was wire up dependencies: the JS VM, DOM, paint systems, chrome, and text pipeline are all being developed as part of this project, and there are real, complex systems being engineered towards the goal of a browser engine, even if it's not there yet.


When you say "have now been resolved" - did the AI agent resolve it autonomously, did you direct it to, or did a human do it?

Looks like Cursor Agent was at least somewhat involved: https://github.com/wilsonzlin/fastrender/commit/4cc2cb3cf0bd...

Looks like a bunch of different users have been contributing to the codebase (Google's Jules even made one commit), and the recent "fixes" include switching between various git users. https://gist.github.com/embedding-shapes/d09225180ea3236f180...

This to me seems to raise more questions than it answers.


The ones at *.ec2.internal generally mean that the git config was never set up and it defaults to $(id -un)@$(hostname)
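As a sketch of that fallback (my own Python illustration of the behavior, not git's actual code), the synthesized identity is roughly the unix username joined to the hostname:

```python
# Rough equivalent of the $(id -un)@$(hostname) identity git guesses
# when user.name / user.email were never configured.
import getpass
import socket

def fallback_git_email() -> str:
    """Mimic git's guessed identity: <unix username>@<hostname>."""
    return f"{getpass.getuser()}@{socket.gethostname()}"

# On an unconfigured EC2 box this comes out looking like
# "root@ip-10-0-0-1.ec2.internal" or "ubuntu@ip-10-0-0-1.ec2.internal".
print(fallback_git_email())
```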

Indeed. Extra-observant people will notice that the "ubuntu" username was used only twice, though, compared to "root", which was used 3700+ times. And observant people who've dealt with infrastructure before might recognize that username as the default for interactive EC2 instances :)

Let us all generate our own custom cursors.

I love this! Your results seem comparable to the Counter-Strike or Minecraft world models from a while ago, with massively less compute and data. It's particularly cool that it uses real-world data. I've been wanting to do something like this for a while, like capturing a large dataset while backpacking in the Cascades :)

I didn't see it in an obvious place on your GitHub; do you have any plans to open source the training code?


There has been some good research published on how RLHF, i.e. aligning to human preferences, easily introduces mode collapse and bias into models. For example, with a prompt like "Choose a random number", the base pretrained model can give relatively random answers, but after fine-tuning to produce responses humans like, models become heavily biased towards answers like "7" or "42".
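A toy illustration of that collapse (simulated samplers with made-up numbers, not real models): one sampler picks uniformly, the other has most of its probability mass collapsed onto the human-favored 7, and the entropy of the empirical distributions shows the difference.

```python
import random
from collections import Counter
from math import log2

def entropy_bits(counts):
    """Shannon entropy (in bits) of an empirical distribution."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

random.seed(0)
N = 10_000

# "Base model": roughly uniform picks from 1..10.
base = Counter(random.randint(1, 10) for _ in range(N))

# "Aligned model": 80% of the mass collapsed onto the favored answer 7.
tuned = Counter(
    7 if random.random() < 0.8 else random.randint(1, 10) for _ in range(N)
)

print(f"base:  {entropy_bits(base):.2f} bits")   # close to log2(10) = 3.32
print(f"tuned: {entropy_bits(tuned):.2f} bits")  # far lower
```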


I assume 42 is a joke from deep history and The Hitchhiker’s Guide. Pretty amusing to read the Wikipedia entry:

https://en.wikipedia.org/wiki/42_(number)


Douglas Adams picked 42 randomly though. :)


Not at all. It was derived mathematically from the Question: What do you get if you multiply six by nine?


It was just a joke, and doubly so the fact that it "works" in base 13.

It was written as a joke in a fairly ramshackle radio play. He had no idea at the time of writing that the joke would connect so well and become its own "thing", dominating discourse around the radio series and the novels to come.

It's not a joke about numbers; it's a linguistic joke that works well on radio, something HHGTTG is stuffed full of.

https://scifi.stackexchange.com/questions/12229/how-did-doug...
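For anyone who wants to verify the base-13 coincidence themselves:

```python
# 6 * 9 = 54 in decimal, and the digits "42" read in base 13
# also denote 4*13 + 2 = 54, which is why the joke "works".
product = 6 * 9
assert product == int("42", 13) == 54
print(product)  # 54
```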


That's not the question, though. Everybody knows that the question is the one posed to Mister Turtle and Mister Owl which neither of them can find the answer to.


I stand corrected in base 13.


It's very funny that people hold the autoregressive nature of LLMs against them, while being far more hardline autoregressive themselves. It's just not consciously obvious.


I wonder whether we hold LLMs to a different standard because we have a long term reinforced expectation for a computer to produce an exact result?

One of my first teachers told me that a computer won't ever output anything wrong; it will produce a result according to the instructions it was given.

LLMs follow this principle as well; it's just that when we assess the quality of their output, we incorrectly compare it to a deterministic alternative, and that isn't really a valid comparison.


I think people tend not to understand what autoregressive methods are capable of in general (i.e., basically anything an alternative method can do), and worse, they mentally treat them as equivalent to a context length of 1.


Why is that? Whenever I'm giving examples, I almost always use 7, something ending in 7, or something in the 70s.


1 and 10 are on the boundary, that's not random so those are out.

5 is exactly halfway, that's not random enough either, that's out.

2, 4, 6, 8 are even and even numbers are round and friendly and comfortable, those are out too.

9 feels too close to the boundary, it's out.

That leaves 3 and 7, and 7 is more than 3 so it's got more room for randomness in it right?

Therefore 7 is the most random number between 1 and 10.


That's all well and good, but 4 is actually the most random number, because it was chosen by fair dice roll.


Also because humans are biased towards viewing prime numbers as more counterintuitive and thus more unpredictable.


Last time I hallway-tested it, people couldn't say what a prime number is, and to my surprise even the ones with a tech/math-y background had forgotten. My results were something like 1.5/10 (ages 30±5), and I didn't go to the offices where I knew there was zero chance.


But there's a difference between knowing the formal definition and having a feeling that a number is somehow unique due to its indivisibility.


The theory I've heard is that the more prime a number is, the more random it feels. 13 feels more awkward and weird, and it doesn't come up naturally as often as 2 or 3 do in everyday life. It's rare, so it must be more random! I'll give you the most random number I can think of!

People tend to avoid extremes, too. If you ask for a number between 1 and 10, people tend to pick something in the middle. Somehow, the ordinal values of the range seem less likely.

Additionally, people tend to avoid numbers that are in other ranges. Ask for a number from 1 to 100, and it just feels wrong to pick a number between 1 and 10. They asked for a number between 1 and 100. Not this much smaller range. You don't want to give them a number they can't use. There must be a reason they said 100. I wonder if the human RNG would improve if we started asking for numbers between 21 and 114.


People also tend to botch random sequences by trying to avoid repetition or patterns.


Okay, this is a nitpick, but I don't think ordinal can be used in that way. "Somehow, the ordinal values of the range seem less likely". I'd probably go with extremes of the range? Or endpoints?


Nope I just mixed up a rephrase. I originally said "ordinal extremes" and meant to say "extreme values". I replaced the wrong word.


Veritasium actually made a video on this concept about a year ago: https://www.youtube.com/watch?v=d6iQrh2TK98


My guess is that we bias towards numbers with cultural or personal significance. 7 is lucky in western cultures and is religiously significant (see https://en.wikipedia.org/wiki/7#Culture). 42 is culturally significant in science fiction, though that's a lot more recent. There are probably other examples, but I imagine the mean converges on numbers with multiple cultural touchpoints.


I have never heard of 7 being a lucky number in Western culture, and your link doesn't support that. 3 is a lucky number, 13 is an unlucky number, 7 is nothing to me.

So I don't think it's that; 7 is still a very common "random number" here even though it has no special cultural significance.


Have you heard of Las Vegas? 777 being the grand prize? Maybe it's not universal to all of Western society, but I have never before today heard of a culture where 3 was the lucky number. The USA's culturally lucky number is absolutely 7.


I don't live in the USA; the West includes Europe. 7 is maybe a lucky number in the USA, but not where I live. So I think that would be more of an American thing than a Western thing. Or maybe it's related to some branches of Christianity but not others.


I’ve heard 7 as lucky all my life in the US and it’s mentioned in the Wikipedia page for 7. I think if you asked the average English-speaking American to name a number thought of as lucky most people would say 7.

Lucky Seven/Lucky Number Seven is also just a common phrase in American culture. There’s even a Wikipedia page of things called Lucky [Number] Seven. https://en.m.wikipedia.org/wiki/Lucky_7


>>I have never heard of 7 being a lucky number in western culture and your link doesn't support that. 3 is a lucky number, 13 is an unlucky number, 7 is nothing to me.

Some hold that 7 is God's number, stemming from "God created the world in seven days" [1]

Also, _"According to some, 777 represents the threefold perfection of the Trinity"_ [2]

[1] https://www.wikihow.com/What-Does-the-Number-7-Mean-in-the-B...

[2] https://en.wikipedia.org/wiki/777_(number)


It's definitely used in slot machines as a lucky number. Which came first I'm not sure, but I suspect (based on a sibling comment in this thread) it's historically rooted in perceived commonality and primeness, and became "lucky" because of that.


While I have never heard of someone referring to 7 as a lucky number, 7 is the most common sum of two rolled dice. So I can see how people would regard it as a lucky number. Along the same lines, I assume that someone who mentions 42 as a random number has at least some interest in science fiction.


You must be living under a rock if you’ve never heard of 7 as a lucky number.


I prefer to calculate with numbers and don't pay much attention to superstitions around them. I don't gamble, nor pay much attention to conversations about gambling, so I pretty much ignore any mention of lucky numbers when such topics arise (aside from knowing that some people have them). If you call being isolated from a particular aspect of life "living under a rock", so be it. Though I will point out that I like wide open spaces; I'm more of an astronomer than a geologist!


I have never heard of 7 being lucky until this thread.

I think you overestimate how culturally uniform Western countries are when it comes to small things like this.


Hmm really? Even on the Wikipedia page for 7 (https://en.m.wikipedia.org/wiki/7), one of the first things it says is “7 is often considered lucky in Western culture and is often seen as highly symbolic.” And FWIW you can see the Wikipedia edit history, that isn’t a recent edit, nobody here is messing with it :)

“Lucky Number 7” is a common phrase, there was even a popular movie that played on this, “Lucky Number Slevin” (https://m.imdb.com/title/tt0425210/). It’s one of the first numbers I’d think of as a “lucky number.”


It's associated with Christ by several Protestant Christian denominations.


I like prime numbers. Non-primes always feel like they're about to fall apart on me.


Can you share any links about this?


They choose 37 =)


Which is weird, because I thought we'd all agreed that the random number was 4?

https://xkcd.com/221/


We have a shared community map where you can watch hundreds of agents from multiple people's training runs playing in real time!

https://pwhiddy.github.io/pokerl-map-viz/


That's amazing. Really awesome work.


Can you make a twitch stream of a single agent playing?


It wouldn't make much sense. We generally train with 288 environments simultaneously. I've been thinking about ways to nicely stream all 288 environments, though.


It's impossible to beat with random actions or brute force, but you can get surprisingly far. It doesn't take long to get halfway through Route 1, but even with insane compute you'll never make it even to Viridian Forest.


Anyone interested in watching lots of reinforcement learning agents playing Pokemon Red at once: we have a website which streams hundreds of concurrent games from multiple people's training runs onto a shared map in real time!

https://pwhiddy.github.io/pokerl-map-viz/

(works best on desktop)


The Metal backend does currently generate quite a lot of unnecessary command buffers, but in general performance seems solid.


I haven't gone through the paper in detail yet, but maybe someone can answer: if you remove the hidden state from an RNN as they say they've done, what's left? An MLP predicting from a single token?


They didn't remove the hidden state entirely; they just removed it from the input, forget, and update gates. I haven't digested the paper either, but I think in the case of a GRU this means that the hidden state update masking (z_t and r_t in the paper's formulas) depends only on the new input, not on the input plus the prior hidden state.


It doesn't completely remove it; it removes certain dependencies on it so that it can be computed by parallel scan. There is still a hidden state. It bears some similarity to what was done with Mamba.


I only had a quick look, but it looks like they tweaked the state update so the model can be run with parallel scan instead of having to do it sequentially.


The trick is to make sure the recurrence stays linear in the hidden state; that's what enables parallel training.
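A minimal numpy sketch of that idea (my own illustration, not the paper's code): once the gate z_t and candidate h̃_t depend only on x_t, the recurrence h_t = (1 - z_t)·h_{t-1} + z_t·h̃_t is linear in h and has a closed form built from cumulative products and sums, which (unlike the loop) can be parallelized.

```python
import numpy as np

def sequential(z, h_tilde, h0=0.0):
    # Naive loop: h_t = (1 - z_t) * h_{t-1} + z_t * htilde_t.
    h, out = h0, []
    for zt, ht in zip(z, h_tilde):
        h = (1 - zt) * h + zt * ht
        out.append(h)
    return np.array(out)

def scan_form(z, h_tilde, h0=0.0):
    # Closed form of the same linear recurrence. With a_t = 1 - z_t and
    # b_t = z_t * htilde_t, the solution is
    #   h_t = A_t * (h0 + sum_{k<=t} b_k / A_k),  where A_t = prod_{j<=t} a_j.
    # cumprod/cumsum stand in here for a parallel prefix scan.
    a = 1.0 - z
    b = z * h_tilde
    A = np.cumprod(a)
    return A * (h0 + np.cumsum(b / A))

rng = np.random.default_rng(0)
T = 20
z = rng.uniform(0.2, 0.8, T)    # update gate: a function of x_t only
h_tilde = rng.normal(size=T)    # candidate state: also x_t only

np.testing.assert_allclose(sequential(z, h_tilde), scan_form(z, h_tilde))
print("loop and scan-form results match")
```

If z_t instead depended on h_{t-1} (as in a standard GRU), the recurrence would be nonlinear in h and this prefix-scan trick would no longer apply.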


The water consumed to produce a single hamburger is over 2,000 liters, and the embodied energy likely well over 100 watt-hours.

That means GPT can write 1,000+ emails using the resources it takes to feed a single person lunch. The resource efficiency of these machines is already quite astonishing.
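A back-of-envelope version of that claim (the burger figure is the comment's own estimate; the per-email inference cost is an assumed round number, not a measurement):

```python
# All figures are rough assumptions for illustration only.
burger_energy_wh = 100.0   # comment's estimate for producing one hamburger
email_energy_wh = 0.1      # assumed energy to generate one LLM-written email

emails_per_burger = burger_energy_wh / email_energy_wh
print(f"{emails_per_burger:.0f} emails per burger")  # 1000 emails per burger
```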


Awesome article! One slightly misleading thing, though: the first image shows the intersection of a non-convex shape, but it isn't revealed until much later that the core algorithm only works for convex shapes, not the type shown in the first image.


The article does discuss how the algorithm handles non-convex shapes: by breaking them into convex pieces.

