
I was doing something related for genuary recently, but with different constraints - I wanted to make a photograph look painted in a fauvist style, with visible brushstrokes (and a vivid, unrealistic colour scheme). When you're overlaying many, many strokes like that, you end up hiding the errors but also hiding the individual digital brushstrokes; with very few strokes, placement becomes important.

Instead of just stroking with the average colour, I chose to only connect points that were similar in colour (because of the fauvist thing, I was mainly interested in hue; the paints are all intense). By doing this, I was trying to be edge-preserving without explicitly calculating edges, e.g. with the Sobel operator.
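The core loop was roughly this (a minimal sketch rather than my actual code; it assumes Pillow, the stroke count, radius and hue tolerance are made-up numbers, and it just strokes with the sampled endpoint colour rather than remapping to a fauvist palette):

    import random, colorsys
    from PIL import Image, ImageDraw

    def hue(rgb):
        # hue in [0,1), via HSV
        r, g, b = [c / 255.0 for c in rgb[:3]]
        return colorsys.rgb_to_hsv(r, g, b)[0]

    def stroke_paint(path, strokes=4000, radius=40, hue_tol=0.06, width=9):
        img = Image.open(path).convert("RGB")
        out = Image.new("RGB", img.size, "white")
        draw = ImageDraw.Draw(out)
        w, h = img.size
        for _ in range(strokes):
            # pick a random point, then a nearby second point
            x0, y0 = random.randrange(w), random.randrange(h)
            x1 = min(max(x0 + random.randint(-radius, radius), 0), w - 1)
            y1 = min(max(y0 + random.randint(-radius, radius), 0), h - 1)
            c0, c1 = img.getpixel((x0, y0)), img.getpixel((x1, y1))
            # only connect the points if their hues are close (hue is circular)
            dh = abs(hue(c0) - hue(c1))
            if min(dh, 1.0 - dh) > hue_tol:
                continue  # probably an edge - don't stroke across it
            draw.line([(x0, y0), (x1, y1)], fill=c0, width=width)
        return out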

It kinda worked, in that the edges came out clearly; the resulting painting was messy tho. The thick brushstrokes and the colours are intentional, but the brushstrokes going in random directions are not: https://media.hachyderm.io/media_attachments/files/115/894/8... Compare to an actual fauvist work, https://en.wikipedia.org/wiki/Robert_Delaunay#/media/File:Ro... which still has the 'dithered' look to the face, but there the strokes are deliberately aligned; I'd fix that if I try again. (Delaunay also uses small strokes for detail, a thing I wasn't going to try in a throwaway program.)

An earlier attempt at generating pencil sketches from photos - again, keeping the number of strokes small, and using parallel strokes for hatching - worked much better https://media.hachyderm.io/media_attachments/files/112/767/4... (it's just sobel to find edges and a bit of sampling with a filter to decide where to shade)


You would find things in there that were already close to QM and relativity. The Michelson-Morley experiment was 1887, and the FitzGerald length contraction (which grew into the Lorentz transformations) came along in 1889. The photoelectric effect (which Einstein explained in terms of photons in 1905) was also discovered in 1887. William Clifford (who _died_ in 1879) had notions that foreshadowed general relativity: "Riemann, and more specifically Clifford, conjectured that forces and matter might be local irregularities in the curvature of space, and in this they were strikingly prophetic, though for their pains they were dismissed at the time as visionaries." - Banesh Hoffmann (1973)

Things don't happen all of a sudden, and with the ability to see all the scientific papers of the era, it's possible those could have fallen out of the synthesis.


I presume that's what the parent post is trying to get at? Seeing if, given the cutting-edge scientific knowledge of the day, the LLM is able to synthesise it all into a workable theory of QM by making the necessary connections and (quantum...) leaps.

Standing on the shoulders of giants, as it were


But that's not the OP's challenge, he said "if the model comes up with anything even remotely correct." The point is there were things already "remotely correct" out there in 1900. If the LLM finds them, it wouldn't "be quite a strong evidence that LLMs are a path to something bigger."

It's not the comment which is illogical, it's your (mis)interpretation of it. What I (and seemingly others) took it to mean is basically could an LLM do Einstein's job? Could it weave together all those loose threads into a coherent new way of understanding the physical world? If so, AGI can't be far behind.

This alone still wouldn't be a clear demonstration that AGI is around the corner. It's quite possible an LLM could've done Einstein's job, if Einstein's job was truly just synthesising already available information into a coherent new whole. (I couldn't say, I don't know enough of the physics landscape of the day to claim either way.)

It's still unclear whether this process could be merely continued, seeded only with new physical data, in order to keep progressing beyond that point, "forever", or at least for as long as we imagine humans will continue to go on making scientific progress.


Einstein is chosen in such contexts because he's the paradigmatic paradigm-shifter. Basically, what you're saying is: "I don't know enough history of science to confirm this incredibly high opinion on Einstein's achievements. It could just be that everyone's been wrong about him, and if I'd really get down and dirty, and learn the facts at hand, I might even prove it." Einstein is chosen to avoid exactly this kind of nit-picking.

They can also choose Euler or Gauss.

These two are so above everyone else in the mathematical world that most people would struggle for weeks or even months to understand something they did in a couple of minutes.

There's no "get down and dirty" shortcut with them =)


No, by saying this, I am not downplaying Einstein's sizeable achievements nor trying to imply everyone was wrong about him. His was an impressive breadth of knowledge and mathematical prowess and there's no denying this.

However, what I'm saying is not mere nitpicking either. It is precisely because of my belief in Einstein's extraordinary abilities that I find it unconvincing that an LLM being able to recombine the extant written physics-related building blocks of 1900, with its practically infinite reading speed, necessarily demonstrates comparable capabilities to Einstein.

The essence of the question is this: would Einstein, having been granted eternal youth and a neverending source of data on physical phenomena, be able to innovate forever? Would an LLM?

My position is that even if an LLM is able to synthesise special relativity given 1900 knowledge, this doesn't necessarily mean that a positive answer to the first question implies a positive answer to the second.


I'm sorry, but 'not being surprised if LLMs can rederive relativity and QM from the facts available in 1900' is a pretty scalding take.

This would absolutely be very good evidence that models can actually come up with novel, paradigm-shifting ideas. It was absolutely not obvious at that time from the existing facts, and some crazy leaps of faith needed to be taken.

This is especially true for General Relativity, for which you had just a few mismatches in the measurements, like Mercury's precession, and where the theory almost entirely follows from thought experiments.


Isn't it an interesting question? Wouldn't you like to know the answer? I don't think anyone is claiming anything more than an interesting thought experiment.

This does make me think about Kuhn's concept of scientific revolutions and paradigms, and that paradigms are incommensurate with one another. Since new paradigms can't be proven or disproven by the rules of the old paradigm, if an LLM could independently discover paradigm shifts similar to moving from Newtonian gravity to general relativity, then we have empirical evidence of an LLM performing a feature of general intelligence.

However, you could also argue that it's actually empirical evidence that the move from 19th century physics to general relativity wasn't truly a paradigm shift -- you could have 'derived' it from previous data -- and that the LLM has actually proven something about structural similarities between those paradigms, not that it's demonstrating general intelligence...


His concept sounds odd. There will always be many hints of something yet to be discovered, simply by the nature of anything worth discovering having an influence on other things.

For instance, spectroscopy enables one to look at the spectra emitted by another 'thing', perhaps the sun, and it turns out that there are little streaks within the spectra that correspond directly to various elements. This is how we're able to determine the elemental composition of things like the sun.

That connection between elements and the patterns in their spectra was discovered in the early 1800s. And those patterns are caused by quantum mechanical interactions and so it was perhaps one of the first big hints of quantum mechanics, yet it'd still be a century before we got to relativity, let alone quantum mechanics.


You should read it

I mean, "the pieces were already there" is true of everything? Einstein was synthesizing existing math and existing data is your point right?

But the whole question is whether or not something can do that synthesis!

And the "anyone who read all the right papers" thing - nobody actually reads all the papers. That's the bottleneck. LLMs don't have it. They will continue to not have it. Humans will continue to not be able to read faster than LLMs.

Even me, using a speech synthesizer at ~700 WPM.


> I mean, "the pieces were already there" is true of everything? Einstein was synthesizing existing math and existing data is your point right?

If it's true of everything, then surely having an LLM work iteratively on the pieces, along with being provided additional physical data, will lead to the discovery of everything?

If the answer is "no", then surely something is still missing.

> And the "anyone who read all the right papers" thing - nobody actually reads all the papers. That's the bottleneck. LLMs don't have it. They will continue to not have it. Humans will continue to not be able to read faster than LLMs.

I agree with this. This is a definitive advantage of LLMs.


Einstein-level ability is not AGI, nor is AGI Einstein-level ability.

AGI is human level intelligence, and the minimum bar is Einstein?

Who said anything of a minimum bar? "If so", not "Only if so".

Actually it's worse than that, the comment implied that Einstein wouldn't even qualify for AGI. But I thought the conversation was pedantic enough without my contribution ;)

I think the problem is the formulation "If so, AGI can't be far behind". I think that if a model were advanced enough such that it could do Einstein's job, that's it; that's AGI. Would it be ASI? Not necessarily, but that's another matter.

The phone in your pocket can perform arithmetic many orders of magnitude faster than any human, even the fringe autistic savant type. Yet it's still obviously not intelligent.

Excellence at any given task is not indicative of intelligence. I think we set these sort of false goalposts because we want something that sounds achievable but is just out of reach at one moment in time. For instance at one time it was believed that a computer playing chess at the level of a human would be proof of intelligence. Of course it sounds naive now, but it was genuinely believed. It ultimately not being so is not us moving the goalposts, so much as us setting artificially low goalposts to begin with.

So for instance what we're speaking of here is logical processing across natural language, yet human intelligence predates natural language. It poses a bit of a logical problem to then define intelligence as the logical processing of natural language.


The problem is that so far, SOTA generalist models are not excellent at just one particular task. They have a very wide range of tasks they are good at, and a good score on one particular benchmark correlates very strongly with good scores on almost all other benchmarks, even esoteric benchmarks that AI labs certainly didn't train against.

I'm sure, without any uncertainty, that any generalist model able to do what Einstein did would be AGI, as in, that model would be able to perform any cognitive task that an intelligent human being could complete in a reasonable amount of time (here "reasonable" depends on the task at hand; it could be minutes, hours, days, years, etc).


I see things rather differently. Here's a few points in no particular order:

(1) - A major part of the challenge is in not being directed towards something. There was no external guidance for Einstein - he wasn't even a formal researcher at the time of his breakthroughs. An LLM might be able to be handheld towards relativity, though I doubt it, but given the prompt of 'hey find something revolutionary' it's obviously never going to respond with anything relevant, even with substantially greater precision specifying field/subtopic/etc.

(2) - Logical processing of natural language remains one small aspect of intelligence. For example - humanity invented natural language from nothing. The concept of an LLM doing this is a nonstarter since they're dependent upon token prediction, yet we're speaking of starting with 0 tokens.

(3) - LLMs are, in many ways, very much like calculators. They can indeed achieve some quite impressive feats in specific domains, yet then they will completely hallucinate nonsense on relatively trivial queries, particularly on topics where there isn't extensive data to drive their token prediction. I don't entirely understand your extreme optimism towards LLMs given this proclivity for hallucination. Their ability to produce compelling nonsense makes them particularly tedious for using to do anything you don't already effectively know the answer to.


> I don't entirely understand your extreme optimism towards LLMs given this proclivity for hallucination

Simply because I don't see hallucinations as a permanent problem. I see that models keep improving more and more in this regard, and I don't see why the hallucination rate can't be arbitrarily reduced with further improvements to the architecture. When I ask Claude about obscure topics, it correctly replies "I don't know", where past models would have hallucinated an answer. When I use GPT 5.2-thinking for my ML research job, I pretty much never encounter hallucinations.


Hahah, well you working in the field probably explains your optimism more than your words! If you pretty much never encounter hallucinations with GPT then you're probably dealing with it on topics where there's less of a right or wrong answer. I encounter them literally every single time I start trying to work out a technical problem with it.

Well the "prompt" in this case would be Einstein's neurotype and all his life experiences. Might a bit long for the current context windows though ;)

LLMs don't make inferential leaps like that

I think it's not productive to just have the LLM sit like Mycroft in his armchair and, from there, return you an excellent expert opinion.

That's not how science works.

The LLM would have to propose experiments (which would have to be simulated), and then develop its theories from that.

Maybe there had been enough facts around to suggest a number of hypotheses, but the LLM in its current form won't be able to confirm them.


Yeah but... we still might not know whether it could do that because we were really close by 1900, or because the LLM is very smart.

What's the bar here? Does anyone say "we don't know if Einstein could do this because we were really close or because he was really smart?"

I by no means believe LLMs are general intelligence, and I've seen them produce a lot of garbage, but if they could produce these revolutionary theories from only <= year 1900 information and a prompt that is not ridiculously leading, that would be a really compelling demonstration of their power.


> Does anyone say "we don't know if Einstein could do this because we were really close or because he was really smart?"

It turns out my reading is somewhat topical. I've been reading Rhodes' "The Making of the Atomic Bomb", and one of the things he takes great pains to argue (I was not quite anticipating how much I'd be trying to recall my high school science classes to make sense of his account of various experiments) is that the development toward the atomic bomb was more or less inexorable, and that if at any point someone said "this is too far; let's stop here" there would be others to take his place. So, maybe, to answer your question.


It’s been a while since I read it, but I recall Rhodes’ point being that once the fundamentals of fission in heavy elements were validated, making a working bomb was no longer primarily a question of science, but one of engineering.

Engineering began before they were done with the experimentation and theorizing part. But the US, the UK, France, Germany, the Soviets, and Japan all had nuclear weapons programs with different degrees of success.

> Does anyone say "we don't know if Einstein could do this because we were really close or because he was really smart?"

Yes. It is certainly a question whether Einstein was one of the smartest guys who ever lived, or whether all of his discoveries were already in the Zeitgeist and would have been discovered by someone else within ~5 years.


Both can be true?

Einstein was smart and put several disjointed things together. It's amazing that one person could do so much, from explaining Brownian motion to explaining the photoelectric effect.

But I think that all these would have happened within _years_ anyway.


> Does anyone say "we don't know if Einstein could do this because we were really close or because he was really smart?"

Kind of: how long would it realistically have taken for someone else (also really smart) to come up with the same thing if Einstein hadn't been there?


But you're not actually questioning whether he was "really smart". Which was what GP was questioning. Sure, you can try to quantify the level of smarts, but then you can't call it a "stochastic parrot" anymore, just like you won't respond to Einstein's achievements with, "Ah well, in the end I'm still not sure he's actually smart, like I am for example. Could just be that he's just dumbly but systematically going through all options, working it out step by step, nothing I couldn't achieve (or even better, program a computer to do) if I'd put my mind to it."

I personally doubt that this would work. I don't think these systems can achieve truly ground-breaking, paradigm-shifting work. The homeworld of these systems is the corpus of text on which they were trained, in the same way as ours is physical reality. Their access to this reality is always secondary, already distorted by the imperfections of human knowledge.


Well, we know many watershed moments in history were more a matter of situation than the specific person - an individual genius might move things by a decade or two, but in general the difference is marginal. True bolt-out-of-the-blue developments are uncommon, though all the more impressive for that fact, I think.

Well, if one had enough time and resources, this would make for an interesting metric. Could it figure it out with a cut-off of 1900? If so, what about 1899? 1898? What context from the marginal year was key to the change in outcome?

It's only easy to see precursors in hindsight. The Michelson-Morley tale is a great example of this. In hindsight, their experiment was screaming relativity, because it demonstrated that the speed of light was identical from two perspectives, in a way that's very difficult to explain without relativity. Lorentz contraction was just a completely ad-hoc proposal to maintain the assumptions of the time (luminiferous aether in particular) while also explaining the result. But in general it was not seen as that big of a deal.

There's a very similar parallel with dark matter in modern times. We certainly have endless hints to the truth that will be evident in hindsight, but for now? We are mostly convinced that we know the truth, perform experiments to prove that, find nothing, shrug, adjust the model to be even more esoteric, and repeat onto the next one. And maybe one will eventually show something, or maybe we're on the wrong path altogether. This quote, from Michelson in 1894 (more than a decade before Einstein would come along), is extremely telling of the opinion at the time:

"While it is never safe to affirm that the future of Physical Science has no marvels in store even more astonishing than those of the past, it seems probable that most of the grand underlying principles have been firmly established and that further advances are to be sought chiefly in the rigorous application of these principles to all the phenomena which come under our notice. It is here that the science of measurement shows its importance — where quantitative work is more to be desired than qualitative work. An eminent physicist remarked that the future truths of physical science are to be looked for in the sixth place of decimals." - Michelson 1894


With the passage of time more and more things have been discovered through precision. Through identifying small errors in some measurement and pursuing that to find the cause.

It's not precision that's the problem, but understanding when something has been falsified. For instance the Lorentz transformations work as a perfectly fine ad-hoc solution to Michelson's discovery. All it did was make the aether a bit more esoteric in nature. Why do you then not simply shrug, accept it, and move on? Perhaps even toss some accolades towards Lorentz for 'solving' the puzzle? Michelson himself certainly felt there was no particularly relevant mystery outstanding.

For another parallel our understanding of the big bang was, and probably is, wrong. There are a lot of problems with the traditional view of the big bang with the horizon problem [1] being just one among many - areas in space that should not have had time to interact behave like they have. So this was 'solved' by an ad hoc solution - just make the expansion of the universe go into super-light speed for a fraction of a second at a specific moment, slow down, then start speeding up again (cosmic inflation [2]) - and it all works just fine. So you know what we did? Shrugged, accepted it, and even gave Guth et al a bunch of accolades for 'solving' the puzzle.

This is the problem - arguably the most important principle of science is falsifiability. But when is something falsified? Because in many situations, probably the overwhelming majority, you can instead just use one falsification to create a new hypothesis with that nuance integrated into it. And as science moves beyond singular formulas derived from clear principles or laws and onto broad encompassing models based on correlations from limited observations, this becomes more and more true.

[1] - https://en.wikipedia.org/wiki/Horizon_problem

[2] - https://en.wikipedia.org/wiki/Cosmic_inflation


This would still be valuable even if the LLM only finds out about things that are already in the air.

It’s probably even more of a problem that different areas of scientific development don’t know about each other. LLMs combining results would still not be the same as inventing something new.

But if they could give us a head start of 20 years on certain developments this would be an awesome result.


Then that experiment is even more interesting, and should be done.

My own prediction is that the LLMs would totally fail at connecting the dots, but a small group of very smart humans can.

Things don't happen all of a sudden, but they also don't happen everywhere. Most people in most parts of the world would never connect the dots. Scientific curiosity is something valuable and fragile, that we just take for granted.


One of the reasons they don’t happen everywhere is because there are just a few places at any given point in time where there are enough well connected and educated individuals who are in a position to even see all the dots, let alone connect them. This doesn’t discount the achievement if an LLM also manages to, but I think it’s important to recognise that having enough giants in sight is an important prerequisite to standing on their shoulders.

If (as you seem to be suggesting) relativity was effectively lying there on the table waiting for Einstein to just pick it up, how come it blindsided most, if not quite all, of the greatest minds of his generation?

That's the case with all scientific discoveries - pieces of prior work get accumulated, until it eventually becomes obvious[0] how they connect, at which point someone[1] connects the dots, making a discovery... and putting it on the table, for the cycle to repeat anew. This is, in a nutshell, the history of all scientific and technological progress. Accumulation of tiny increments.

--

[0] - To people who happen to have the right background and skill set, and are in the right place.

[1] - Almost always multiple someones, independently, within short time of each other. People usually remember only one or two because, for better or worse, history is much like patent law: first to file wins.


Science often advances by accumulation, and it’s true that multiple people frequently converge on similar ideas once the surrounding toolkit exists. But “it becomes obvious” is doing a lot of work here, and the history around relativity (special and general) is a pretty good demonstration that it often doesn’t become obvious at all, even to very smart people with front-row seats.

Take Michelson in 1894: after doing (and inspiring) the kind of precision work that should have set off alarm bells, he’s still talking like the fundamentals are basically done and progress is just “sixth decimal place” refinement.

"While it is never safe to affirm that the future of Physical Science has no marvels in store even more astonishing than those of the past, it seems probable that most of the grand underlying principles have been firmly established and that further advances are to be sought chiefly in the rigorous application of these principles to all the phenomena which come under our notice. It is here that the science of measurement shows its importance — where quantitative work is more to be desired than qualitative work. An eminent physicist remarked that the future truths of physical science are to be looked for in the sixth place of decimals." - Michelson 1894

The Michelson-Morley experiments weren't obscure, they were famous, discussed widely, and their null result was well-known. Yet for nearly two decades, the greatest physicists of the era proposed increasingly baroque modifications to existing theory rather than question the foundational assumption of absolute time. These weren't failures of data availability or technical skill, they were failures of imagination constrained by what seemed obviously true about the nature of time itself.

Einstein's insight wasn't just "connecting dots" here, it was recognizing that a dot everyone thought was fixed (the absoluteness of simultaneity) could be moved, and that doing so made everything else fall into place.

People scorn the 'Great Man Hypothesis' so much they sometimes swing too much in the other direction. The 'multiple discovery' pattern you cite is real but often overstated. For Special Relativity, Poincaré came close, but didn't make the full conceptual break. Lorentz had the mathematics but retained the aether. The gap between 'almost there' and 'there' can be enormous when it requires abandoning what seems like common sense itself.


Sure - and climbing a mountain is just putting one foot down higher than it was before and repeating, once you abstract away all the hard parts.

It is. If you're at the mountain, on the right trail, and have the right clothing and equipment for the task.

That's why those tiny steps of scientific and technological progress aren't made by just any randos - they're made by people who happen to be at the right place and time, and equipped correctly to be able to take the step.

The important corollary to this is that you can't generally predict this ahead of time. Someone like Einstein was needed to nail down relativity, but standing there a few years earlier, you couldn't have predicted that it was Einstein who would make the breakthrough, nor what it would be about. Conversely, if Einstein had lived 50 years earlier, he wouldn't have come up with relativity, because the necessary prerequisites - knowledge, people, environment - weren't there yet.


You are describing hiking in the mountains, which doesn’t generalize to mountaineering and rock-climbing when it gets difficult, and the difficulties this view is abstracting away are real.

Your second and third paragraphs are entirely consistent with the original point I was trying to make, which was not that it took Einstein specifically to come up with relativity, but that it took someone with uncommon skills, as evidenced by the fact that it blindsided even a good many of the people who were qualified to be contenders for being the one to figure it out first. It does not amount to proof, but one does not expect people who are closing in on the solution to be blindsided by it.

I am well aware of the problems with “great man” hagiography, but dismissing individual contributions, which is what the person I was replying to seemed to be doing, is a distortion in its own way.


With LLMs the synthesis cycles could happen at a much higher frequency. Decades condensed to weeks or days?

I imagine the likely brakes on that conjectured synthesis speedup being experimentation and acceptance by the scientific community. AIs can come up with new ideas every day, but Nature won't publish those ideas for years.


I agree, but it's important to note that QM has no clear formulation until 2025/6, it's like 20 years more of work than SR.

2025/6?

* 1925/6, sorry, bad century.

They were close, but it required the best people bashing their heads against each other for years until they got it.

That is the point.

New discoveries don’t happen in a vacuum.


You can get pretty far by modeling only frictionless, spherical discoveries in a vacuum.

It's been shared here many times, but Terence Eden has a great anecdote about how the UK's GDS standards - lightweight, simple html - meant the site was usable even on a crappy PSP https://shkspr.mobi/blog/2021/01/the-unreasonable-effectiven...


It looks like you have typos? (x^2+y^2)+(sin(atan(x/y))*cos(atan(x/y))) reduces to x^2+y^2+( (x/y) / (x^2/y^2 + 1) ) - not the equation given? Tho it's easier to see that this would be symmetrical if you rearrange it to: x^2+y^2+( (xy) / (x^2+y^2) )

Also, if f(x,y) = x^2+y^2+( (x/y) / (x^2+y^2) ) then f(2,1) is 5.4 and f(1,2) is 5.1? - this is how I noticed the mistake. (The other reduction gives the same answer, 5.4, for both, by symmetry, as you suggest.)
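Quick check in Python, if anyone wants to verify the reduction and the asymmetry (printed values are up to float rounding):

    from math import sin, cos, atan

    def original(x, y):    # the formula as posted
        return (x**2 + y**2) + sin(atan(x / y)) * cos(atan(x / y))

    def symmetric(x, y):   # the reduction: x^2 + y^2 + xy/(x^2 + y^2)
        return x**2 + y**2 + (x * y) / (x**2 + y**2)

    def asymmetric(x, y):  # the (x/y)/(x^2 + y^2) variant
        return x**2 + y**2 + (x / y) / (x**2 + y**2)

    print(original(2, 1), original(1, 2))      # ~5.4 ~5.4
    print(symmetric(2, 1), symmetric(1, 2))    # ~5.4 ~5.4
    print(asymmetric(2, 1), asymmetric(1, 2))  # ~5.4 ~5.1 - not symmetric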

There's a simpler solution which produces integer ids (though they are large): 2^x | 2^y, with | being bitwise or - ie set bit x and bit y. Another solution is to multiply the xth and yth primes.

I only looked because I was curious how you proved it unique!


Hhhhmm. Ok. So I invented this solution in 2009 at what you might call a "peak mental moment", by a pool in Palm Springs, CA, after about 6 hours of writing on napkins. I'm not a mathematician. I don't think I'm even a great programmer, since there are probably much better ways of solving the thing I was trying to solve. And also, I'm not sure how I even came up with the reduction; I probably was wrong or made a typo (missing the +1?), and I'm not even certain how I could come up with it again.

2^x | 2^y ...is the | a bitwise operator...???? That would produce a unique ID? That would be very interesting, is that provable?

Primes take too much time.

The thing I was trying to solve was: I had written a bitcoin poker site from scratch, and I wanted to determine whether any players were colluding with each other. There were too many combinations of players on tables to analyze all their hands versus each other rapidly, so I needed to write a nightly cron job that collated their betting patterns 1 vs 1, 1 vs 2, 1 vs 3... any time 2 or 3 or 4 players were at the same table, I wanted to have a unique signature for that combination of players, regardless of which order they sat in at the table or which order they played their hands in. All the data for each player's action was in a SQL table of hand histories, indexed by playerID and tableID, with all the other playerIDs in the hand in a separate table. At the time, at least, I needed a faster way to query that data so that I could get a unique id from a set of playerIDs that would pull just the data from this massive table where all the same players were in a hand, without having to check the primary playerID column for each one. That was the motivation behind it.

It did work. I'm glad you were curious. I think I kept it as the original algorithm, not the reduced version. But I was much smarter 15 years ago... I haven't had an epiphany like that in awhile (mostly have not needed to, unfortunately).


The typo is most likely the extra /, in (x/y)/(x^2+y^2) instead of (xy)/(x^2+y^2).

`2^x | 2^y ...is the | a bitwise operator...???? That would produce a unique ID? That would be very interesting, is that provable?`

Yes, | is bitwise or: 2^x | 2^y just sets bit x and bit y. It's treating your players as a bit vector. It's not so much provable as a tautology - it is exactly the property that players x and y are present. It's not _useful_ tho, because the field size you'd need to hold the bit vector is enormous.

As for the problem...it sounds bloom-filter adjacent (a bloom filter of players in a hand would give a single id with a low probability of collision for a set of players; you'd use this to accelerate exact checks), but also like an indexed many-to-many table might have done the job, but all depends on what the actual queries you needed to run were, I'm just idly speculating.


At the time, at least, there was no way to index it for all 8 players involved in a hand. Each action taken would be indexed to the player that took it, and I'd need to sweep up adjacent actions for other players in each hand, but only the players who were consistently in lots of hands with that player. I've heard of bloom filters (now, not in 2012)... makes some sense. But the idea was to find some vector that made any set of players unique when running through a linear table, regardless of the order they presented in.

To that extent, I submit my solution as possibly being the best one.

I'm still a bit perplexed by why you say 2^x | 2^y is tautologically sound as a unique way to map f(x,y)==f(y,x), where x and y are nonequal integers. Throwing in the bitwise | makes it seem less safe to me. Why is that provably never replicable between any two pairs of integers?


I'm saying it's a tautology because it's just a binary representation of the set. Suppose we have 8 players, with x and y being 2 and 4: set bits 2 and 4 (ie 2^2 | 2^4 = 4 + 16) and you have 00010100.
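Or, in Python (hypothetical player numbers, nothing from your schema):

    def players_bitmask(players):
        # order-independent id for a set of players: set bit p for each player p
        mask = 0
        for p in players:
            mask |= 1 << p          # 1 << p is 2^p
        return mask

    print(bin(players_bitmask([2, 4])))                         # 0b10100, ie 20
    print(players_bitmask([2, 4]) == players_bitmask([4, 2]))   # True - order doesn't matter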

But to lay it out: every positive integer is a sum of powers of 2. (this is obvious, since every number is a sum of 1s, ie 2^0). But also every number is a sum of _distinct_ powers of 2: if there are 2 identical powers 2^a+2^a in the sum, then they are replaced by 2^(a+1), this happens recursively until there are no more duplicated powers of 2.

It remains to show that each number has a unique binary representation, ie that there are no two numbers x=2^x1+2^x2+... and y=2^y1+2^y2+... that have the same sum, x=y, but built from different powers. Suppose we have a smallest such number, and x1, y1 are the largest powers in each representation. Then x1 != y1, because if they were equal we could subtract 2^x1 from both and get an _even smaller_ number with two distinct representations, a contradiction. So either x1 < y1 or y1 < x1; suppose without loss of generality it's the first (we can just swap labels). Then x <= 2^(x1+1)-1 (just summing all powers of 2 from 2^0 up to 2^x1), but y >= 2^y1 >= 2^(x1+1) > x, a contradiction.

or, tl;dr just dealing with the case of 2 powers: we want to disprove that there exists a,b,c,d such that

2^a + 2^b = 2^c + 2^d, a>b, c>d, and (a,b) != (c,d).

Suppose a = c, then subtract 2^a from both sides and we have 2^b = 2^d, so b=d, a contradiction.

Suppose a>c; then a >= c+1.

2^c + 2^d < 2^c + 2^c = 2^(c+1).

so

2^c + 2^d <= 2^(c+1) - 1 < 2^(c+1) + 2^b <= 2^a + 2^b

a contradiction.


Thanks for the great response. Honestly, TIL that 2^0 = 1. That was a new one for me and I'm not sure I understand it. I failed pre-Calculus, twice.

Visually I think I can understand the bitwise version now, from reading this. But it wouldn't work for 3 integers, would it?


it works for any number of integers. The first proof above (before tl;dr) is showing that every positive integer has a unique representation as a sum of distinct powers of 2, ie binary, and that no two integers have the same representation. You can watch a lecture about the representation of sets in binary here https://www.youtube.com/watch?v=Iw21xgyN9To (google representing sets with bits for way more like this)

But again it's not useful in practice for very sparse sets: if you have say a million players, with at most 10 at the same poker table, setting 10 bits of a million-bit binary number is super wasteful. Even representing the players as fixed size 20-bit numbers (1 million in binary is 20 bits long), and appending the 10 sorted numbers, means you don't need more than 200 bits to represent this set.

And you can go much smaller if all you want is to label a _bucket_ that includes this particular set; just hash the 10 numbers to get a short id. Then to query faster for a specific combination of players you construct the hash of that group, query to get everything in that bucket (which may include false positives), then filter this much smaller set of answers.
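A sketch of that idea (the hash and bucket width here are arbitrary choices; anything fetched by bucket id still needs an exact check against the player list, since collisions are possible):

    import hashlib

    def hand_bucket(player_ids, bits=32):
        # order-independent bucket id for the set of players in a hand
        key = ",".join(str(p) for p in sorted(player_ids)).encode()
        digest = hashlib.blake2b(key, digest_size=8).digest()
        return int.from_bytes(digest, "big") % (1 << bits)

    # same bucket regardless of seating order
    assert hand_bucket([1201, 1803, 2903]) == hand_bucket([2903, 1201, 1803])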


BTW, yet another way to do it (more compact than the bitwise and prime options) is the Cantor pairing function https://en.wikipedia.org/wiki/Pairing_function

... z = (x+y+1)(x+y)/2 + y - but you have to sort x,y first to get the order independence you wanted. This function is famously used in the argument that the set of integers and the set of rationals have the same cardinality.
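In Python (sorting the pair first for the order independence):

    def cantor_pair(x, y):
        # a bijection from pairs of non-negative integers to non-negative integers
        return (x + y) * (x + y + 1) // 2 + y

    def unordered_pair_id(a, b):
        x, y = sorted((a, b))
        return cantor_pair(x, y)

    print(unordered_pair_id(1201, 1803), unordered_pair_id(1803, 1201))  # same id both ways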


mm. I did see this when I was figuring it out. The sorting first was the specific thing I wanted to avoid, because it would've been by far the most expensive part of the operation when looking at a million poker hands and trying to target several players for potential collusion.


you're only sorting players within a single hand. so a list of under 10 items? that's trivial


So the goal was to generate signatures for 2, 3 or more players and then be able to reference anything in the history table that had that combination of players without doing a full scan and cross-joining the same table multiple times. Specifically to avoid having ten index columns in the history table for each seat's player. This was also prior to JSON querying in mysql. I needed a way to either bake in the combinations at write time, or to generate a unique id at read time in a way that wouldn't require me to query whether playerIDs were [1201,1803,2903] or [1803,1201,2903] etc. Just a one-shot unique signature for that combination of players that could always evaluate the same regardless of the order. If that makes sense. There were other considerations and this was not exactly how it worked, since only certain players were flagged and I was looking for patterns when those particular players were on the same table. It wasn't like every combination of players had a unique id, just a few combinations where I needed to be able to search over a large space to find when they were in the same room together, but disregarding the order they were listed in.


The boids in this demo form smaller flocks (both tighter, and fewer individuals) than other implementations I've seen. I had a look at the code and I'm not sure it's "right"? (I know the whole thing is subjective, I mean it doesn't follow the original)

In no particular order:

- the original boids code didn't cap the magnitude of acceleration after adding all the contributions; it added the contributions _by priority_, starting with separation, and if the max acceleration was exceeded it didn't add the others

- the max acceleration was decided by adding the _magnitudes_ of the components, not the vectors, so the cohesion vector and the separation vector wouldn't cancel out - separation would win. I think this is why both this and the p5js one form very tight clumps which later "explode". That's this bit of the code (from https://www.red3d.com/cwr/code/boids.lisp):

    ;;
    ;; Available-acceleration should probably be 1.0, but I've set it a little higher to
    ;; avoid have to readjust the weighting factors for all of the acceleration requests.
    ;;
    (vlet* ((composite-acceleration (prioritized-acceleration-allocation 1.3 ;; 1.0
          avoid-obstacles
          avoid-flockmates
          velocity-matching
          centering-urge
          migratory-urge
          course-leveling
          course-damping))
- this implementation, unlike the p5js version it's based on, caps the acceleration _twice_ - before adding the contributions and after https://github.com/LauJensen/practical-quadtree/blob/7f5bdea... (this is the 'after' bit)

- the original had different radii for different components (the separation threshold was 4, the cohesion, alignment thresholds were 5)

- both the clojure and p5js versions use the same strength for cohesion, separation, and alignment. In the original, separation is much stronger (1, vs 0.3 for cohesion and 0.23 for alignment). Again this might explain the tight clumps.

I've not yet mucked with the rules to see if the behaviour recovers, but the p5js version makes it easy to hack on https://editor.p5js.org/pattvira/sketches/v_tmN-BC5 - as a first change, in draw() in sketch.js change the print statement to this:

    // adjust the flock size to hold the frame rate between 40 and 50 fps
    print(frameRate(), boids.length);
    if (frameRate() > 50) {
      boids.push(new Boid(random(width), random(height)));  // headroom: add a boid
    } else if (frameRate() < 40) {
      boids.pop();  // too slow: drop one
    }
... and the two loops below it to use 'boids.length' not 'num'. Then the thing will dynamically adjust the number of boids to give you an acceptable framerate.

Aside: both the p5 and clojure versions do preserve the typo of 'seperation' from Craig Reynolds' code tho ;) ... I have to wonder if that's like 'referer' and now we have to spell it that way in a boids context.


Thank you - I was just about to point out some of that.

The reason that the flocks are tight is because the separation "force" is normally computed as a repulsion between a target boid and all other nearby boids individually, not vs. the center of mass of all nearby boids.


On the subject of whisper being great... A few weeks ago a co-worker commented about the difficulty he'd had editing a work demo; I pointed him at various jump-cutting tools that have automated what he'd been doing by hand (editing out silences). But I'd also wanted to play with whisper for a while...

So a couple of hours later I'd written a script that does transcription-based editing: on the first pass it grabs a timestamped transcript and a plain-text transcript for editing; you edit the words into any order you like, and a second pass reassembles the video (it's just a couple of hundred lines of python wrapping whisper and ffmpeg). It also speeds up, by 4x, any detected silences that sit within retained sequences in the video.

Matching up transcripts turns out to be not that hard: I normalise the text, split it, and then compare it to the sequence of normalised words from the timestamped transcript. I find the longest common sequence, keep that, then recurse on the before/after sections (there's a little more detail, but not much). I also send the transcription to ffmpeg to burn in as captions, because the editing sometimes makes the audio choppy and the captions make it easier to follow.
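That matching pass is close to what Python's difflib does internally, so a minimal sketch (not my actual script) can lean on it; assume `timed` is whisper's word-level output as (word, start, end) tuples and `edited` is the reordered plain-text transcript:

    import difflib, re

    def norm(word):
        return re.sub(r"[^a-z0-9']", "", word.lower())

    def keep_segments(timed, edited):
        # timed:  list of (word, start, end) from whisper's word timestamps
        # edited: the plain-text transcript after the user has cut/reordered words
        # returns (start, end) times of the spans to keep
        a = [norm(w) for w, _, _ in timed]
        b = [norm(w) for w in edited.split()]
        segments = []
        # SequenceMatcher finds the longest matching block, then recurses on the
        # pieces before and after it - the same divide-and-conquer described above
        matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
        for m in matcher.get_matching_blocks():
            if m.size:
                segments.append((timed[m.a][1], timed[m.a + m.size - 1][2]))
        return segments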

I know, tools have been doing this for years now. I just didn't have one to hand, and now I do, and I couldn't have done this without whisper.


That is absolutely awesome and I love hearing about the tools that people build themselves!

Honestly, the capabilities of whisper are insane; the fact that it's free and open source is really a gift. Some of the things it can do feel almost sci-fi.

If you ever decide to release it publicly please let me know, sounds like a very useful tool.


"release" is maybe too strong a word, it's not a lot of code and I don't plan to put any more effort into the nonexistent interface since it was just built for personal use. But the code:

https://gist.github.com/bazzargh/e1d2e2718af575a03206114a291...


This is very kind of you, thanks.


I spent a few hours editing a video in DaVinci Resolve to do this by hand. Then I found out this is a built-in feature.


That's an odd take. Teams doesn't have the leading market share in videoconferencing, Zoom does. I can't judge what it's like because I've never yet had to use Teams - not a single company that we deal with uses it, it's all Zoom and Chime - but I do hear friends who have to use it complain about it all the time. (Zoom is better than it used to be, but for all that is holy please get rid of the floating menu when we're sharing screens)


One of the results for hilbert curve marble tracks, mentioned elsewhere in the thread, was a video showing how to make one in blender, which has a physics engine so it can simulate it pretty well.

https://www.youtube.com/watch?v=8YeXyUNCnhM

I'd imagine that the 3d-printable models could be imported into blender, so it's 'just' adding balls and motion to the lift.


You can simulate everything in professional (and expensive) software like this:

https://ansyshelp.ansys.com/public/account/secured?returnurl...

But for hobby purposes I would suggest contacting a university: they have such software, and they could find simulating ball motion in a marble fountain interesting for research (and educational) purposes.


You can also get reasonable results from using quasirandom sequences https://extremelearning.com.au/unreasonable-effectiveness-of... - which are trivial to generate.
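The mask really is only a couple of lines - something like this, based on the R2 sequence from that article (a sketch; it assumes numpy and a greyscale image with values in [0,1]):

    import numpy as np

    # plastic constant: the real root of x^3 = x + 1, used by the R2 sequence
    p = 1.32471795724474602596
    a1, a2 = 1.0 / p, 1.0 / p ** 2

    def r2_dither(gray):
        # 1-bit dither: threshold each pixel against a quasirandom mask value
        h, w = gray.shape
        ys, xs = np.mgrid[0:h, 0:w]
        mask = (0.5 + xs * a1 + ys * a2) % 1.0
        return (gray > mask).astype(np.uint8)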

That's the kind of thing I use for dithering on the BBC Micro, because it's such a cheap technique: here in a thread directly comparing it to Bayer-like dithering https://hachyderm.io/@bbcmicrobot@mastodon.me.uk/11200546490... or here faking the Windows XP desktop https://hachyderm.io/@bbcmicrobot@mastodon.me.uk/11288651013...


Also, disabling scrolling and using swipe for sections instead, _at a font size that causes text to overflow_ depending on phone screen size, means a bunch of the site is _literally_ unreadable, since it's off the screen with no way to get there.

