BTW you may have noticed that the paper multiplication method we all learned in third grade is digit-by-digit convolution, which is an O(n^2) operation. But what does convolution in the time domain become when we switch to the frequency domain? It becomes pointwise multiplication, which is O(n). So now you see why the FFT is essential here. The extra factors of log n and log(log n) come from taking the FFT itself in both the forward and reverse directions.
So fast multiplication is about looking at numbers as signals and doing signal processing on them.
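That equivalence is easy to check numerically. A toy sketch (numpy's floating-point FFT is used purely for illustration; real bignum libraries use exact integer-friendly transforms to avoid rounding):

```python
import numpy as np

# Digit sequences of 123 and 456, most significant digit first.
a = [1, 2, 3]
b = [4, 5, 6]

# Direct digit-by-digit convolution: O(n^2) pairwise products.
direct = np.convolve(a, b)

# Same thing via the frequency domain: pointwise multiplication is O(n),
# plus O(n log n) for the FFTs in the forward and reverse directions.
n = len(a) + len(b) - 1
via_fft = np.fft.ifft(np.fft.fft(a, n) * np.fft.fft(b, n)).real.round().astype(int)
assert (direct == via_fft).all()

# The convolution leaves digits larger than 9; propagate carries
# (least significant digit first) to read off the product.
carry, out = 0, []
for d in direct[::-1]:
    carry, r = divmod(int(d) + carry, 10)
    out.append(r)
while carry:
    carry, r = divmod(carry, 10)
    out.append(r)
print(int("".join(map(str, out[::-1]))))  # 56088, i.e. 123 * 456
```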
Much used in simplifying kernel operations and convolutions (and some other nifty tricks).
Another useful idea is that in the domain of the Fourier transform we have exponentials (Fourier series are series of $c_n e^{inx}$), and when multiplying exponentials we get $e^{ix} \cdot e^{iy} = e^{i(x + y)}$.
Moreover, this is usually coupled with integrating over a periodic signal (so it's integrated from 0 to 2π), and unless the frequencies cancel so that the integrand is $e^{i(x+y)} = e^0 = 1$, the integral comes out to 0 as well.
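A quick numerical sanity check of that cancellation, using nothing beyond the standard library (the step count for the Riemann sum is an arbitrary choice):

```python
import cmath

def integral(m, n, steps=10_000):
    """Numerically integrate e^{i(m+n)x} over [0, 2*pi]."""
    total = 0j
    dx = 2 * cmath.pi / steps
    for k in range(steps):
        total += cmath.exp(1j * (m + n) * k * dx) * dx
    return total

# Cancelling case: m + n = 0, so the integrand is e^0 = 1 and the
# integral over the full period is 2*pi.
print(abs(integral(3, -3)))  # ~6.283

# Any other pair integrates to (numerically) zero over the full period.
print(abs(integral(3, 2)))   # ~0
```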
Thank you for the nice explanation. It clears up an important point that the article was not quite able to. Even though it feels unintuitive to apply a Fourier transform to numbers for multiplication purposes, it works nicely. Fascinating.
> Regardless of what computers look like in the future, Harvey and van der Hoeven’s algorithm will still be the most efficient way to multiply.
This is only true if you just look at the big-O notation. However, if you also look at the constants involved, to get C * n * log(n) + C2 as an upper bound on the number of operations, there can be faster algorithms with lower constants C and C2. And those algorithms will also be of practical interest.
Also, as I understand it, it was not actually proven (yet) that O(n log(n)) is the fastest possible class.
You could also study the average runtime instead of the worst-case upper bound. There are cases where the multiplication can be much more efficient than O(n log(n)). An algorithm could be clever, detect many such cases, and maybe end up in a strictly better average runtime class. Or maybe it has been proven that this cannot happen; I'm not sure.
Similarly, sorting algorithms are still being optimized more and more, even though the worst-case comparison-sort complexity is fixed at Θ(n log n) (ignore integer sorting for now). But average and best-case runtime complexity are relevant. And memory usage complexity as well. And other properties. And of course also those constants, which matter in practice.
A big C2 basically corresponds to the galactic algorithms, mentioned in another thread here.
>This means it will not reach its stated efficiency until the numbers have at least 2^1729^12 bits (at least 10^10^38 digits), which is vastly larger than the number of atoms in the known universe.
"Vastly larger" is one of the biggest understatements I've ever heard. It's roughly 10^10^38 times larger.
I can't think of a better way to write that line. I just found it funny.
> This means it will not reach its stated efficiency until the numbers have at least 2^1729^12 bits (at least 10^10^38 digits), which is larger than the number of atoms in the known universe.
This version might be better? If you like understatement, that is.
"For millennia it was widely assumed that there was no faster way to multiply. Then in 1960, the 23-year-old Russian mathematician Anatoly Karatsuba... found it."
I have a bizarre mental tic wherein I will often read a story like this and then fantasize about going back in time to just before the discovery, and present it myself, in essence stealing it. I'll get quite far along in my daydream before the absurdity collapses the narrative.
My fantasies are a bit more charitable: Taking long dead people for a tour of the present, to see the consequences of their discoveries.
As a kid I used it as a way of analyzing the ramifications of singular ideas, or understanding the complex phenomena that interplay in technology. E.g., I would break down how a computer works (from the plug to the screen) to Benjamin Franklin, who would appreciate why you'd want to arrange silicon to do the thinking for you.
In adulthood they've quietly settled into the occasional "Man, I wish Kepler could play Kerbal Space Program".
That's really funny. I love the weird fantasies we cook up.
I have a variation of this, where I travel back in time to before some long dead person did what made them famous and tell them all about their discoveries and all the wonders that came from it.
It's usually accompanied with a hint that they won't remember our conversation, so it won't affect history, but they'll enjoy marveling at it for a short while.
I do the same, but for things that aren't yet discovered. I try to imagine what the solution might look like, how it would be presented, what the implications of its creation will be, etc.
I really love, for example, the film 'Travelling Salesman' [1]. It explores a possible narrative for discovering the solution to P = NP.
One I like to think about is general human-like AI. How do I test it works correctly? How long did it take to create? What algorithms does it make use of? What are the implications in my social circles? What is the wider impact? What happens if you refuse to release the code?
One film I like that explores some ideas is 'Ex Machina' [2]. In my opinion the AI created in the film is intelligent, but lacks empathy - something I think we should seriously consider in our attempts to create AI.
Probably worth noting in this context, I recently started reading a book called "How to Invent Everything: A Survival Guide for the Stranded Time Traveler" which, basically, says: "If you happen to be in this timeframe, invent this trivial thing. It was invented that many years later by this person, but of course you can keep all the fame to yourself".
I lack the imagination to go back with anything more than some information that will make me rich: lottery numbers, usually, and only back to the previous Saturday.
For anyone else just looking for an outline of the new algorithm: last two paragraphs of p4, first two paragraphs of p5.
Coming from a position of a few abstract algebra classes in college many years ago, all the words and notation are familiar but I am a long way from being able to follow it.
I think I'm misunderstanding something about the article. In what way is this method perfect, as is mentioned multiple times? Is it not just an incremental improvement over standard practice?
In computational complexity, we have some proofs that a certain cost is the absolute bare minimum that an optimal solution must expend. We have the same for some mechanical systems too.
If you get the order down to the proven limit, especially for the first time, people sometimes use the word 'perfect', even if there are potentially other solutions with lower constant overhead, or lower practical overhead (see also Timsort, which uses insertion sort for leaf operations).
> Second, in that same paper Schönhage and Strassen conjectured that there should be an even faster algorithm than the one they found — a method that needs only n × log n single-digit operations — and that such an algorithm would be the fastest possible.
Is this saying that there is a proof in their paper that n log n is the cheapest? If so, it certainly beats around the bush.
A conjecture is a proposition that is suspected to be true. So there would be no proof in the paper, but likely some arguments why the authors expect it to be true.
"Perfect" in this case is a qualitative exaggeration bordering on hyperbole.
> At the end of February, a team of computer scientists at Aarhus University posted a paper arguing that if another unproven conjecture is also true, this is indeed the fastest way multiplication can be done.
You don't know what nobody thinks, because there may be a person who thinks just that. This is an active area of research. Anything quicker than O(n log n) but slower than O(n) is still possible.
It's perfect in the sense that its running time is O(n log n), and there is evidence that this is the fastest running time possible (at least up to a constant factor).
I would say most mathematical conjectures are stronger than most scientific theories. What I mean is that the conjectures are easily falsifiable (i.e., counterexamples, if found, can easily be verified) and have been shown to be true for an enormous number of cases. So unproven mathematical conjectures are not just "wild hypotheses."
According to the article, it rests on a plausible conjecture. I need to read further.
[edit]
According to the abstract of the paper referred to in the article (https://arxiv.org/abs/1902.10935) it rests on a "central conjecture in the area of network coding". Make of that what you will.
Ah but that's not what this game is about. It's about what you can prove. And conjectures are by definition things you can't prove yet. So the longer they stand and the more paths they block the more frustrating it gets.
>In this work, we prove that if a central conjecture in the area of network coding is true, then any constant degree boolean circuit for multiplication must have size Ω(nlgn), thus almost completely settling the complexity of multiplication circuits
So, given an unproven assumption and also given that we are constrained to doing multiplication on a boolean circuit, this may be "perfect" in a sense.
It achieves a theoretical lower bound on the complexity. This is what the article says in like 1000 words because it's written for mathematical illiterates.
> Harvey and van der Hoeven’s algorithm proves that multiplication can be done in n × log n steps. However, it doesn’t prove that there’s no faster way to do it. Establishing that this is the best possible approach is much more difficult. At the end of February, a team of computer scientists at Aarhus University posted a paper arguing that if another unproven conjecture is also true, this is indeed the fastest way multiplication can be done.
Gotta admit, I started chuckling at “We use [the fast Fourier transform] in a much more violent way..." I've known some math geeks in my day, and this sounded like them.
> Two decades ago, computers performed addition much faster than multiplication. The speed gap between multiplication and addition has narrowed considerably over the past 20 years to the point where multiplication can be even faster than addition in some chip architectures. With some hardware, “you could actually do addition faster by telling the computer to do a multiplication problem, which is just insane,” Harvey said.
Is this true for popular architectures such as Intel and ARM?
An Intel Haswell chip, when doing floating point vector operations, can dispatch one addition per cycle, or two fused-multiply-add. The latency is worse for the FMA (5 cycles versus 3 cycles), but if you've got enough parallel accumulations to hide that latency, then using FMAs for addition will be faster than addition.
This is a consequence of Intel optimising for FLOPs (or - equivalently - for matrix multiplication), as an FMA counts as two FLOPs, versus addition counting as just one.
Well, the algorithm for floating-point addition is very complicated, so regardless of whether they were gaming the FLOPs, you would expect floating-point multiplication to be at least as fast as floating-point addition on most architectures.
> the algorithm for floating point addition is very complicated
Ah yes, of course. I think it is slightly misleading that the article is mostly about large integer multiplication, while that remark holds for floating-point operations.
Is that just because multiplication is so commonly needed that there is double hardware allocated for FMA?
Like, if I had a special purpose adding machine and a special purpose multiply machine, and I bought a second multiply machine, then I could multiply large numbers faster than adding, but that's not insane, that's resource allocation.
0. Extract mantissas and exponents from the inputs [basically free in a HW implementation].
1. Compare the two exponents and shift the mantissas so that they share the same scale.
2. Add the two mantissas together.
3. Normalize the result back into a floating-point number.
Now look at the process for a floating-point multiply operation:
0. Extract mantissas and exponents from the inputs [basically free in a HW implementation].
1. Add the two exponents together.
2. Multiply the two mantissas together.
3. Normalize the result back into a floating-point number.
Steps 0 and 3 are identical in both cases. I'm not really knowledgeable about hardware multiplier implementations, but doing an additional "add" operation can be pretty close to free in some implementations. In any case, adding in an extra adder to the process is going to be quite cheap, even if not free.
What this means is that there is not much extra hardware to turn a FMUL ALU op into an FMA ALU op... and if you make just a single FMA ALU op, you can use that hardware to implement FADD, FMUL, and FMA with nothing more than microcode.
In other words, instead of thinking of FMA as something worth pouring more resources into, think of it as FADD as not worth enough to be made its own independent unit.
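The two pipelines sketched above can be mimicked in software. Here is a toy sketch using Python's math.frexp/math.ldexp to stand in for the hardware's mantissa/exponent extraction and normalization (real hardware works on raw bit fields and handles rounding, infinities, and NaNs, all ignored here):

```python
import math

def fp_add(a, b):
    # 0. Extract mantissas and exponents (free in hardware; frexp here).
    ma, ea = math.frexp(a)   # a == ma * 2**ea, with 0.5 <= |ma| < 1
    mb, eb = math.frexp(b)
    # 1. Shift the smaller operand's mantissa so both share one scale.
    if ea < eb:
        ma, ea, mb, eb = mb, eb, ma, ea
    mb = mb / (2 ** (ea - eb))
    # 2. Add the aligned mantissas.
    # 3. Normalize the result back into a floating-point number.
    return math.ldexp(ma + mb, ea)

def fp_mul(a, b):
    # 0. Extract mantissas and exponents.
    ma, ea = math.frexp(a)
    mb, eb = math.frexp(b)
    # 1. Add the exponents.  2. Multiply the mantissas.  3. Normalize.
    return math.ldexp(ma * mb, ea + eb)

print(fp_add(1.5, 0.375))  # 1.875
print(fp_mul(1.5, 0.375))  # 0.5625
```

The alignment shift in step 1 of the add is exactly the extra work that makes a standalone FADD unit less of a bargain than it first appears.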
Many important numerical algorithms like dot products, matrix multiplication, FFT can be implemented in terms of FMA. So yes, it makes sense for processor designers to devote resources to FMA units.
Also, FMA units can be used for pure adds or multiplies as well. For instruction sets that have separate instructions for addition and multiplication in addition to FMA (like x86), I'm not sure whether they are all executed on the same functional units or on separate ones. And if on the same units, what's the limitation preventing more than one add per cycle vs. two FMAs, as on Haswell?
An integer multiplication always needs a "carry propagate add" step at the end. Consequently, addition is always faster because that's the final step of a multiplication. (Floating point is a little different, but not significantly so).
IF you have a chain of additions AND the multiplier has a way to forward the "carry save" representation (ie. the step right before the carry propagate add) AND your carry propagate add takes longer than a single clock cycle, you could theoretically get better performance by feeding the adds into the multiplier accumulation matrix (which is just a big matrix for doing a whole bunch of adds simultaneously).
That's a lot of "and" and "if". Nobody does it that way. The best you get is "fused multiply add".
Not really. AFAIK integer multiplication still usually has lower per-cycle instruction throughput than addition, and obviously even if it were faster, that would only apply to 32-bit, 64-bit, etc. integers, not big-integer multiplication.
This new method will be marginally faster if you ever need to multiply two numbers on a scale where writing down the bits fills every Planck-length cube in the known universe.
Otherwise the “practical” benefit is limited to saving a few symbols when writing abstract math papers.
> This new method will be marginally faster if you ever need to multiply two numbers on a scale where writing down the bits fills every Planck-length cube in the known universe.
If we're talking about physical implementations, then you would also have to take the delay between those Planck volumes into account.
From last time this came up, the number of bits in the numbers involved is 2^(9^12). Storing a count of the number of bits takes about 930 billion digits. Actually storing the bits or digits of the numbers per se takes a whole lot more, to say the least.
Yes, this result will be used to save a few printed symbols each time the topic comes up in future mathematics papers, and the beauty of the slightly simpler mathematical expressions will spark joy in mathematicians reading them.
Karatsuba is a beautiful algorithm and as good as it gets for "small" numbers (on the order of a few thousand bits or less). But when you get bigger than that you need the FFT methods, which are impractical for "small" numbers. So the two algorithms complement each other nicely.
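For reference, a minimal recursive Karatsuba sketch (splitting on decimal digits for readability; real implementations split on machine words and fall back to schoolbook multiplication below a tuned threshold):

```python
def karatsuba(x, y):
    # Base case: small enough for the hardware multiplier.
    if x < 10 or y < 10:
        return x * y
    # Split both numbers at half the digit count of the larger one.
    m = max(len(str(x)), len(str(y))) // 2
    high_x, low_x = divmod(x, 10 ** m)
    high_y, low_y = divmod(y, 10 ** m)
    # Karatsuba's trick: three recursive multiplications instead of four.
    z0 = karatsuba(low_x, low_y)
    z2 = karatsuba(high_x, high_y)
    z1 = karatsuba(low_x + high_x, low_y + high_y) - z0 - z2
    return z2 * 10 ** (2 * m) + z1 * 10 ** m + z0

print(karatsuba(1234, 5678))  # 7006652
```

Three multiplications of half-size numbers instead of four is what drops the exponent from 2 to log2(3) ≈ 1.585.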
Indeed—the advantage of doing it as you describe is that each new multiplication gives you a new digit that can be recorded immediately, rather than having to record the '2' from the 2 x 6 multiplication and later amend it to '5'.
> Over the past decade, mathematicians have found successively faster multiplication algorithms, each of which has inched closer to n × log n, without quite reaching it. Then last month, Harvey and van der Hoeven got there.
> Their method is a refinement of the major work that came before them. It splits up digits, uses an improved version of the fast Fourier transform, and takes advantage of other advances made over the past forty years. “We use [the fast Fourier transform] in a much more violent way, use it several times instead of a single time, and replace even more multiplications with additions and subtractions,” van der Hoeven said.
> Harvey and van der Hoeven’s algorithm proves that multiplication can be done in n × log n steps. However, it doesn’t prove that there’s no faster way to do it.
Yeah, a buddy of mine works at D-Wave. They held the world record for the longest quantum state (or maybe still do, I think it was like 15 mins or something) and I'd joke that they couldn't get past 15 mins because they'd lift the lid to check if the "quantum state was still okay".
But seriously, D-Wave is doing some pretty amazing stuff. The Lower Mainland, in general, is an awesome place to be career-wise. Just a shame about the housing situation.
This is amazing to me:
> "Their conjecture was based on a hunch that an operation as fundamental as multiplication must have a limit more elegant than n × log n × log(log n)"
Only in maths (and sometimes in its "implementation"/ expression in nature in conjunction with physical systems) can such a subjective viewpoint be so powerful.
Also blows me away, at least at first, that multiplying using an FFT would be the fastest method so far. (I guess I have an older mindset based on how slow FFTs used to be.) I wonder at how many digits, in actual implementations, the perf curves break even compared to other methods.
I completely missed this when I first looked at the link, but based on another comment, the break even point is considerably beyond the size of any number that would fit in the universe as we know it. Or any number you could write down if you had a universe for each atom in our universe, etc.
Does this apply to division too, division being multiplication by the inverse? I wonder because:
> And while the new algorithm is important theoretically, in practice it won’t change much, since it’s only marginally better than the algorithms already being used. “The best we can hope for is we’re three times faster,” van der Hoeven said. “It won’t be spectacular.”
wow, what an understatement! 3x is huge.
so if it speeds up factorization, even by 1.5x, then that's significant for RSA, isn't it?
There's a constraint that I as a civilian am missing. Could someone please put me out of my misery:
n*m (in integers) can be computed by starting with zero and adding m to it, n times. That's n operations. Errrrr, cough, assuming that the computer can deal with arbitrarily large n and m.
Numbers that exceed the machine's word size have a common representation whose length is O(lg N). So a simplistic upper bound for computing N×M would be O(min(N,M) × lg(N×M))
How much does each operation cost? Adding together two 5 digit numbers and two 5,000 digit numbers should be counted differently, don’t you agree? So perhaps we should count the number of single-digit operations required.
In that case, adding two five digit numbers will require 5 (or so) additions, and two 5,000 digit numbers will require 5,000 (or so) additions.
What about a 5,000 digit * 5,000 digit multiplication?
Just the very first “big” addition in your scheme will require 5,000 single-digit additions. But that result might now be 5,001 digits long (think of how 99+99 results in a 3 digit sum). Do it again, and if you have carries, it might be 5,002 digits long.
So you’ll have 5,000 + 5,001 + ... = 5000N + (1 + 2 + ... + N) operations. How many times does this repeat, that is, what is N? Well, however big your multiplier is! Your multiplier is around N = 10^5000. So you’ll need a total number of additions approximately equal to
5000 * 10^5000 + 1 + 2 + ... + 10^5000
And what is 1 + 2 + ... + N? It’s N(N+1)/2. So we have
5000 * 10^5000 + 10^5000*(10^5000 + 1) / 2
which we might as well estimate to be
(10^5000)^2.
That’s a _lot_ of operations. And that’s an _under_ estimate! Even grade school multiplication is a lot simpler (if you count single-digit multiplication to have a cost of 1): it only requires 5000^2 elementary operations.
The goal of these funny multiplication algorithms is to beat (number of digits)^2 single-digit operations. What the article shows is that if you have enough digits, you can do only
(number of digits) * (number of digits in the number of digits)
operations to do the multiplication, perhaps off by some fixed, constant factor.
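The comparison is easy to make concrete by tallying single-digit operations for a small example (toy estimates in the spirit of the counting above; the repeated-addition figure comes from the formula rather than from actually performing the additions, which is the whole point):

```python
def digit_ops_schoolbook(x, y):
    """Rough count of single-digit multiplies in grade-school
    multiplication: one per pair of digits."""
    return len(str(x)) * len(str(y))

def digit_ops_repeated_addition(x, y):
    """Rough count of single-digit adds when computing x*y as
    y + y + ... (x times): each of the x additions touches roughly
    as many digits as the running sum, which grows to len(str(x*y))."""
    return x * len(str(x * y))

x, y = 97, 89
print(digit_ops_schoolbook(x, y))         # 4
print(digit_ops_repeated_addition(x, y))  # 388
```

Even at two digits, repeated addition costs ~100x more; for d-digit numbers the gap is roughly 10^d versus d^2, which is why nobody multiplies by counting.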
Do it on paper using the method you learned in third grade. How many times above do you combine a single pair of digits? On the left it's 4 (0x0, 0x1, 1x0, and 1x1). Which is O(n^2), with n = 2 above.
On the right it's 9 pairwise operations for the ones column plus 9 for the tens column (but it would usually be 10 if we had any carries) for a total of 18. A lot more than 4.
Oh, but you say "the computer doesn't have to do those adds digit by digit. It can do all the digits in parallel." That's only true when the numbers fit into a register. If your numbers are not 64 bits wide but 64 thousand bits wide, the computer effectively has to do the addition digit by digit. Which is why the left side above is quicker than the right, and the O(n log n) method in the article is better still.
Call 's' the number of bits of the larger of the two numbers, 'n' and 'm', that you're trying to multiply (that is, s = max(lg(n), lg(m))).
Standard "long multiplication" works in O(s^2) time [1].
To speed things up, consider the univariate polynomial whose coefficients are the digits of each number in some base (for example 10 = p(2) where p(x) = 0 + 1*x^1 + 0*x^2 + 1*x^3). Multiplying two polynomials algebraically (h(x) = p(x)*g(x) = sum over i+j=k of p_i * g_j * x^k) and then evaluating at some point (e.g. '2') is the same as evaluating each polynomial at 's' different points, multiplying those evaluations pairwise, interpolating the resulting points back into the product polynomial, and then evaluating at the same "base" point (e.g. '2').
That is, take 's' points w_k, evaluate p(x) and g(x) at each of them, multiply each pair of evaluations together (that is, p(w_k) * g(w_k)), then interpolate back to get the product polynomial h(x). Evaluate h(x) at the original base (e.g. '2') to get the resulting number.
Choose the w_k's to have some symmetry, say the roots of unity, and you can exploit various tricks (the Fourier transform, or variants of it) to go back and forth between the coefficient representation and the 'point representation' in O(s lg(s)) time.
The bit complexity of the evaluated points, along with a host of other factors, meant that the algorithms discovered were O(s lg(s) lg(lg(s))) or nearby. I would recommend "Fundamental Problems of Algorithmic Algebra" for a good introduction [2]. The article mentions the various methods, but they go by Karatsuba [3], Schönhage and Strassen [4], among others.
I haven't read the paper, but it looks like they're exploiting some 'symmetry' of multivariate rings that are each "simple" and of the form (x^r - 1), where r divides some exact power of 2, and doing some Fourier-transform-like tricks in each of those rings.
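The evaluate / pointwise-multiply / interpolate pipeline described above can be sketched with numpy's FFT standing in for evaluation at the roots of unity (a demonstration only: the function name is mine, and floating-point rounding limits this to modest sizes, which is why real implementations use number-theoretic transforms):

```python
import numpy as np

def multiply_via_roots_of_unity(n, m, base=2):
    # Coefficient vectors of p and g: the base-`base` digits,
    # least significant first, so n == p(base) and m == g(base).
    def digits(k):
        out = []
        while k:
            k, r = divmod(k, base)
            out.append(r)
        return out or [0]
    p, g = digits(n), digits(m)
    size = len(p) + len(g) - 1
    # Evaluate p and g at the size-th roots of unity (the FFT)...
    P = np.fft.fft(p, size)
    G = np.fft.fft(g, size)
    # ...multiply pointwise, then interpolate back (the inverse FFT).
    h = np.fft.ifft(P * G).real.round().astype(int)
    # h is now the coefficient vector of h(x) = p(x) * g(x);
    # evaluating h at the base recovers the product n * m.
    return sum(int(c) * base ** k for k, c in enumerate(h))

print(multiply_via_roots_of_unity(10, 6))  # 60
```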
I can't believe this is a thing. I showed this to my grade school teacher and she told me I was doing it WRONG. I just knew it worked, so I kept using it.