Call 's' the number of bits of the maximum of the two numbers, 'n' and 'm', that you're trying to multiply (that is s = max(lg(n), lg(m))).
Standard "long multiplication" works in O(s^2) time [1].
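To make the O(s^2) concrete, here's a sketch of grade-school long multiplication on little-endian digit lists (my own illustration, not from the article): every digit of one number meets every digit of the other, so len(a)*len(b) digit products get computed.

```python
def long_multiply(a_digits, b_digits, base=2):
    """Grade-school multiplication of little-endian digit lists.
    Each of the len(a)*len(b) digit products is computed once: O(s^2)."""
    result = [0] * (len(a_digits) + len(b_digits))
    for i, a in enumerate(a_digits):
        carry = 0
        for j, b in enumerate(b_digits):
            total = result[i + j] + a * b + carry
            result[i + j] = total % base
            carry = total // base
        result[i + len(b_digits)] += carry
    return result

def to_int(digits, base=2):
    """Evaluate a little-endian digit list at the base."""
    return sum(d * base**k for k, d in enumerate(digits))

# 10 is [0, 1, 0, 1] in little-endian binary, 6 is [0, 1, 1]
print(to_int(long_multiply([0, 1, 0, 1], [0, 1, 1])))  # 60
```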
To speed things up, consider the univariate polynomial whose coefficients are the base-b digits of each number (for example, in base 2, 10 = p(2) where p(x) = 0 + 1*x + 0*x^2 + 1*x^3). Multiplying two polynomials algebraically (h(x) = p(x)g(x) = sum_k (sum over i+j=k of p_i g_j) x^k) and then evaluating at some point (e.g. '2') gives the same answer as evaluating each polynomial at 's' different points, multiplying the values pointwise, interpolating those products back to the product polynomial, and then evaluating that at the same "base" point (e.g. '2').
That is, take 's' points w_k (enough to determine h, whose degree is deg(p) + deg(g)), evaluate p(x) and g(x) at each of them, multiply the evaluated values together pointwise (that is, p(w_k) * g(w_k)), then interpolate back to get the product polynomial h(x). Evaluate h(x) at the original base (e.g. '2') to get the resulting number.
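The evaluate-multiply-interpolate pipeline can be sketched like this (again my own illustration; the function names, the choice of sample points 0..s-1, and the use of exact rationals via fractions.Fraction are all mine). Naive Lagrange interpolation is itself O(s^2)-ish, so this only shows correctness, not the speedup:

```python
from fractions import Fraction

def poly_eval(coeffs, x):
    """Horner evaluation of sum coeffs[k] * x^k (little-endian coeffs)."""
    acc = Fraction(0)
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def interpolate(points):
    """Lagrange interpolation: recover little-endian coefficients
    from (x, y) samples at distinct x's."""
    n = len(points)
    coeffs = [Fraction(0)] * n
    for i, (xi, yi) in enumerate(points):
        basis = [Fraction(1)]          # running product of (x - xj), j != i
        denom = Fraction(1)
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            nxt = [Fraction(0)] * (len(basis) + 1)
            for k, c in enumerate(basis):
                nxt[k + 1] += c        # contribution of the x term
                nxt[k] -= xj * c       # contribution of the -xj term
            basis = nxt
            denom *= xi - xj
        for k in range(n):
            coeffs[k] += yi * basis[k] / denom
    return coeffs

def multiply_via_points(p, g, base=2):
    """Multiply digit lists by sampling p and g, multiplying pointwise,
    and interpolating back to h = p * g."""
    s = len(p) + len(g) - 1            # enough samples to pin down h
    samples = [(x, poly_eval(p, x) * poly_eval(g, x)) for x in range(s)]
    return poly_eval(interpolate(samples), base)
```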
Choose w_k's with some symmetry, say the roots of unity, and you can exploit various tricks, either the Fourier transform or something similar, to go back and forth between the coefficient representation and the 'point representation' in O(s lg(s)) time.
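Here's a minimal radix-2 FFT sketch of that idea (my own illustration, using complex roots of unity via cmath; real implementations work modulo a prime or in other rings to dodge floating-point error). The same recursion handles both directions, evaluation at the roots of unity and interpolation back:

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT. Evaluates the polynomial with
    coefficients a at the n-th roots of unity, or interpolates back when
    invert=True (caller divides by n). len(a) must be a power of 2."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def fft_multiply(p, g, base=2):
    """Multiply little-endian digit lists via the FFT: the transforms
    cost O(s lg s) instead of s separate Horner evaluations."""
    n = 1
    while n < len(p) + len(g):
        n *= 2                          # pad to a power of 2
    P = fft([complex(c) for c in p] + [0j] * (n - len(p)))
    G = fft([complex(c) for c in g] + [0j] * (n - len(g)))
    H = fft([a * b for a, b in zip(P, G)], invert=True)
    coeffs = [round(c.real / n) for c in H]  # inverse carries a 1/n factor
    return sum(c * base**k for k, c in enumerate(coeffs))
```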
The bit complexity of the evaluated points, along with a host of other factors, meant that the algorithms discovered were O(s lg(s) lg(lg(s))) or thereabouts. I would recommend "Fundamental Problems of Algorithmic Algebra" for a good introduction [2]. The article mentions the various methods; they go by Karatsuba [3] and Schönhage-Strassen [4], among others.
I haven't read the paper, but it looks like they're exploiting some 'symmetry' of multivariate rings that are each "simple" and of the form (x^r - 1), where r divides some exact power of 2, and doing Fourier-transform-like tricks in each of those rings.
Pretty cool that they've reduced it down so much.
[1] https://en.wikipedia.org/wiki/Multiplication_algorithm#Grid_...
[2] https://www.amazon.com/Fundamental-Problems-Algorithmic-Alge...
[3] https://en.wikipedia.org/wiki/Multiplication_algorithm#Karat...
[4] https://en.wikipedia.org/wiki/Sch%C3%B6nhage%E2%80%93Strasse...