It's slower but maybe the target audience is different? Armadillo prioritizes MA...

touisteur · on April 18, 2024

On this exact sequence, is there a LLM of choice that is really performant in this translation task? To armadillo, Eigen, Blaze or even numpy?

I have had very little success with most of the open self-hosted ones, even with my 4xA40 setup, as they either don't know the c++ libraries or generate very good-looking numpy stuff, full of horrors, simple and very very subtle bugs...

Looking for the same thing from any linear algebra library or language to cuda BTW (yes, calls to cu-blas/solver/sparse/tlass/dnn are OK), I haven't found one model able to write cuda code properly - not even kernels themselves but at least chaining library calls.

Probably doesn't exist (invoking Cunningham's Law).

itishappy · on April 18, 2024

Linear algebra routines seem like one of the worst possible use cases for current LLMs.

Large amounts of repetitive yet meaningfully detailed code. Algorithms that can (and often are) implemented using different conventions or orders of operations. Edge cases out the wazoo.

A solid start seems like it would be using LLMs to write extensive test suites which you can use to verify these new implementations.

touisteur · on April 19, 2024

Yet for me all this C++/CUDA code is a lot of boilerplate to express dense and supposedly very tired concepts. I thought LLMs were supposed to help with the boilerplate. But yeah I guess it won't work.

And yes, it's nice to build unit test and benchmark harnesses. But those were never really such time-wasters for me.

flemishgun · on April 17, 2024

Tough to say something as blanket as "it's slower"... there are lots of operations in any linear algebra library. It's not a direct comparison with other C++ linear algebra libraries, but hard to say Armadillo is slow based on benchmarks like this:

https://conradsanderson.id.au/pdfs/sanderson_curtin_armadill...