I'm sure this is plenty useful for less experienced people, but the "smart" hacks read a bit like:
Hack 1: Don't Use The Obviously Wrong Data Structure For Your Problem!
Hack 2: Don't Have The Computer Do Useless Stuff!
Hack 3: Don't Allocate Memory When You Don't Need To!
And now, a word from our sponsor: AI! Use AI to help AI build AI with AI, now with 15% more AI! Only with AI! Ask your doctor if AI is right for you!
It's worth pointing out that a few of them are Python-specific. Compilers can inline code, so there's usually no need to manually inline functions in most languages; that's Python being Python. The lookup scope of a function mattering for performance is also quintessentially Python being Python.
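For the curious, the scope trick looks roughly like this (a minimal sketch of my own, not from the article; the win comes from CPython resolving local names faster than globals):

```python
import math

def norms_slow(values):
    # Every iteration looks up `math` in the global scope,
    # then looks up `sqrt` as an attribute on the module.
    return [math.sqrt(v) for v in values]

def norms_fast(values, sqrt=math.sqrt):
    # Binding sqrt as a default argument makes it a local name,
    # which CPython resolves much faster inside a hot loop.
    return [sqrt(v) for v in values]
```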
The major gains in Python come from... not using Python. Essentially you have to rewrite your code around the fact that numpy and pandas are the ones really doing the work behind the curtain (e.g. aggressively vectorize, and use algorithms that vectorize well rather than "normal" ones). Hack 8 on the list hints at that.
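To make the "rewrite around vectorization" point concrete, a toy sketch of my own (not from the article): the same clip-and-sum computed with a "normal" per-element loop versus numpy doing the work in C.

```python
import numpy as np

def clip_and_sum_loop(xs, lo, hi):
    # "Normal" algorithm: one Python-level branch and add per element.
    total = 0.0
    for x in xs:
        if x < lo:
            total += lo
        elif x > hi:
            total += hi
        else:
            total += x
    return total

def clip_and_sum_vectorized(xs, lo, hi):
    # Restructured so numpy's C loops do the work: the per-element
    # branching becomes np.clip over the whole array at once.
    return float(np.clip(np.asarray(xs, dtype=float), lo, hi).sum())
```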
Not to mention that a lot of these optimizations, while sane, save on the order of milliseconds. Unless you're hitting one of these unoptimized approaches thousands or millions of times in a tight loop, you're probably not saving a substantial amount of time, energy, or computation. Premature optimization is still the root of all evil!
If you want an actual performance improvement in Python code that most people wouldn't necessarily expect: consider using regexes for even basic string parsing if you're doing a lot of it, rather than doing it by hand (e.g. splitting strings, then splitting those strings, etc.). Regexes "feel" like they should be more complicated and therefore slower or less efficient, but Python's regex engine is implemented in C, and there's a decent chance that, with a little tweaking, even simple string processing is faster with a regex. Again, it only matters in a hot loop, but still.
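As a toy illustration of what I mean (my own example, assuming records shaped like key=value;key=value):

```python
import re

PAIR_RE = re.compile(r"(\w+)=([^;]*)")  # compiled once; matching runs in C

def parse_by_splitting(line):
    # Hand-rolled: split on ";", then split each field on "=".
    # Lots of Python-level work and intermediate strings per field.
    return dict(field.split("=", 1) for field in line.split(";"))

def parse_by_regex(line):
    # One pass through the C regex engine; often wins in a hot loop.
    return dict(PAIR_RE.findall(line))
```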
“Hacks” 4-10 could easily be replaced with “use numpy.” Performance gains from doing math better in pure Python are minimal compared with numpy. It’s not unusual for the numpy version of something to end up taking 0.01x as long to run.
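Easy enough to check with timeit; a quick harness of my own (the exact ratio will depend on your machine and the array size):

```python
import timeit

setup = """
import numpy as np
xs = list(range(1_000_000))
arr = np.arange(1_000_000, dtype=np.float64)
"""
# Sum of squares: pure Python generator vs. numpy's dot product.
pure = timeit.timeit("sum(x * x for x in xs)", setup=setup, number=10)
vec = timeit.timeit("float(arr @ arr)", setup=setup, number=10)
print(f"pure: {pure:.3f}s  numpy: {vec:.3f}s  ratio: {vec / pure:.4f}")
```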
A lot of interesting math can't be done in numpy, sadly. At that point you might be better off writing the initial version in Python and translating it to something else.
A friend of mine asked me to translate some (well-written) number theory code a while back, and I got about a 250x speedup just doing a line-by-line translation from Python to Julia. But the problem was embarrassingly parallel, so I was able to slap on an extra 40x by tossing it on a big machine for a few hours, for a total of 10,000x. My friend was very surprised; he was expecting around a 10x improvement.
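The Julia part is Julia's win, but the "embarrassingly parallel" part works in plain Python too; it just means the work items share no state, so you can fan them out across cores (a generic multiprocessing sketch, not the actual number theory code):

```python
from multiprocessing import Pool

def check(n):
    # Stand-in for the per-candidate number-theory work;
    # each call is independent of every other call.
    return n, sum(d for d in range(1, n) if n % d == 0) == n  # perfect number?

if __name__ == "__main__":
    with Pool() as pool:  # one worker per CPU core by default
        for n, ok in pool.imap_unordered(check, range(2, 10_000)):
            if ok:
                print(n)  # 6, 28, 496, 8128
```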
I 'wrote' (adapted from the Rich project's example code) a simple concurrent file downloader in Python: run 'download <any number of URLs>' and it downloads each one, assuming the URL has what looks like a filename at the end or the server responds with a Content-Disposition header that contains a filename. It was very simple: spawn a thread for each file we're downloading, show a progress bar for each file, and update the progress bars as we download.
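The shape of it was roughly this (a from-memory sketch, not the original code; urllib stands in for whatever HTTP client the Rich example used, and the Content-Disposition handling is omitted):

```python
import sys
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

from rich.progress import Progress

def download(progress, url):
    # Guess a filename from the end of the URL.
    filename = url.rstrip("/").rsplit("/", 1)[-1] or "download"
    with urlopen(url) as resp:
        total = int(resp.headers.get("Content-Length", 0)) or None
        task = progress.add_task(filename, total=total)
        with open(filename, "wb") as f:
            while chunk := resp.read(32768):
                f.write(chunk)
                progress.update(task, advance=len(chunk))

if __name__ == "__main__":
    # One thread per URL; Rich renders all the bars from the main thread.
    with Progress() as progress, ThreadPoolExecutor() as pool:
        for url in sys.argv[1:]:
            pool.submit(download, progress, url)
```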
I ended up rewriting the whole thing in Rust (my first Rust project) solely because I noticed that just that simple process ("get some bytes from the network, write them to this file descriptor, update the progress bar's value") was churning my CPU, due to how expensive it was for the progress bar to update as often as it did (which wasn't often).
Because of how ridiculous that was I opted to rewrite it in another language; I considered Go, but all of the progress bar libraries in Go are mediocre at best, and I liked the idea of learning more Rust. Surprise surprise: it's faster and more efficient, and it even downloads faster, which is kind of ridiculous.
An even crazier example: a coworker was once trying to parse some giant logfile, and we ended up nerd-sniping ourselves into finding ways to speed it up (even though it finished while we were doing so). After profiling this very simple code, we found that 99% of the time spent processing each line went to parsing the date, and 99% of that was because Python's strptime is devoted to being able to parse timezones even if the input you give it doesn't include one. We played around with things like storing a hash map of "string date to Python datetime", since there were a lot of duplicates, but the fastest method was to write an awful Python extension that basically just exposed glibc's strptime, so you could bypass Python's (understandably) complex tz parsing. For the version of Python we were using, it made parsing hundreds of thousands of dates 47x faster, though now in Python 3 it's only about 17x faster? Maybe less.
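Even without the C extension, the hash-map trick is worth stealing whenever timestamps repeat a lot, and it's nearly a two-liner (a minimal sketch; the format string is just an example):

```python
from datetime import datetime
from functools import lru_cache

@lru_cache(maxsize=None)
def parse_ts(s, fmt="%d/%b/%Y:%H:%M:%S"):
    # Log files repeat timestamps constantly (many lines per second),
    # so the cache sidesteps strptime for all but the first occurrence.
    return datetime.strptime(s, fmt)
```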
I still use Python all the time, because usually the time I save by writing code quickly outweighs the time I lose to the code running slower; still, if your code is going to live a while, maybe try running it through a profiler and see what surprises you find.
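For the unfamiliar, that profiler pass is just the standard library (a minimal sketch, assuming your entry point is a main() function):

```python
import cProfile
import pstats

cProfile.run("main()", "stats.prof")  # or: python -m cProfile -o stats.prof script.py
pstats.Stats("stats.prof").sort_stats("cumulative").print_stats(10)
```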
Yes, I didn't realize anyone actually did anything numerical without numpy. I don't think I've ever imported Python's math module once. Who in their right mind is making a 1e6-long Python list?