I heard of pigz in the discussions following my interview of Yann Collet, creator of LZ4 and zstd.
If you'll excuse the plug, here is the LZ4 story:
Yann was bored and working as a project manager. So he started working on a game for his old HP 48 graphing calculator.
Eventually, this hobby led him to revolutionize the field of data compression, releasing LZ4, ZStandard, and Finite State Entropy coders.
His code ended up everywhere: in games, databases, file systems, and the Linux kernel, because Yann built the world's fastest compression algorithms. And he got started just making a fun game for a graphing calculator he'd had since high school.
https://corecursive.com/data-compression-yann-collet/
Side note: In the 1990s, everyone in my engineering school had an HP48 calculator. There was a healthy selection of pretty decent games available.
One fine day, I finished my physics exam an hour early and so opened up an enjoyable game on my calculator. 45 minutes went by, so I went up and handed in my paper. It was at that point that the professor asked, “Were you planning on leaving the second page blank?”
I had an HP48, and I had lots of fun with Bjorn Gahm's IR remote control, which could mimic the remote control signals of most common television brands, including the ones at my high school. The HP48 also had the ability to set an alarm to execute an arbitrary piece of code, e.g. turning on the TV to some channel and turning the volume up to max at some predefined time in the middle of class.
Teachers were completely stumped. Their initial suspicion was always that someone had brought a universal remote control to class, but they would painstakingly search everyone's desks and find nothing. And then, even after asking everyone to put their hands up in the air, the TV would still have a mind of its own.
Still have mine. For a while I used an emulator on my phone. Eventually I found PCalc, and it was customizable enough to recreate the parts of the HP 48G that I cared about on a day-to-day basis.
One wild thing is how many performance wins were still available compared to zlib. Pigz is parallel, but what if you just had a better way to compress and decompress than DEFLATE?
When zstd came out – and Brotli before it, to a certain extent – it was 3x faster than zlib with a slightly higher compression ratio. You'd think such performance jumps would be hard to come by in something as well explored as data compression, but we weren't as close to the efficiency frontier as it seemed.
Part of this was that zstd and Brotli were able to use compression history windows measured in megabytes rather than kilobytes, while DEFLATE maxes out at 32 KB. RAM was thousands of times more expensive in the early 90s, so a smaller history window made sense then.
That's not to minimize the clever ideas and amazing implementation work in new stuff. It's more that people were making smart decisions both then and now, more so than you might guess just from comparisons on today's hardware.
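To make the window-size point concrete, here's a hedged sketch using the zstd CLI (the file names are made up): its long-distance matching mode opts into history windows far beyond DEFLATE's fixed 32 KB.

    # opt into a 2^27-byte (~128 MiB) match window instead of DEFLATE's fixed 32 KB
    zstd --long=27 big-corpus.tar -o big-corpus.tar.zst
    # decompression accepts the same flag (required for windows above zstd's default limit)
    zstd -d --long=27 big-corpus.tar.zst -o restored.tar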
People take performance for granted. Even within gzip (and similarly .png), you can set compression level to 4 (default is 6) and get ~15-20% faster performance at the cost of ~5% larger file sizes.
No one ever tweaks that one setting even though they should; file sizes are a much smaller bottleneck than they were in the days of megabyte hard drives and dial-up modems.
If your justification for not serving up larger .png files is that not everyone has fast internet, then you should be detecting and handling that case separately, downscaling the images, or serving .jpeg instead.
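For the record, a quick sketch of that gzip tweak (the file name is a placeholder; levels run -1 through -9 and -6 is the default):

    gzip -6 -c assets.tar > assets.default.gz   # the default level
    gzip -4 -c assets.tar > assets.fast.gz      # noticeably faster, a few percent larger
    ls -l assets.default.gz assets.fast.gz      # compare the sizes on your own data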
One time I was using Topaz AI to upscale video; I spliced that tweak into their ffmpeg filter chain and took a whole day off a week-long encode. Low-hanging fruit.
In video games, long loading times for levels are a serious pain point, so game developers put a lot of effort into tuning compression algorithms to get the best wall-clock time, considering both the time to fetch content from storage and the time to decompress it.
If the target is a console, you may know exactly what hardware is there, so you can justify the effort in tuning. (It's more complex today because you have a choice of what kind of storage to use with your Xbox.) With a PC or phone, your results may vary a lot more.
Don't most games/game engines use TGA format for their textures? Those are all RLE-encoded if I'm not mistaken (which is very fast but very inefficient space-wise). Or perhaps that is just at game creation and those will get baked to some other image format for distribution?
The most important thing for modern texture compression is that the GPU supports it without ever having to decompress it, which saves VRAM and memory bandwidth. So it's usually specialized formats like ASTC.
Economies of scale come into play as well. Gzip decompression speed is also slightly better at higher levels. A one-time higher cost of compression can pay off pretty quickly when you are decompressing the data many times, or serving it to enough people.
I'm not so sure about this. Generally speaking, there will be more work done on the CPU to decompress at higher levels (e.g. 6 through 9). It is possible (although unlikely) that you will get higher decompression speed, but only if the bottleneck wasn't the CPU to begin with (e.g. network or disk).
My gut feeling is that if you are pulling down data faster than 40 megabits per second and have a CPU made within the past 7 years (possibly including mobile), you generally won't be bottlenecked by I/O.
Most compression algorithms don't take more work to decompress at higher levels, and actually perform better due to having less data to work through. Gzip consistently benchmarks better at decompression for higher levels.
It's not just about bottlenecks, but aggregate energy expenditure from millions of decompressions. On the whole, it can make a real, measurable difference. My point was only that it's not so cut and dried that a 5% file-size loss is a good trade for 20% faster compression. You'd have to benchmark and actually estimate the total number of decompressions to find the tipping point.
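If anyone wants to check this on their own data, a minimal benchmark sketch (sample.bin is a placeholder for whatever you actually serve; run it a few times to average out noise):

    for level in 1 4 6 9; do
      gzip -"$level" -c sample.bin > "sample.$level.gz"
      echo "decompressing level $level:"
      time gzip -dc "sample.$level.gz" > /dev/null
    done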
> No one ever tweaks that one setting even though they should
That entirely depends on the use case. Most people running FFmpeg do it as a one-off thing, and if those people are like me, when I rip a movie I want the highest quality and lowest size I can get, and I'm happy that the default sacrifices speed for quality and size. The processing can be slow because I'm only doing it once. If you're in the business of encoding video and do it all day every day, your calculus will be different and you won't be using the defaults regardless.
That depends on the conditions under which the energy is used. In a place where you need extra energy for cooling, yes. In my home, no. I live in a cold place, so I need heating for the better part of the year, and I heat with electricity (which might be stupid, but that's how the house was built 30 years ago). So whatever energy my computer wastes, I save on my heating bill. Computer energy is free, except during a few warm summer weeks.
What's well explored is compression ratio, where it is indeed difficult to improve and true innovations, like arithmetic coding, are rare.
Compression speed, on the other hand, is not very interesting to academics; it's more of an engineering problem. And there is plenty of work to do there, starting with things as simple as multi-threading and SIMD.
zlib and Zstandard are probably in the same complexity class, but with different constant factors, which academics don't care about but which have massive practical consequences.
I think arithmetic coding is much simpler than the way most resources describe it (the hard part is making it efficient).
Consider this example: you want to transmit two completely random variables, both of which can have 5 states. The obvious way is to concatenate two bit fields of size ceil(log2(5)), so 3+3 = 6 bits.
But alternatively, you can count the total number of states possible for both variables together, 5*5 = 25 and encode it as a single integer of size ceil(log2(25)) = 5, so both variables can be stored with just 5 bits.
So we arrive at the idea that there can be a fractional number of bits, which we often round up to the nearest integer for simplicity (or, in practice, the nearest multiple of 8 since most protocols are based on bytes).
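That packing trick is small enough to try right in the shell (a toy sketch; a and b are two values in 0..4):

    a=3; b=4
    packed=$(( a * 5 + b ))    # a single integer in 0..24, which fits in ceil(log2(25)) = 5 bits
    echo "packed=$packed"      # 19
    echo "a=$(( packed / 5 )) b=$(( packed % 5 ))"    # recovers a=3 b=4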
The other part is just assigning shorter sequences of bits to more common symbols, except that, unlike in Huffman coding, our symbols can occupy a fractional number of bits. This allows them to match the actual symbol probabilities more closely. If your data is highly repetitive, you can fit dozens of (common) symbols per bit.
The coolest part IMO is how easy it is to plug in custom models for symbol probabilities. Usually a simple counter for each symbol is enough, but you can go crazy and start predicting the next symbol based on previous ones.
> The coolest part IMO is how easy it is to plug in custom models for symbol probabilities. Usually a simple counter for each symbol is enough, but you can go crazy and start predicting the next symbol based on previous ones.
PPM-based [0] compression can work quite well for text but on its own it's not enough to unseat the Lempel-Ziv family as the top general purpose compressor.
zlib/gzip did not choose DEFLATE because it was the best algorithm. Rather it was, at the time, the only decent algorithm that could be implemented in a manner not covered by patents. (See the tragedy of LZW for why that was important.)
We're now more than two decades later, so all the important data compression patents should have expired.
Probably a lot of it is due to hardware changes since zlib was first written. Mainly the importance of cache and branch prediction, which were either less of a big deal or non-existent back then. IOW, zlib probably leaves a lot more on the table now than when it was written.
zlib is above all portable, and it runs in small memory footprints. Speed-versus-space trade-offs and platform-specific functionality all have costs associated with them.
Just listened to that episode, what a great story. The dry way he tells how he unexpectedly and almost accidentally transitioned from a project manager to a software engineer is really a treat. Thanks for your podcast!
Fastest open-source compression algorithms, that is. RAD Game Tools has proprietary ones that are faster and have better compression ratios, but since you have to pay for a license, they will never be widespread.
Interesting. Are there any benchmarks you can share? On their website they only compare decompression speed, and only against zlib and LZMA. It would be interesting to compare against LZ4's HC mode, which Unity uses.
Build or download TurboBench [1] executables for Linux and Windows from the releases [2] and run your own tests comparing Oodle, zstd, and other compressors.
I've been following RAD for a long time and I love Charles Bloom's blog. Their products are proprietary, but he also makes a lot of code public. For instance, he showed how RAD switched from arithmetic coders to Asymmetric Numeral Systems and shared code along the way.
This episode was fascinating. I had heard of LZ4 but not Zstd. It spurred me to make changes to our system at work that are reducing file sizes by as much as 25%. It’s great to have a podcast in which I learn practical stuff!
Probably the most underrated feature of zstd (likely because it's so unusual) is the ability to create a separate compression dictionary. This lets you build customized, highly efficient dictionaries that are specific to one type of data and then compress individual pieces of that data without including a full dictionary in every compressed output.
So, for example, take logfiles. You can train a dictionary on some sample log data. Then you can compress individual log rows, and all that actually gets stored is a diff against the compression dictionary (if any new entries were added) plus the compressed data. So you get very efficient compression of small pieces of data that belong to a very self-similar collection, with the option of decompressing any individual element at will. (Of course, you'd need to hold onto the original trained dictionary for both compression and decompression of any row you want to be able to decompress in the future. And you might want to retrain the dictionary every so often for slowly-changing types of data, to keep its efficiency from drifting downward over time.)
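A hedged sketch of that workflow with the zstd command line (directory and file names are made up):

    zstd --train samples/*.log -o logs.dict                   # train a dictionary from many small sample files
    zstd -D logs.dict row-000123.txt -o row-000123.zst        # compress one small element against it
    zstd -d -D logs.dict row-000123.zst -o row-000123.out     # the same dictionary is needed to decompress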
I believe Postgres already uses this under the hood for some columnar data. It wouldn't take much to index it before compressing it and just decompress it at will. Or maybe it just got added? https://devm.io/databases/postgresql-release
I do this to make extraordinarily small UDP packets for a low-latency system. I record the raw payloads, then build a dictionary from that data and share it on both sides. It reduces the per-packet overhead by keeping the dictionary out of the packets, and it does a much better job than other approaches.
I saw that zstd and Brotli both support creating custom dictionaries, but I couldn't find any tutorials showing how to do this. Perhaps you could share code?
Running `zstd --train` on a directory of sample files will output a dictionary file, and then the `-D <path/to/dictionary/file>` option, when used for either compression or decompression, will use that dictionary.
You can also check "man zstd" or google "zstd --train" for more details. The training directory must consist of many small files, each of which is an example artifact; if you want to split, say, a single log file into one file per line, you can use a bash script like this (note that I just created this with ChatGPT and eyeballed it; it looks correct but I haven't run it yet!): https://gist.github.com/pmarreck/91124e761e45d6860834eb046d6...
(Also, don't forget to set it as executable with `chmod +x split_file.bash` before you try to run it directly)
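If you'd rather not fetch the gist, a stand-in for that splitting step using the standard split tool (file and directory names are made up, and this is not the gist's contents):

    mkdir -p samples
    split -l 1 -a 6 app.log samples/line-    # writes one training file per log line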
Thank you so much. I was trying to create a dictionary last night and your comment was sent by God. You're doing the Lord's work frfr! I followed you on GitHub!
There are also parallel versions of bzip2 (pbzip2), lzip (plzip), and xz (pixz).
Depending upon the data, the non-threaded versions of these utilities can have higher performance when run with some kind of dispatcher on multiple files.
The GNU xargs utility is able to do this, and the relevant features are also in busybox.
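For example (a hedged sketch with a made-up file pattern), dispatching single-threaded gzip over many files, eight at a time:

    find . -name '*.log' -print0 | xargs -0 -P 8 -n 1 gzip -9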
GORDON BELL, ADAM! That was a great episode. The most amazing thing to me was how this guy was just messing around, a compression hobbyist, if you will - and then he is being courted by FAANG companies. He just walked into it, almost by accident.
I work in VMware Fusion on a Mac, with a Mint guest OS, and zipping these huge instances for backup would take forever with a single core. Pigz punishes all 12 cores on my Mac mini and saves me a ton of time.
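The kind of invocation I mean looks roughly like this (paths are made up; -p sets the thread count, which pigz otherwise auto-detects):

    tar -cf - ~/VMs/mint-guest | pigz -p 12 > mint-guest-backup.tar.gz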
Thanks! Part of the strength of the story is the humility of Yann. He could have presented himself as a genius who was destined for greatness but he's humble enough to look at himself as an everyman.
I loved this episode, it was very engaging to the very end, I wish there were more episodes! (I already listened to all of them so far)
Thank you for doing this podcast!
That was a great read! Very inspiring to hear about a near-middle-age person keeping the flame stoked on a little side hobby, and having it turn into something world-changing. So cool!