> It was really funny, Uber switched from MySQL to Postgres, and then Postgres t...

craigkerstiens · on Oct 30, 2020

Guest here. The context is Uber was on MySQL which by default doesn't respect (or at least in their version and setup at the time) case sensitivity. To MySQL craig is the same as Craig when matching.

Their app for search relied on this. The solution for them was the citext data type to create a case insensitive string - https://www.postgresql.org/docs/13/citext.html

I'm fairly familiar with Postgres, but have not heard of a collation that tackles case insensitivity without tweaking the system itself? Postgres you can tweak this at the system level, but I've not seen it in the wild and as per the docs you're now non deterministic. For MySQL it appears you can change this, but seems a surprising default for them they expected case insensitivity.

Yes, another option could have been indexing and searching on lower or upper casing, but they wanted the minimal change to their app.

There was once a conference talk they gave about their migration process from MySQL->Postgres, but I'm not immediately able to find the video online so it may have been removed.

Arnavion · on Oct 30, 2020

I'm not familiar with pg, but the choice of case sensitivity is a fundamental property of collations in other DBMSes like MSSQL. Web search does make it look like pg only got case-insenstivity-via-collation in v12 (2019), and also requires you to make your own collation with case-insensitivity since it doesn't have any built-in ones with it.

aloukissas · on Oct 30, 2020

It's been a while and don't remember the specifics, but when we implemented a POSIX-compliant filesystem at Maginatics (acq. by EMC), we used MySQL (Percona actually) to store the FS logic (inode numbers, filenames, etc) and case sensitivity was one of the FS features. Maybe Percona supports it? I do remember (at least at that time) that full utf-8 support was poor though.

aloukissas · on Oct 30, 2020

Btw, don't get me started about case-sensitive file systems - Windows got it right (case-preserving, but case-insensitive). Why should both /home/ellen/Music and /home/ellen/music exist? :)

viraptor · on Oct 30, 2020

Because once you go there, you have to answer questions which are not as easy. Does M match m? Does i match ı? Does a match あ? Does あ match ア? Does m match 𝓂? What happens when they need to coexist?

aloukissas · on Oct 30, 2020

But they're different! I see where you're coming from though (unfortunately I know all these examples too well..).

gigatexal · on Oct 30, 2020

Case insensitive search is accomplished with ILIKE no?

johannes1234321 · on Oct 31, 2020

How well does ILIKE work with indexes? In MySQL the collation (which includes case- and accent-independency options) is known to the index.

spacemanmatt · on Oct 31, 2020

You can stipulate the collation in the query if you need something non-default.

fuhrysteve · on Oct 30, 2020

Great practical note by Josh Berkus on why Uber left Posgresql. Basically: runaway table bloat because Uber had a usecase that postgres doesn't address as well as InnoDB.

https://www.postgresql.org/message-id/5797D5A1.5030009%40agl...

api · on Oct 30, 2020

The whole VACUUM paradigm is the biggest thing that bugs me about pgsql. The fact that it can actually freeze things always worries me. Can’t this happen constantly in the background like modern GCs?

spacemanmatt · on Oct 31, 2020

I think you might be pleasantly surprised to get current on how vacuum works in current editions. Vacuum used to be a bigger issue for pg.

chousuke · on Oct 30, 2020

It does, though? That's what autovacuum is, unless I'm missing something.

It's just not always enough without tuning, just like GC.

api · on Oct 31, 2020

We have had a lot of problems with frequently updated tables. Is auto tuning really that hard? Or is pg’s philosophy not to “auto” anything?

coding123 · on Oct 30, 2020

I learned that about postgres collations and case sensitivity about 6 years back, and it was within 3 months of my first heavy use of postgres. So I wonder if this person was just experiencing their first usage of postgres too? And I've used relational databases for more than 25 years.

willvarfar · on Oct 30, 2020

I read that as Uber discovered their code relied on MySQL default case insentivity. Migrating to Postgres might have been tricky as collation is a very recent addition in Postgres.

sandinmyjoints · on Oct 30, 2020

That could be. It was the "because the database just doesn’t respect case search" part that struck me -- to say that MySQL doesn't respect case search is false, and the only scenario I can think of where it can even appear to be the case is if you're new to MySQL and you've made a big assumption without even googling it. Or maybe if you're new to string comparison (and therefore programming?) in general.

dainiusse · on Oct 30, 2020

Uber once switched Postgres->MySQL because of human error and polytics for “taking remediation action”.