Except when the site in question is completely broken wrt astral codepoints.
Which is unexpectedly common, as MySQL's "utf8" can't handle codepoints outside the BMP and will just truncate text at the first astral codepoint[0]. You need MySQL 5.5.3 or later (because adding a whole new encoding in a minor version makes perfect sense) and "utf8mb4" (because why would a codec called "utf8" actually do UTF-8?). And then the regexes are probably broken because it's PHP and developers use neither Unicode mode nor properties (PCRE's "\w" will not match all Unicode letters, you need "\p{L}" for that; also note that not every codepoint that looks like a letter is one: some are classified as symbols, others as letters).
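To make the two failure modes concrete, here's a rough Python sketch (not MySQL or PHP code). It assumes the silent-truncation behaviour described above, and the helper names and the two sample codepoints (U+1D11E and U+1D539) are my own picks for illustration:

```python
import unicodedata

def is_astral(ch: str) -> bool:
    # Astral = outside the BMP, i.e. above U+FFFF (4 bytes in UTF-8).
    return ord(ch) > 0xFFFF

def truncate_at_first_astral(text: str) -> str:
    # Hypothetical helper: mimics what a 3-byte "utf8" column ends up
    # holding if the server silently cuts at the first astral codepoint.
    for i, ch in enumerate(text):
        if is_astral(ch):
            return text[:i]
    return text

def is_letter(ch: str) -> bool:
    # Approximates the PCRE property \p{L}: any general category
    # starting with "L" (Lu, Ll, Lt, Lm, Lo).
    return unicodedata.category(ch).startswith("L")

G_CLEF = "\U0001D11E"           # MUSICAL SYMBOL G CLEF, category So
DOUBLE_STRUCK_B = "\U0001D539"  # MATHEMATICAL DOUBLE-STRUCK CAPITAL B, category Lu

print(len(G_CLEF.encode("utf-8")))                    # bytes needed in UTF-8
print(truncate_at_first_astral("abc" + G_CLEF + "def"))
print(is_letter(G_CLEF), is_letter(DOUBLE_STRUCK_B))
```

Both sample codepoints are astral, but only one of them is a letter, so "outside the BMP" and "not a letter" are separate questions and need separate checks.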
MySQL is horrible for all the same reasons PHP is horrible, and this applies to Unicode too, except PHP is actually trying to fix its Unicode problems (UTF-8 is the default now, and there are moves towards adding a UString class), while MySQL isn't fixing them.