
People just cannot do Unicode even remotely properly. Just cannot.

𝄞 is one character, not two, and привет should be matched by \w+.

PS: there's some advanced stuff, but where are the basic [[:posix:]] character classes?
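For anyone who wants to reproduce the two complaints, here is a minimal sketch in plain JavaScript. It assumes an engine that implements the `u` flag and `\p{...}` property escapes; `codePointLength` and `unicodeWord` are names made up for this illustration, and the `[\p{L}\p{N}_]` class is only an approximation of a Unicode-aware `\w`.

```javascript
// Count code points rather than UTF-16 code units (hypothetical helper).
function codePointLength(str) {
  return Array.from(str).length;
}

const clef = "𝄞"; // U+1D11E, outside the Basic Multilingual Plane

console.log(clef.length);           // 2: UTF-16 code units (a surrogate pair)
console.log(codePointLength(clef)); // 1: code points

// Without the u flag, "." matches one code unit, so it cannot span the pair.
console.log(/^.$/.test(clef));  // false
// With the u flag, "." matches a whole code point.
console.log(/^.$/u.test(clef)); // true

// \w is ASCII-only in JavaScript, so Cyrillic letters do not match.
console.log(/^\w+$/.test("привет")); // false

// Approximate Unicode-aware stand-in for \w: letters, numbers, underscore.
const unicodeWord = /^[\p{L}\p{N}_]+$/u;
console.log(unicodeWord.test("привет")); // true
```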



Just to make it clear: it does not even support the basic Latin-1 charset correctly. Matching my family name requires manual intervention. This is sad.

It seems a very nice regex page otherwise.


Creator here - can you elaborate? What is your family name? The example in this thread ("Grüneis") matches and displays correctly in all the browsers I've tested.

Are you perhaps trying to use a RegEx feature that is not supported by JS? Currently, RegExr only supports the JS flavour of RegEx.


Forget it, I was not used to JavaScript RegEx. I just looked it up on MDN, and it really defines `\w` to be very limited. Doesn't really make it any better, but whatever.
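A quick sketch of what that limitation looks like with the name from this thread. In JavaScript, `\w` is equivalent to `[A-Za-z0-9_]`. The `latin1Word` pattern is one possible manual workaround, not anything RegExr provides, and it assumes the "ü" is stored as a single precomposed character.

```javascript
const familyName = "Grüneis";

// \w covers only [A-Za-z0-9_], so "ü" splits the match in two.
const asciiWords = familyName.match(/\w+/g);
console.log(asciiWords); // [ 'Gr', 'neis' ]

// Manual workaround (illustrative): extend the class with the Latin-1
// letters U+00C0 to U+00FF, skipping the × (U+00D7) and ÷ (U+00F7) signs.
const latin1Word = /[\wÀ-ÖØ-öø-ÿ]+/g;
console.log(familyName.match(latin1Word)); // [ 'Grüneis' ]
```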


Family name = Grüneis


It doesn't support \p{} for matching Unicode classes either. E.g. \p{Lu} matches uppercase letters (so Æ and Ö count too).
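For reference, this is roughly how `\p{Lu}` behaves in JavaScript engines that implement Unicode property escapes (ES2018 and later), where it is only recognised when the `u` flag is set. `uppercaseLetters` is a hypothetical helper name used for the example.

```javascript
// Collect every uppercase letter in a string (hypothetical helper).
// \p{Lu} needs the u flag; g makes match() return all occurrences.
function uppercaseLetters(text) {
  return text.match(/\p{Lu}/gu) || [];
}

console.log(uppercaseLetters("Æble, Öl, and ice")); // [ 'Æ', 'Ö' ]
console.log(uppercaseLetters("no capitals here"));  // []
```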


I couldn't find a way to add the /u or /s flag. Only /i, /g and /m are allowed :(
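Outside the tool, both flags can be passed through the second argument of the `RegExp` constructor, assuming the engine implements them (`u` arrived in ES2015, `s` in ES2018). A small sketch; `buildRegex` is a made-up wrapper, not part of RegExr.

```javascript
// Build a regex from a pattern string and a flags string (hypothetical helper).
function buildRegex(pattern, flags) {
  return new RegExp(pattern, flags);
}

const withoutS = buildRegex("a.b", "");
const withS = buildRegex("a.b", "s");

// Without s, "." does not match a line feed; with s (dotAll) it does.
console.log(withoutS.test("a\nb")); // false
console.log(withS.test("a\nb"));    // true

// The flags are reflected as read-only properties on the regex object.
console.log(withS.dotAll);                   // true
console.log(buildRegex(".", "u").unicode);   // true
```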


Creator here - we are currently relying on the JS RegExp API, and thus only support features of that engine, which are somewhat limited. In the future, we may support other flavours. We may also add specific errors for more common features that are not supported, as I've already done for lookbehinds.
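One way a tool built on the JS RegExp API could surface unsupported syntax is to let the engine try to compile the pattern and report the `SyntaxError` it throws. This is only a sketch of that general idea, not how RegExr actually does it; `describeUnsupported` is a hypothetical name.

```javascript
// Return null if the engine accepts the pattern and flags,
// otherwise return the engine's error message (hypothetical helper).
function describeUnsupported(pattern, flags = "") {
  try {
    new RegExp(pattern, flags);
    return null;
  } catch (err) {
    // Invalid patterns and invalid flags both raise SyntaxError.
    return err instanceof SyntaxError ? err.message : String(err);
  }
}

console.log(describeUnsupported("a+b", "g")); // null
console.log(describeUnsupported("(", ""));    // engine-specific message
console.log(describeUnsupported("a", "z"));   // engine-specific message
```

The message text varies between engines, so a tool would probably map known cases (lookbehinds, `\p{}`, unknown flags) to its own wording rather than show the raw error.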



