
> But I prefer to avoid Unicode in the console (cf. terminal, I do not use a graphics layer plus emulator).

Why? If the alternative is garbled text, is that what you choose?



"Why?"

Why not?

"If the alternative is garbled text, is that what you choose?"

No. The alternative is not garbled text and that's not what I choose.

The alternative for me is a subset of ASCII. I choose what characters I will accept, delete the rest.

For example, something like

   tr -cd '[\12\40-\176]'

This has worked for me for several decades. Nor am I the only one who uses this approach. I once saw an HN commenter say their favourite regex was

   tr -cd '[ -~]'
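
For anyone who wants to try the two filters side by side, here is a rough sketch. The function names and the sample input are made up for illustration, and `LC_ALL=C` is an assumption added so that tr works on bytes regardless of the locale. One difference worth knowing: the first set keeps newlines (octal 12), while the second keeps only space through tilde, so it joins lines together.

```shell
# Hypothetical helper names; LC_ALL=C is an assumption so tr handles raw bytes.

# Keep newline (octal 12) and printable ASCII (octal 40-176); delete the rest.
ascii_lines() {
    LC_ALL=C tr -cd '[\12\40-\176]'
}

# Keep only space through tilde; note that this also deletes newlines.
ascii_flat() {
    LC_ALL=C tr -cd '[ -~]'
}

# Sample input: "caf" plus an accented e, an em dash, then "ok", as UTF-8 bytes.
printf 'caf\303\251 \342\200\224 ok\n' | ascii_lines
```

The non-ASCII bytes are simply dropped, so the spaces that surrounded the dash end up next to each other.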


That’s fair. Is that what links without Unicode support does? (Ignore all byte sequences it doesn’t recognize.) Also, I’d still love to know why you prefer stripping out non-ASCII characters — does this sentence become more readable to you with the em dash omitted?


It just simplifies things for me. If I can read text without Unicode, then I don't need it. It's one less variable I need to worry about. Maybe another way to look at it is cost-benefit analysis. I just don't get much benefit from Unicode in the console (I'm usually just reading text), whereas it causes problems from time to time.

I can see a dash in 7-bit ASCII. I am not going to lose the meaning of a sentence by forgoing a few Unicode characters.



