
>I would be fine having a separate Unicode string type in the standard library for those instances when you really need Unicode; this design makes the common case much simpler at the expense of making the rare case harder.

Even as a native English speaker, I'm extremely uncomfortable with the idea that we're going to make software even more difficult to internationalize than it already is by using completely separate types for ASCII/Latin1-only text and Unicode.

And it's a whole different level of Anglocentric to portray non-English languages as the "rare" case.



If you give an input box to an American, I promise you an emoji will find its way into it no matter what it's for.


So much this. Thinking that only America and the UK matter is something that was forgivable 40 years ago but not today. It’s even more bizarre because of what you point out: emojis don’t make sense if you treat strings as arrays of single-byte characters. And lastly, even if you only consider input boxes that don’t accept emojis, like names or addresses, you have to remember that America is a nation of immigrants. A lot of folks have names that aren’t going to fit in ASCII.

And this stuff actually matters! In a legal, this-will-cost-us-money kind of way! In 2019 a bank in the EU was penalised because they wouldn’t update a customer’s name to include diacritics (like á, è, ô, ü, ç). Their system couldn’t support diacritics because it was built in the 90s with an encoding invented in the 60s. Not their fault, but they were still penalised. (https://shkspr.mobi/blog/2021/10/ebcdic-is-incompatible-with...)

It is far more important that strings be UTF-8 encoded than that they be indexable like arrays. Rust gets this right and I hope future languages will too.
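
To make the trade-off concrete, here is a rough sketch of what "UTF-8 but not indexable" looks like in Rust. The helper names (byte_len, scalar_count, first_chars) are made up for illustration; only the std methods they call (str::len, chars, char_indices, get) are real. Note that a char here is a Unicode scalar value, not a user-perceived character, so this still isn't the whole story for things like combined emoji.

```rust
/// Length of the UTF-8 encoding in bytes (this is what `str::len` reports).
fn byte_len(s: &str) -> usize {
    s.len()
}

/// Number of Unicode scalar values (`char`s) in the string.
/// This walks the whole string, because UTF-8 is variable-width.
fn scalar_count(s: &str) -> usize {
    s.chars().count()
}

/// Hypothetical helper: the first `n` chars of `s` as a slice,
/// cut at a char boundary so a multi-byte sequence is never split.
fn first_chars(s: &str, n: usize) -> &str {
    match s.char_indices().nth(n) {
        // `idx` is the byte offset where char number `n` starts.
        Some((idx, _)) => &s[..idx],
        // Fewer than `n + 1` chars: the whole string qualifies.
        None => s,
    }
}
```

With these, byte_len("n\u{e9}e") is 4 while scalar_count("n\u{e9}e") is 3, and a single emoji like "\u{1F600}" is 4 bytes but 1 char. Asking for a byte range that lands inside the é, e.g. "n\u{e9}e".get(0..2), gives None rather than half a character, which is the whole point: you pay O(n) to count or skip chars, and in exchange the string is always valid UTF-8.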


"i must have indexable strings for performance reasons. oh, btw its an electron app"


Unicode has such a rich collection of symbols. I use them frequently in code comments.


such a different level, you could even call it Latincentric :)



