> I would be fine having a separate Unicode string type in the standard library for those instances when you really need Unicode; this design makes the common case much simpler at the expense of making the rare case harder.
Even as a native English speaker, I'm extremely uncomfortable with the idea that we're going to make software even more difficult to internationalize than it already is by using completely separate types for ASCII/Latin1-only text and Unicode.
And it's a whole different level of Anglocentric to portray non-English languages as the "rare" case.
So much this. Thinking that only America and the UK matter was forgivable 40 years ago, but not today. It's even more bizarre because of what you point out: emojis don't make sense if you treat them as single-byte arrays (see the short sketch below). And lastly, even if you only consider input boxes that don't accept emojis, like names or addresses, you have to remember that America is a nation of immigrants. A lot of folks have names that aren't going to fit in ASCII.
And this stuff actually matters! In a legal, this-will-cost-us-money kind of way! In 2019 a bank in the EU was penalised because they wouldn't update a customer's name to include diacritics (like á, è, ô, ü, ç). Their systems couldn't support the diacritics because they were built in the 90s around an encoding invented in the 60s. Not their fault, but they were still penalised. (https://shkspr.mobi/blog/2021/10/ebcdic-is-incompatible-with...)
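To make the byte-count point concrete, here's a minimal Rust sketch (Rust only because it comes up in the next comment; the name and emoji are made-up examples). Both accented letters and emoji take more than one byte in UTF-8, so any "one byte per character" assumption breaks:

    fn main() {
        // Illustrative values: an accented name and a single emoji.
        let name = "José";
        let wave = "👋";

        // chars() counts Unicode scalar values; len() counts UTF-8 bytes.
        println!("{name}: {} chars, {} bytes", name.chars().count(), name.len());
        // -> José: 4 chars, 5 bytes
        println!("{wave}: {} chars, {} bytes", wave.chars().count(), wave.len());
        // -> 👋: 1 chars, 4 bytes
    }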
It is far more important that strings be UTF-8 encoded than that they be indexable like arrays. Rust gets this right, and I hope future languages will too.
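For the curious, here's a small sketch of what "not indexable like arrays" means in Rust (the string is an arbitrary example): `str` refuses plain integer indexing and makes you say whether you want bytes, chars, or a boundary-checked slice.

    fn main() {
        let s = "naïve 🙂"; // arbitrary example string

        // let c = s[1]; // does not compile: `str` cannot be indexed by
        // an integer, because a byte offset may land in the middle of a
        // multi-byte character.

        // Instead you say what you mean: iterate scalar values together
        // with their starting byte offsets...
        for (offset, ch) in s.char_indices() {
            println!("byte {offset}: {ch}");
        }

        // ...or slice by byte range, which panics at runtime if the
        // range does not fall on character boundaries.
        assert_eq!(&s[0..2], "na"); // fine
        // let _ = &s[0..3];        // would panic: splits the 'ï'
    }

The design choice is that there is no cheap operation pretending to be "the nth character"; anything that looks O(1) on a `str` really is O(1) on bytes, and anything character-aware is explicit.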