
Please explain why an invisible zero width "character" is necessary.


If you write كلب, an Arabic word written right to left, in the middle of an English sentence, you want to preserve the order of the characters in the stream for computer processing purposes. That means the character ك must come before the ل, and after the e and the space, with respect to memory layout. When displayed, however, the order must be inverted to be legible. The solution is to have an invisible character that indicates a switch in text direction. In case you were wondering, writing text in a foreign language within your own text is very common outside English-speaking countries.
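A rough Python sketch of the memory-order point, using only the standard unicodedata module. The sample string and the helper name bidi_classes are made up for illustration; the renderer, not the string, decides how this is drawn.

```python
import unicodedata

# Logical (memory) order: "e", a space, then kaf, lam, beh.
sample = "e \u0643\u0644\u0628"

def bidi_classes(text):
    """Return the Unicode bidirectional class of each code point."""
    return [unicodedata.bidirectional(ch) for ch in text]

# Latin letters are class "L", Arabic letters are class "AL".
print(bidi_classes(sample))

# In memory, kaf still comes before lam, whatever the display does.
print(sample.index("\u0643") < sample.index("\u0644"))

# U+200F RIGHT-TO-LEFT MARK: an invisible format character ("Cf")
# that behaves like a strong right-to-left character ("R").
RLM = "\u200f"
print(unicodedata.category(RLM), unicodedata.bidirectional(RLM))
```

The display order is then computed from these classes by the bidirectional algorithm in the renderer, so the stored text never has to be reversed.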


Look I'm writing sdrawkcab (amazingly, I did it without using Unicode!). Layout is the job of your text formatting program. It's easy to fix a text editor to support right-to-left text entry.

The switch in text direction has resulted in malicious code injection attacks, as the reversed text becomes invisible. I had to change my compiler to reject those Unicode characters for that reason. It can be used in other cases to have hidden, malicious text.
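For what it's worth, a check like that can be quite small. This is only a sketch, not what any particular compiler does: the function name find_bidi_controls and the exact set of rejected code points are my own assumptions (the embedding, override, and isolate controls).

```python
# Directional formatting characters that can reorder displayed text.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
}

def find_bidi_controls(source):
    """Return (line, column, code point) for each bidi control found.

    Lines and columns are 1-based; columns count code points.
    """
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits

# A string literal hiding a right-to-left override after "ab".
print(find_bidi_controls('ab\u202ecd\nclean line\n'))
```

A tool could refuse to process any input where this list is non-empty, or only flag controls that are left unterminated at the end of a line.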

Have you checked your SQL code for invisible backwards text that injects malware?


I don't know what "sdrawkcab" means. I'm not a native English speaker, and nothing indicates that it's not a real word or that it is spelled backwards.


> Look I'm writing sdrawkcab

How would that work with Text-To-Speech output?


Good question! Two possibilities:

1. Tell the TTS program that the text is RTOL.

2. If the TTS program can speak Arabic, it can detect RTOL Arabic text.

The only purpose for RTOL English I can think of is to insert hidden text for malicious purposes.
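Possibility 2 could look roughly like this in Python: guess the direction from the first strongly directional character. The function name base_direction is hypothetical, and this is a simplification of what a real bidi implementation does.

```python
import unicodedata

def base_direction(text):
    """Guess text direction from the first strong character.

    Returns "rtl", "ltr", or None if no strong character is found.
    """
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):   # Hebrew-style or Arabic-style letters
            return "rtl"
        if bidi == "L":           # Latin and other left-to-right letters
            return "ltr"
    return None

print(base_direction("\u0643\u0644\u0628 dog"))
print(base_direction("dog \u0643\u0644\u0628"))
```

Note that it only works because the Arabic letters are stored as Arabic code points in the first place.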


How do you search for strings in the text? How do you search for half a word, as you do in autocomplete or in the search box in your browser?
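To make the problem concrete, here is a small Python sketch. The autocomplete helper is made up for illustration; the point is that a prefix typed in reading order only matches text stored in reading order.

```python
def autocomplete(prefix, words):
    """Return the words that start with the typed prefix."""
    return [w for w in words if w.startswith(prefix)]

# Stored in logical (typing) order, prefix search just works.
print(autocomplete("back", ["backwards", "forwards"]))

# Stored visually reversed, the same prefix finds nothing.
print(autocomplete("back", ["sdrawkcab", "forwards"]))

# The same holds for Arabic: kaf, lam, beh in logical order.
word = "\u0643\u0644\u0628"
typed = word[:2]                 # the user has typed kaf, lam
print(typed in "my " + word + " barks")
print(typed in "my " + word[::-1] + " barks")
```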


To prevent ligatures from forming when you need that.
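A minimal Python sketch of that, assuming U+200C ZERO WIDTH NON-JOINER is the character in question. The helper name forbid_ligature and the example word are mine; whether the ligature is actually suppressed depends on the font and the text shaper.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

def forbid_ligature(word, index):
    """Insert a ZWNJ at index so the letters on either side are not joined."""
    return word[:index] + ZWNJ + word[index:]

# Keep the two f's in "shelfful" from being drawn as an "ff" ligature.
marked = forbid_ligature("shelfful", 5)

print(len("shelfful"), len(marked))   # one extra code point
print(unicodedata.name(ZWNJ), unicodedata.category(ZWNJ))
print(marked.replace(ZWNJ, "") == "shelfful")
```

The hint travels with the string itself, so it survives copy and paste between programs that know nothing about each other's markup.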


That's the job of a typesetting language.


To mark linewrapping-breakpoints in strings.
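As a sketch of how a consumer might use such marks, assuming U+200B ZERO WIDTH SPACE as the breakpoint character: the function wrap_at_zwsp below is hypothetical, counts width in code points, and does not split pieces that are longer than the width.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE, an invisible break opportunity

def wrap_at_zwsp(text, width):
    """Greedily pack ZWSP-separated pieces into lines of at most width."""
    lines, current = [], ""
    for piece in text.split(ZWSP):
        if current and len(current) + len(piece) > width:
            lines.append(current)   # line is full, start a new one
            current = piece
        else:
            current += piece        # piece still fits on this line
    if current:
        lines.append(current)
    return lines

long_word = "super" + ZWSP + "cali" + ZWSP + "fragilistic"
print(wrap_at_zwsp(long_word, 9))
print(wrap_at_zwsp(long_word, 8))
```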


Leave typesetting to a proper typesetting language, like LaTeX.


And how do you call into the typesetting language? Slugging around byte-arrays?



