Hacker News | totetsu's comments

FWIW I recently was watching something that I did not realise had been auto-translated from Chinese to English. It was kind of a technical topic, but still it seemed perfectly natural. It struck me that, as much as conflict hawks and clash-of-culture theorists might want to do their best to construct an enemy, if we get past the disorientation of language barriers, then mostly people are the same. If AI translation can help with that, it's a benefit.


The Standard Chinese language was always known to be oddly syntactically close to US English. No one calls it an Indo-European language, but they sometimes feel closer together than English and French on surface levels. Japanese is not like that - even human translations between anything to/from Japanese sound translated.


Japanese can especially be tricky to machine-translate because often the subject is missing from a sentence, where it would be required in an equivalent English sentence. The machine translation tends to insert its best guess of a subject (usually "I" or "you"), which can often flip a sentence's meaning inside-out.


I'm thinking there's something deeper or wider than even that. Apparently there's something in Windows 11 right now that says "3 minutes 21 number 2"[1], down to inclusion of the number, because, you know, seconds.

I have my own essay on this matter to post on the Internet, but, to say the least, I don't think this mode of failure happens nearly as often in most other languages, if ever.

[1] https://twitter.com/inasoft_ayacy/status/1986739607237722409


There used to be a website called "Translation Party" where you would input a sentence in English and it would auto-translate it to Japanese and back to English over and over until it got to some equilibrium (where the translations were effectively the same) or hit its upper bound of like 20-ish swings back and forth.

It was a fun little tool, but I think that really drove home for me how different Japanese is from English in how it structures itself.
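
For anyone curious, the loop described above can be sketched roughly like this. This is only a guess at the idea, not how the site actually worked: `to_japanese` and `to_english` are hypothetical callables you would have to supply from whatever translation service you use, and the cap of 20 round trips is just the "20-ish" from memory.

```python
from typing import Callable, List, Tuple


def translation_party(
    text: str,
    to_japanese: Callable[[str], str],  # hypothetical: English -> Japanese
    to_english: Callable[[str], str],   # hypothetical: Japanese -> English
    max_round_trips: int = 20,          # assumed cap, not the site's real limit
) -> Tuple[List[str], bool]:
    """Round-trip a sentence until the English stops changing.

    Returns the list of English versions seen (starting with the input)
    and a flag saying whether an equilibrium was reached before the cap.
    """
    history = [text]
    current = text
    for _ in range(max_round_trips):
        back = to_english(to_japanese(current))
        history.append(back)
        if back == current:
            # The round trip reproduced its own input: equilibrium.
            return history, True
        current = back
    return history, False
```

Passing the translators in as functions keeps the sketch independent of any particular translation API, and lets you try the loop with toy stand-ins first.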


Yeah, I occasionally tell my Chinese friends that the grammar of English and Chinese is actually pretty similar. If they respond with surprise, I say, "Have you ever tried to learn Turkish?" Chinese doesn't have conjugation; but the basic sentence structure (subject-verb-object, prepositions, etc.) is similar. There are of course loads of differences; but nothing nearly so deeply structural as Turkish, where you indicate relationships between things by adding suffixes; or Japanese, which just has a completely different way of bringing up topics.


Culture is just as much a part of language as the language itself.

There is an air of arrogance in proclaiming that it is merely language barriers that are an issue. But of course it's a convenient argument for big tech forcing MTL on all of us.

But it ultimately marginalizes smaller communities and kills languages. Cultural genocide if you will.

The dangerous thing is that the current state of MTL is serviceable and even usable, but a bilingual speaker will immediately know something is off.

I have noticed this both for French and German, two languages with lots of training material. I imagine it's much much worse for smaller languages and/or communities.

As more and more content on the web is automatically translated, we will all start to talk like translated-from-English LLMs, and that is a future I'm not looking forward to.


These ones? https://github.com/mozilla-japan/translation/wiki/L10N-Guide...

Looking through that wiki, there seem to be a lot of things that ML would get wrong.


Interestingly, it looks like some of the "bad examples" are precisely the kind of things ML would produce. Others are what non-native speakers would produce (starting with "あなたは" to translate a sentence starting with "you").

I'm sure some translators were using ML before it was integrated, and those guidelines are here in particular to tell them about those problems.

Also, ML is now really good to translate between European languages, but Japanese is very different in its structure, so ML from English to Japanese is not as good. I'm sure some people who only know English/French/Spanish/German saw that ML is pretty good, and don't realize that for some other languages it just doesn't work.


> Also, ML is now really good to translate between European languages

As somebody who has to regularly bear "German" machine-translated UIs and manuals that originate in English, I can only say: No, it's not. It's atrocious.


Best one was when gedit had the option to syntax highlight for a language named “Los.”


Not a bad name, to be honest!


lmao. This is a "research team and five years" task with the current state of LLMs.


Yes, but now that's the answer to questions like "how do we deploy AI without pissing off our communities?"


I have your answer: you provide a tool for your translators, you don't unleash a bot that makes changes left and right and creates new pages.

Typically, when a new page is written in English, don't automatically generate a version in all languages. When a translator starts creating the page in their language, provide a button to pre-fill with ML translation if they want to.

And for users, you can display the English version with a message, "this is not translated in your language yet but you can read an ML version if you want".


This sounds about right, and should be incorporated everywhere.


For reference: https://xkcd.com/1425/


I edited my comment to clarify, I hope. Imagining what it could have done wrong and knowing what it did wrong are different.


Do you expect someone who has just watched a bot replace 20 years of their work, with no prior consultation or review, to now write a detailed post about how translations by the bot are not specifically wrong?

The core issue here is the way the bot was deployed. The fact that they had the poor taste to make it auto-replace articles written by their own volunteers is idiotic and disrespectful in the extreme. A new bot should work entirely in the back end, sending proposals for translations to the volunteers, who can choose to accept them or ignore them. Once the rate of acceptance is very high, for a specific individual language, then you might consider automating further.

And yes, this effort needs to be done for each language separately. Just because the bot works well in Italian doesn't in any way guarantee that it will work well in Japanese. Machine translation quality varies wildly by language; this is a well-known and obvious fact.
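
To make the per-language gating concrete, here is a minimal sketch of the bookkeeping it implies. Everything in it is an assumption for illustration: the names `ProposalStats` and `languages_ready`, the 95% threshold, and the minimum of 200 proposals are all made up, and what counts as "very high" is a judgment call for each community.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProposalStats:
    """Counts of bot proposals that volunteers accepted or ignored."""
    accepted: int = 0
    ignored: int = 0

    @property
    def total(self) -> int:
        return self.accepted + self.ignored

    @property
    def acceptance_rate(self) -> float:
        # No proposals yet means no evidence, so report 0.0.
        return self.accepted / self.total if self.total else 0.0


def languages_ready(
    stats_by_language: Dict[str, ProposalStats],
    min_rate: float = 0.95,    # assumed threshold for "very high"
    min_proposals: int = 200,  # assumed minimum sample before trusting the rate
) -> List[str]:
    """Return the languages where further automation might be considered."""
    return sorted(
        lang
        for lang, stats in stats_by_language.items()
        if stats.total >= min_proposals and stats.acceptance_rate >= min_rate
    )
```

Each language is judged only on its own numbers, so a good track record in Italian says nothing about Japanese.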


This is Mozilla as usual, arrogant and tone deaf.


So setting up Windows is just like what I studied for that Red Hat cert now.


wow this one looks like it would make a good birthday card layout. https://hrc.contentdm.oclc.org/digital/collection/p15878coll...


Rather than necessarily hateful (but not excluding that), what's happening here seems like brazen discursive manipulation for gain of political power at the expense of a minority of the population. Power in Japanese society is in large part built on calculating and self-serving behaviour, without any real integral morals or values, so politicians are seeing this stuff work overseas, and know they can get away with it too now.


Yeah, that’s a really insightful point, and you’ve kind of hit the nail on the head…


You didn't just give a compliment, you forged a symbolic bridge between islands of meaning!


yesterday it told me the "juice wasn't worth the squeeze."


I got a rock.


What happens when we become content for the pushback against AI content to become mainly AI content?


At this point it's AI discussing with AI about AI. AI is really good at this; it's much easier to keep this discourse going than to solve deep technical problems with it.


For many years I kept Tiny Travel Tracker running on my Android phone, and would periodically import it into os&m maps. It was nice to have the record of exactly all the places I had wandered to, and not also share that with Five Eyes. https://f-droid.org/en/packages/com.rareventure.gps2/


Using some pre-memorized number associations helps a lot with this, like peg systems. https://en.wikipedia.org/wiki/Mnemonic_peg_system

Personally I do something like 0 egg, 1 pen, 2 swan, 3 butt, 4 sailboat, etc., and then make a little story for the number using the interaction of those things. This is not for long-term memorisation, but just for recall in the mid term.


What are the other digit/noun combinations you use? This seems better thought out than the Wikipedia page you linked.


The combinations I use are probably not as good for you as ones you would come up with for yourself. I would suggest just drawing out the numbers 0-9, looking at the shapes, and sketching a few things under each one that look like it. If need be, rotate or flip the shape, or embellish slightly, so long as they are all distinct. They should be easy-to-imagine objects that can be used flexibly. Salience for memory can come through things that give an emotional response, so don't be shy about it.

And then use these to make simple stories, like 13250: a pen poked into the butt of a swan, who yells grumpily and throws an egg.
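
If it helps to see the lookup step written down, here is a small sketch. It only includes the five pegs mentioned earlier in this thread (0 to 4); the table name `PEGS` and the function `pegs_for` are made up for illustration, and the remaining digits are deliberately left for you to fill in with your own shapes.

```python
from typing import Dict, List

# Pegs for 0-4 as given in the thread; 5-9 are intentionally missing,
# since the advice is to invent your own.
PEGS: Dict[str, str] = {
    "0": "egg",
    "1": "pen",
    "2": "swan",
    "3": "butt",
    "4": "sailboat",
}


def pegs_for(number: str, pegs: Dict[str, str] = PEGS) -> List[str]:
    """Turn a digit string into its list of peg objects, in order."""
    words = []
    for digit in number:
        if digit not in pegs:
            raise ValueError(f"no peg defined yet for digit {digit!r}")
        words.append(pegs[digit])
    return words
```

Calling `pegs_for("1320")` gives the cast for the story in order (pen, butt, swan, egg); weaving them into something memorable is still the part you do in your head.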


Oh my god you just cured my inability to memorize the major memory system. The butt is so intuitive.


Just get the space debris section on the case https://en.wikipedia.org/wiki/Planetes

