Is Visual Novel Machine Translation That Bad? A Sanoba Witch Comparison


NowItsAngeTime

I have read several visual novels with DeepL. As you have shown, 9 times out of 10 it is fairly accurate until you hit gender-specific pronouns and some names. You will have to do some mental editing, but it is usable enough to follow the story of the game. You can also use prior context to work out what was said or who was being talked about before the current line, and some knowledge of spoken Japanese helps as well. Like I said, overall you can get by with MTL unless the game has a lot of difficult names, words, or Japanese slang.


Machine translation has really improved a lot since the times of Atlas. Currently I use Textractor with plain Google Translate. I tried DeepL once, but the online translation has a daily limit if you don't wanna pay, so I'm not using it.

Currently reading Shin Koihime Kakumei 1, and the major problems I'm seeing are name recognition and genders. I therefore have to use a name replacement list, so Sei at least doesn't end up misinterpreted as a star :mare:. That might hurt gender recognition even more, though, since the default seems to be masculine... with a 99% feminine cast, so that part is a bit of a mess :vinty:. I also have some doubts about certain province names. Lack of context, *J-slang* and complex sentences can also be problematic, but aren't as frequent as the other stuff.
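For anyone wondering what a name replacement list amounts to in practice, here is a minimal sketch of the idea in Python. It is not how Textractor or any particular tool implements it; the `NAME_REPLACEMENTS` entries and the `apply_name_list` helper are made up for illustration. The extracted line is rewritten before it goes to the translator, so the engine sees a romanised name instead of a kanji it could read as an ordinary word.

```python
# Hypothetical name list: Japanese spelling -> preferred romanisation.
# The entries are illustrative assumptions, not taken from a real game script.
NAME_REPLACEMENTS = {
    "星": "Sei",
    "愛紗": "Aisha",
}


def apply_name_list(line: str, names: dict[str, str]) -> str:
    """Replace known names in a line before it is sent to the translator."""
    # Longer names go first so a short name cannot clobber part of a longer one.
    for source in sorted(names, key=len, reverse=True):
        line = line.replace(source, names[source])
    return line


if __name__ == "__main__":
    print(apply_name_list("星、行くぞ", NAME_REPLACEMENTS))
```

A plain substring replacement like this will also hit ordinary uses of the same kanji (an actual star, for example), so a real list would probably need some care about context.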

Some sentences are also completely skipped with only a trailing name remaining - no idea why. Maybe a communication problem with Google Translate? Sometimes it works if I reload the scene, but sometimes it just stays empty. Not sure if this can be avoided with a locally installed translator.

Anyway, while I don't think it's a full replacement for professional translations, it can help and speed them up a lot if you're open-minded enough to use it. And if you're willing to accept some limitations to be able to read your favorite untranslated VN, it's a pretty nice tool.

Edited by ChaosRaven

On 4/15/2023 at 6:49 AM, ChaosRaven said:

Machine translation has really improved a lot since the times of Atlas. Currently I use Textractor with plain Google Translate. I tried DeepL once, but the online translation has a daily limit if you don't wanna pay, so I'm not using it.

I'd recommend Sugoi Translator with the updated model installed.  It runs locally, with output as good as or better than DeepL.  You can read my more detailed write-up on the tool here.

Edited by sanahtlig

23 hours ago, sanahtlig said:

I'd recommend Sugoi Translator with the updated model installed.  It runs locally, with output as good as or better than DeepL.  You can read my more detailed write-up on the tool here.

When was it updated? I downloaded it about a year ago.

Edited by Erogamer

1 hour ago, Erogamer said:

When was it updated? I downloaded it about a year ago.

The Offline Model V4.0 is dated Dec 2022.  As far as I know, it's not the default in the current package (Sugoi Translation Toolkit 4.0), so it takes some sleuthing to find and install.

Edited by sanahtlig

On 4/18/2023 at 11:04 PM, sanahtlig said:

The Offline Model V4.0 is dated Dec 2022.  As far as I know, it's not the default in the current package (Sugoi Translation Toolkit 4.0), so it takes some sleuthing to find and install.

I have version 3 with offline DeepL. Can you tell me where to get V4?


In general, AI/ML systems require a lot of processing power and cannot easily be parameterised: whether to keep honorifics or transform them, how to set the gender of names, how to change the definitions of words, etc. Each such change requires a whole new model to be retrained. The advantage is that the approach is simple to implement and not dependent on language: you only need a set of matching phrases to train on and enough processing power. The implementer does not even need to know either language! Output is based on the training data, so it can be very close. But it is not easy to modify the model after training to suit a specific application, hence the need for entire specialised models, e.g. DeepL and Google (general text), Sugoi (JP VNTL only), etc. Also, the "hallucination" phenomenon can make outputs look very correct when they are in fact incorrect, in the absence of relevant training data.

Unfortunately, syntax-parsing MT has dropped in popularity due to AI hype, although it is very easy to modify and parameterise to adapt to any application. It does require one to know both languages in order to implement and adjust the parsing/transformation rules, but once set up right, it can give highly accurate 1:1 correspondence translation at high speed with very low processing power. In addition, when the algorithm fails to parse or to find an appropriate rule to apply, the output will be wrong or left untranslated in a very obvious way.
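To make the "fails in a very obvious way" point concrete, here is a toy sketch in Python. It is nowhere near a real syntax-parsing MT system: the three-word `LEXICON`, the single "S が O を V" rule and the pre-tokenised input are all assumptions made up for illustration. The point is only that a missing dictionary entry or a missing rule shows up as bracketed, untranslated text rather than as a fluent guess.

```python
# Hypothetical three-word lexicon; a real system would carry a full dictionary.
LEXICON = {"猫": "cat", "魚": "fish", "食べる": "eats"}


def lookup(word: str) -> str:
    # Unknown words are passed through in brackets so the gap is obvious.
    return LEXICON.get(word, f"[{word}]")


def translate_sov(tokens: list[str]) -> str:
    """Translate a pre-tokenised 'S が O を V' clause into English S-V-O order."""
    if len(tokens) != 5 or tokens[1] != "が" or tokens[3] != "を":
        # No rule matches: return the input untranslated rather than guess.
        return "[UNPARSED: " + "".join(tokens) + "]"
    subject, _, obj, _, verb = tokens
    # The only transformation rule here: reorder S-O-V into S-V-O.
    return " ".join(lookup(w) for w in (subject, verb, obj))


if __name__ == "__main__":
    print(translate_sov(["猫", "が", "魚", "を", "食べる"]))  # known words, known rule
    print(translate_sov(["犬", "が", "魚", "を", "食べる"]))  # word missing from lexicon
    print(translate_sov(["猫", "は", "寝る"]))               # no matching rule
```

Changing behaviour means editing the `LEXICON` entries or the rule in `translate_sov`, which is the kind of parameterisation described above, and it also shows why someone who knows both languages has to write those rules in the first place.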


14 hours ago, REtransInternational said:

Unfortunately, syntax-parsing MT has dropped in popularity due to AI hype, although it is very easy to modify and parameterise to adapt to any application. It does require one to know both languages in order to implement and adjust the parsing/transformation rules, but once set up right, it can give highly accurate 1:1 correspondence translation at high speed with very low processing power. In addition, when the algorithm fails to parse or to find an appropriate rule to apply, the output will be wrong or left untranslated in a very obvious way.

For those with no JP knowledge, it's probably best to use a combination of grammar and machine learning based models.  For those with the ability to check the TL accuracy, the machine learning model could be sufficient.  The processing power problem can be solved with GPU-processing.  GPUs are more than capable of handling the load.

The problem with grammar-based MTL is that it's not very accessible to the typical user.  It's less likely to mislead you, but the botched syntax takes more effort to interpret.  Most users prefer a readable TL with camouflaged mistakes over one written in a pseudo-English that has to be painstakingly deciphered.
