Rpapo's Translation Assistant.

Mystes
Heaven's Blade Successor
Posts: 15932
Joined: Thu Aug 05, 2010 6:54 am

Re: Rpapo's Translation Assistant.

Post by Mystes »

rpapo wrote:Oops. Something's not working as well as before. Going to have to look into it . . . :oops:
Don't worry, take your time. XD
Kira0802

#campione at rizon for some #campione discussions~~ And other stuffs.
rpapo
I.D.S.E Humanoid Interface [LSB]
Posts: 1530
Joined: Mon Dec 21, 2009 5:15 am
Favourite Light Novel: Ahouka!
Location: Michigan, USA

Post by rpapo »

Kira0802 wrote:
rpapo wrote:Oops. Something's not working as well as before. Going to have to look into it . . . :oops:
Don't worry, take your time. XD
The problem I saw turns out to not be as serious as I thought. The parser still has its weaknesses, and the biggest one is a relatively low tolerance for less-than-perfect spelling. I had mis-transcribed ヅラ as ジラ, and it got all confused.

Post by rpapo »

I've posted an update to the program which works around the problem mentioned above. Now, if we find a sequence of katakana characters and cannot find anything matching it in the dictionary or its extensions, we gather those characters together and call it a word with an unknown meaning. Most (though not all) such words are actually borrowed from English.

By making this change, the misbehavior I was seeing goes away . . . at least for now.

This time around, strangely enough, the culprit was ツン (tsun), which was in the dictionary in hiragana format, but not in katakana.
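
For anyone curious, the katakana fallback described above could be sketched roughly as follows. This is only an illustration in Python, not the code from the actual program: the function name `tokenize`, the toy dictionary, and the greedy longest-match loop around the fallback are all assumptions made for the example.

```python
import re

# The Unicode katakana block, U+30A0..U+30FF. This range also covers the
# long-vowel mark (U+30FC) and the middle dot (U+30FB).
KATAKANA_RUN = re.compile(r"[\u30A0-\u30FF]+")


def tokenize(text, dictionary):
    """Greedy longest-match lookup with a katakana fallback.

    Returns a list of (word, meaning) pairs; meaning is None when unknown.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        # Try the longest dictionary entry that starts at pos.
        match = None
        for end in range(len(text), pos, -1):
            if text[pos:end] in dictionary:
                match = text[pos:end]
                break
        if match is not None:
            tokens.append((match, dictionary[match]))
            pos += len(match)
            continue
        # Nothing matched: gather the whole katakana run into one word
        # with an unknown meaning.
        run = KATAKANA_RUN.match(text, pos)
        if run:
            tokens.append((run.group(), None))
            pos = run.end()
            continue
        # Anything else is passed through one character at a time.
        tokens.append((text[pos], None))
        pos += 1
    return tokens


if __name__ == "__main__":
    # Toy entries only; the glosses are placeholders.
    toy = {"は": "topic particle", "つん": "placeholder gloss"}
    print(tokenize("ツンは", toy))
```

With the toy dictionary, ツン is not found (only the hiragana つん is listed), so it comes back as a single word with no meaning instead of derailing the rest of the line.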

Post by rpapo »

An update on this. I finally kicked myself hard and did some work on an interactive version of my Japanese translation assistant program. This is what it looks like now:

Image

The dictionary file must have been built first using the NIHONGO.exe program, but once you have that, the above program may be run. Once it has fully started, you type or paste the text you want to analyze into the upper entry field, then click on the Translate button. The results appear in the lower window.

This is very much a work in progress. I did what you see above for the sake of helping me with manga translations, where I don't bother to make a TXT transcript file. Most of my time, however, is spent on transcribing and translating the Golden Time books. Witness the fact that this forum topic hasn't been posted to in over six months...

Things to be done (no ETA, so don't ask for one):
(1) Provide for dictionary maintenance from the above GUI.
(2) Provide for user hints (split a word here, join these two words...).
(3) Make the parser less reliant on the dictionary having every possible verb conjugation already precomputed.
(4) Make the parser smarter.

Making it into a real translator is a whole different can of worms...
Last edited by rpapo on Sun Mar 17, 2013 12:16 pm, edited 1 time in total.
Reason: Grammar error.
didntloginD:
Astral Realm

Post by didntloginD: »

rpapo wrote:Things to be done (no ETA, so don't ask for one):
(1) Provide for dictionary maintenance from the above GUI.
(2) Provide for user hints (split a word here, join these two words...).
(3) Make the parser less reliant on the dictionary having every possible verb conjugation already precomputed.
(4) Make the parser smarter.
:lol:

My biggest question is that of overhead: is there any way for you to make it so that one doesn't have to have as much as is required right now? IIRC, one has to load the entire dictionary and index into RAM in order to access it through the program itself. That's probably the biggest "improvement" I could think of. I don't know what all the WWDIJ(?)(?)(?) (I can never remember the entire acronym) file gives you access to in terms of interaction with the dictionary they have generated. I assume that the WWDIJ(?)(?)(?) thing is what you're loading when you say to load the dictionary beforehand. (Unless this is all wrong and I'm just going on about nothing.) Either way, it'd be nice to slim down the amount of overhead some.

I sit around doing nothing nowadays too much anyways, feel free to PM me if you want to expand it / bounce ideas back and forth. I am quite interested in this thing (and the overhead taken with my WinXP 32bit system :roll: ).

Post by rpapo »

didntloginD: wrote:My biggest question is that of overhead: is there any way for you to make it so that one doesn't have to have as much as is required right now? IIRC, one has to load the entire dictionary and index into RAM in order to access it through the program itself. That's probably the biggest "improvement" I could think of. I don't know what all the WWDIJ(?)(?)(?) (I can never remember the entire acronym) file gives you access to in terms of interaction with the dictionary they have generated. I assume that the WWDIJ(?)(?)(?) thing is what you're loading when you say to load the dictionary beforehand. (Unless this is all wrong and I'm just going on about nothing.) Either way, it'd be nice to slim down the amount of overhead some.

I sit around doing nothing nowadays too much anyways, feel free to PM me if you want to expand it / bounce ideas back and forth. I am quite interested in this thing (and the overhead taken with my WinXP 32bit system :roll: ).
You are correct in remembering that the program loads the whole dictionary before getting to work. But it's actually much worse than simply loading all of WWWJDIC. That part is easy, with "only" about 160,000 entries. The problem is that my current, quite dumb parser relies on the dictionary being pre-processed with tons of verb and adjective conjugations, which swells the basic EDICT (Japanese-English dictionary) file from 13 MB to an indexed binary image that currently takes almost 1.4 GB. The dictionary expands roughly 100 times in size.

I know how I want to get around the problem, but it will take some time to do it right. Instead of having a dumb parser that relies on longest matches against a huge pre-processed dictionary (with a few minor optimizations), I need to make a smart parser that evaluates how Japanese words conjugate and relate to each other dynamically. I have two partially attempted prototypes (Analyzer, Parser2) for that in the code package I publish:

http://home.comcast.net/~rpapo/Nihongo.zip

Those prototypes are far from ready for anybody else's evaluation, though, and I haven't spent time on them in quite a while. By the time I get back to them, I may simply start yet a third new project...
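
To make the "evaluate conjugations dynamically" idea a bit more concrete, here is a minimal sketch in Python. It is not taken from Analyzer or Parser2. The rule table, the function names `candidates` and `lookup`, and the toy dictionary are all invented for illustration, and the rules only cover a few ichidan verb and i-adjective endings. A real rule set would need godan verbs, irregulars, and chained endings.

```python
# Hypothetical suffix rules: (inflected ending, dictionary-form ending).
RULES = [
    ("ました", "る"),  # ichidan polite past: 食べました -> 食べる
    ("ます", "る"),    # ichidan polite:      食べます   -> 食べる
    ("ない", "る"),    # ichidan negative:    食べない   -> 食べる
    ("た", "る"),      # ichidan plain past:  食べた     -> 食べる
    ("かった", "い"),  # i-adjective past:     高かった   -> 高い
    ("くない", "い"),  # i-adjective negative: 高くない   -> 高い
]


def candidates(surface):
    """Yield the surface form, then every base form the rules could produce."""
    yield surface
    for ending, base_ending in RULES:
        if surface.endswith(ending) and len(surface) > len(ending):
            yield surface[: -len(ending)] + base_ending


def lookup(surface, dictionary):
    """Return (dictionary form, meaning) for the first candidate found, else None."""
    for form in candidates(surface):
        if form in dictionary:
            return form, dictionary[form]
    return None


if __name__ == "__main__":
    # Only dictionary forms are stored; no conjugations are precomputed.
    toy = {"食べる": "to eat", "高い": "high; expensive"}
    for word in ["食べました", "高かった", "高くない", "走った"]:
        print(word, lookup(word, toy))
```

The point of the design is that the dictionary only has to hold the base forms, so its size stays close to that of the source file, at the cost of generating and checking a handful of candidate forms per lookup.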
Lery
I.D.S.E Humanoid Interface [LSB]
Posts: 3343
Joined: Sun Nov 11, 2012 3:23 pm
Favourite Light Novel: Ahouka!
Location: Switzerland

Post by Lery »

OMG...
This is a really nice program you've made here.

It will spare me tons of time when translating ^^''
(Because I'm terrible with kanji so far... :oops: )

I'm going to try it asap. :mrgreen:

Edit : heck, unable to get it to work: the Nihongo.exe tells me "no source file specfied" and so it doesn't build the dictionary. (On Win7x64)
Wiki user : Lery (talk)

Sysadmin, sometimes.

Post by rpapo »

Lery wrote:OMG...
This is a really nice program you've made here.

It will spare me tons of time when translating ^^''
(Because I'm terrible with kanji so far... :oops: )

I'm going to try it asap. :mrgreen:

Edit : heck, unable to get it to work: the Nihongo.exe tells me "no source file specfied" and so it doesn't build the dictionary. (On Win7x64)
In your case, do the following steps:

(1) Open a command prompt. Change (CD) to the directory you extracted my entire package to.
(2) Create a Unicode TXT file with nothing in it, and save it to that directory. Let's suppose you called it TEST.TXT.
(3) Execute "x64\release\nihongo.exe TEST.txt OUTPUT.txt 0 9999". This will take a while and consume monstrous amounts of memory (4-5 GB), but it will create a dictionary file that can be loaded quickly the next time you run.
(4) Now that the dictionary DICTIONARY.DAT has been created, you can run "x64\release\Honyaku_no_Hojo.exe"

I have not yet integrated it all into Honyaku.
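
If you would rather script steps (1) to (3), something like the following Python sketch might do. It is untested against the real program. It assumes that "Unicode TXT" means what Windows Notepad calls Unicode (UTF-16 LE with a byte-order mark), and the function names are made up for this example. The arguments "0 9999" are copied as-is from step (3).

```python
import subprocess
from pathlib import Path


def make_empty_unicode_file(path):
    """Write a text file that contains only a UTF-16 LE byte-order mark.

    Assumption: this is what the program accepts as an empty Unicode TXT file.
    """
    Path(path).write_bytes(b"\xff\xfe")


def dictionary_command(project_root, source="TEST.txt", output="OUTPUT.txt"):
    """Return (argument list, working directory) for the dictionary build."""
    root = Path(project_root).resolve()
    exe = root / "x64" / "release" / "nihongo.exe"
    return [str(exe), source, output, "0", "9999"], root


def build_dictionary(project_root):
    """Run the build from the project home directory, as step (1) requires."""
    args, root = dictionary_command(project_root)
    make_empty_unicode_file(root / args[1])
    # cwd=root is the important part: the working directory must be the
    # directory the package was extracted to.
    return subprocess.run(args, cwd=root, check=False)


if __name__ == "__main__":
    # Hypothetical extraction directory; adjust to your own.
    result = build_dictionary(r"C:\Nihongo")
    print("exit code:", result.returncode)
```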

Post by Lery »

>>C:\Nihongo\x64\release>Nihongo.exe test.txt output.txt 0 9999
Loading dictionary.
ERROR: Unable to open dictionary file 'Dictionary.dat' for reading.
Building dictionary file from EDICT.
ERROR: Unable to open source file.  Error 2:(null)
:?
Still seems to be a problem.

Post by rpapo »

Lery wrote:Still seems to be a problem.
But it looks like it got the dictionary built. Check for the DICTIONARY.DAT. It should exist, and it should be quite big. There will also be a file DICTIONARY.TXT that shows the entire dictionary in human-readable form. Beware: the file is so large that most editors cannot swallow it whole.

Post by Lery »

Doesn't look like it's the case :

>>C:\Nihongo\x64\release>dir /B
Honyaku_No_Hojo.exe
Juman.dll
Nihongo.exe
test.txt

>>C:\Nihongo\x64\release>Nihongo.exe test.txt output.txt 0 9999
Loading dictionary.
ERROR: Unable to open dictionary file 'Dictionary.dat' for reading.
Building dictionary file from EDICT.
ERROR: Unable to open source file.  Error 2:(null)

>>C:\Nihongo\x64\release>dir /B
Honyaku_No_Hojo.exe
Juman.dll
Nihongo.exe
test.txt
It seems to be unable to connect to the EDICT-thing. :?

Post by rpapo »

Run from c:\Nihongo, not from c:\Nihongo\x64\Release. The program is looking for specific files relative to the project home directory, which in your case is c:\Nihongo.

My earlier instructions stated that...

Post by Lery »

My bad, the habit... :oops:

Post by rpapo »

Likewise, when you run Honyaku, do so from c:\Nihongo.

Post by Lery »

Mhhh.

>>C:\Nihongo>x64\release\Nihongo.exe test.txt output.txt 0 9999
Loading dictionary.
ERROR: Unable to open dictionary file 'Dictionary.dat' for reading.
Building dictionary file from EDICT.
  167018 dictionary entries.
Loading additional words.
ERROR: Unable to open file 'AddedWords.txt'.
Dumping dictionary to text file.
Building word/phrase index.
  6851438 index entries.
Saving dictionary.
ERROR: Unable to delete old dictionary file 'Dictionary.dat'.
Processing document.
ERROR: Invalid source file 'test.txt'.
Is that normal? There isn't any Dictionary.dat in my folder, but a Dictionary.txt appeared. :wink:

Mhhh, Honyaku isn't working like that... So I guess the Dictionary.dat is required.
But if I create an empty file called "Dictionary.dat", it still gives the same error: "Unable to delete old dictionary file 'Dictionary.dat'."