Rpapo's Translation Assistant.

This forum is for Games & Computing related discussion

Locked
rpapo
I.D.S.E Humanoid Interface [LSB]
Posts: 1530
Joined: Mon Dec 21, 2009 5:15 am
Favourite Light Novel: Ahouka!
Location: Michigan, USA

Re: Rpapo's Translation Assistant.

Post by rpapo »

Lery wrote: Is that normal? There isn't any Dictionary.dat in my folder, but a Dictionary.txt appeared. :wink:

Hmm, Honyaku is not working... So I guess the Dictionary.dat is required.

I smell a bug for new installations. I'll have a fix shortly.
User avatar
Lery
I.D.S.E Humanoid Interface [LSB]
Posts: 3343
Joined: Sun Nov 11, 2012 3:23 pm
Favourite Light Novel: Ahouka!
Location: Switzerland

Re: Rpapo's Translation Assistant.

Post by Lery »

Okay. Thank you for your reactivity. :D
Wiki user : Lery (talk)

Sysadmin, sometimes.

Post by rpapo »

Lery wrote: Okay. Thank you for your reactivity. :D

Reactivity? In normal English, it would be "Thank you for your quick response."

Anyway, I've updated the package. Please clean out your current c:\Nihongo directory and unpack the new ZIP there. Then create a Unicode text file named "TEST.TXT" with something in it: open Notepad, type something, and choose Save As with the Unicode encoding option selected.
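If you would rather script that step than click through Notepad, here is a minimal Python sketch. It assumes that Notepad's "Unicode" option means UTF-16 little-endian with a byte-order mark, and that this is the encoding the program expects. The helper name write_unicode_txt is made up for this example.

```python
def write_unicode_txt(path, text):
    """Write text as UTF-16 LE with a BOM (assumed to match Notepad's "Unicode")."""
    # newline="\r\n" makes each "\n" come out as a Windows line ending.
    with open(path, "w", encoding="utf-16-le", newline="\r\n") as f:
        f.write("\ufeff")  # byte-order mark, written explicitly
        f.write(text)

# Example call (hypothetical content):
#   write_unicode_txt("TEST.TXT", "これはテストです。\n")
```

The BOM is written by hand so that the byte order does not depend on the machine running the script.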

Then, from c:\Nihongo, execute the command "x64\release\nihongo test.txt output.txt 0 9999". Wait for it to finish. Afterwards, the files DICTIONARY.DAT and DICTIONARY.TXT should exist in c:\Nihongo. You should now be able to execute "x64\Release\Honyaku_no_Hojo.exe".

Code:

C:\Nihongo>notepad test.txt

C:\Nihongo>x64\release\nihongo test.txt output.txt 0 9999
Loading dictionary.
ERROR: Unable to open dictionary file 'Dictionary.dat' for reading.
Building dictionary file from EDICT.
  167018 dictionary entries.
Loading additional words.
Dumping dictionary to text file.
Building word/phrase index.
  6863153 index entries.
Saving dictionary.
  Dictionary save complete.
Processing document.
ERROR: Invalid source file 'test.txt'.

C:\Nihongo>dir Dictionary.*
 Volume in drive C is OS
 Volume Serial Number is 6E20-23D5

 Directory of C:\Nihongo

05/25/2013  05:41 PM     1,379,268,158 Dictionary.dat
05/25/2013  05:40 PM       450,450,332 Dictionary.txt
               2 File(s)  1,829,718,490 bytes
               0 Dir(s)  486,631,878,656 bytes free

C:\Nihongo>x64\release\Honyaku_no_Hojo.exe

Post by rpapo »

To anybody lurking here and reading all this: don't try this unless you are running Windows X64 (preferably Windows 7 X64), with at least 6GB of memory installed on your machine. Doing so with less RAM may work, but it sure won't work very well and will certainly run slowly.

Post by Lery »

It used almost 2.85 GB of RAM to build it, but it worked ^^

Code:

C:\Nihongo>x64\release\Nihongo.exe test.txt output.txt 0 9999
Loading dictionary.
ERROR: Unable to open dictionary file 'Dictionary.dat' for reading.
Building dictionary file from EDICT.
  167018 dictionary entries.
Loading additional words.
Dumping dictionary to text file.
Building word/phrase index.
  6863153 index entries.
Saving dictionary.
  Dictionary save complete.
Processing document.
ERROR: Invalid source file 'test.txt'.

And now the Honyaku application is working. :D

Thank you very much, it's a very nice program, very useful.

Post by rpapo »

In the current code, NIHONGO.EXE has the logic to create a new dictionary file if it cannot find or load a usable one. It does so by reading the EDICT file from WWWJDICT, conjugating whatever can be conjugated in order to fill the dictionary with more usable matches, and then saving the whole thing as a memory image (DICTIONARY.DAT). That file can be loaded quickly, and a normal run of either NIHONGO.EXE or HONYAKU_NO_HOJO.EXE that loads it does not consume anywhere near as much memory as building it does. Before I built that logic, the dictionary was being constructed from scratch for every run, which cost quite a bit of time, of course.
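The load-or-rebuild idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not the actual NIHONGO.EXE code: pickle stands in for the program's own memory-image format, and the names load_or_build and build are hypothetical.

```python
import pickle

def load_or_build(cache_path, build):
    """Load a cached dictionary image, rebuilding and saving it if that fails."""
    try:
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    except (OSError, EOFError, pickle.UnpicklingError):
        # Missing or unusable cache: do the slow build once, then save it
        # so that later runs can take the fast path above.
        dictionary = build()
        with open(cache_path, "wb") as f:
            pickle.dump(dictionary, f, protocol=pickle.HIGHEST_PROTOCOL)
        return dictionary
```

The first call pays for the full build; every later call with the same cache file only pays for the load.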

Anyway, the current logic is rather simple. Given a piece of Japanese text, it finds the longest match it can in the dictionary file. The longest matching text may have several dictionary entries, in which case every one of them is listed in the output. It then skips past the matched section and tries again with the following text. It does this until it runs out of text. One major exception to the processing: it favors particles under certain circumstances. Unfortunately, that means that certain words, like "にして", get parsed wrong. かもしれない is another example that gets handled incorrectly.
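For readers who want to see the shape of that loop, here is a minimal Python sketch of greedy longest-match segmentation. It is a simplified illustration, not the program's code: the particle preference is left out, the toy dictionary and its glosses are invented, and the max_len cutoff and the single-character fallback for unmatched text are assumptions.

```python
def longest_match_segments(text, dictionary, max_len=20):
    """Greedy longest-match: at each position take the longest dictionary key."""
    segments = []
    pos = 0
    while pos < len(text):
        match = None
        # Try the longest candidate first and shrink until something matches.
        for length in range(min(max_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + length]
            if candidate in dictionary:
                match = candidate
                break
        if match is None:
            # Nothing matched: emit one character with no definitions.
            segments.append((text[pos], []))
            pos += 1
        else:
            # Every definition stored for the matched text is listed.
            segments.append((match, dictionary[match]))
            pos += len(match)
    return segments

# Toy dictionary, invented for this example.
toy = {
    "猫": ["(n) cat"],
    "が": ["(prt) subject marker"],
    "魚": ["(n) fish"],
    "を": ["(prt) object marker"],
    "食べ": ["[Stem] of 食べる"],
    "食べた": ["[Plain Positive Past] of 食べる (to eat)"],
}

for word, definitions in longest_match_segments("猫が魚を食べた", toy):
    print(word, definitions)
```

Note that 食べた wins over the shorter 食べ because longer candidates are tried first.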

One hint: with the Honyaku no Hojo program, it is possible, when you see it parsing incorrectly, to insert spaces into the Japanese text to force word breaks. Unfortunately, you cannot force characters to be considered together.

One of these days, I intend to make the parser smarter, evaluating conjugations dynamically and taking into account the allowable relationships between words. Right now, I limit the number of conjugations inserted into the dictionary in order to restrict the dictionary size somewhat. Unfortunately, that means that my program does not detect certain complex conjugations.
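As a rough idea of what evaluating conjugations dynamically could look like, here is a small Python sketch that strips a known ending and looks up the resulting dictionary form. This is an assumption about one possible approach, not the planned design: the rule table, the function name deinflect, and the dictionary layout are all hypothetical, and only a handful of rules for verbs ending in む (EDICT class v5m) are included.

```python
# Hypothetical rule table:
# (conjugated ending, dictionary-form ending, label, EDICT word class).
RULES = [
    ("まなかった", "む", "Plain Negative Past", "v5m"),
    ("まない", "む", "Plain Negative (1)", "v5m"),
    ("みます", "む", "Polite Positive", "v5m"),
    ("んだ", "む", "Plain Positive Past", "v5m"),
    ("んで", "む", "Plain Conjunctive", "v5m"),
]

def deinflect(word, base_dictionary):
    """Return (dictionary form, label) pairs whose base form is in the dictionary."""
    results = []
    for ending, base_ending, label, word_class in RULES:
        if not word.endswith(ending):
            continue
        # Swap the conjugated ending for the dictionary-form ending.
        base = word[:len(word) - len(ending)] + base_ending
        entry = base_dictionary.get(base)
        # Accept the rule only if the base word exists and has the right class.
        if entry is not None and word_class in entry["classes"]:
            results.append((base, label))
    return results

# Only the unconjugated entry is stored (layout invented for this example).
base_dictionary = {
    "嚙む": {"classes": {"v5m", "vt"}, "gloss": "to bite/to chew/to gnaw"},
}

print(deinflect("嚙んだ", base_dictionary))
print(deinflect("嚙まなかった", base_dictionary))
```

With this approach the dictionary holds one entry per verb, and the cost moves into rule matching at lookup time.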

I have no intention, at this point, of making it into a full translator. It is a translator's assistant, providing a buffet of information for the translator to select from in doing his work.

Post by rpapo »

With yesterday's update, I added another file, "AddedWords.txt", which serves to extend the basic EDICT dictionary. It is a Unicode TXT file containing CSV (comma-separated-value) data arranged like this:

Code:

Word,Sound,Description,Conjugate(true/false)
"とらドラ","とらドラ","(title) Toradora"
"竹宮","たけみや","(surname) Takemiya"
"ゆゆこ","ゆゆこ","(name) Yuyuko"
"嚙む","かむ","(v5m,vt) (1) to bite/to chew/to gnaw/(2) to fumble or falter with one's words/(P)/",true
"ドス黒い","ドスぐろい","(adj-i) darkish/dusky/"

The first column is the Kanji version of a word. If there is no such thing, then it can be in kana. In most cases, a word will start with Kanji, and then be extended to a dictionary form with some hiragana. But there are other variations, like words that start with katakana, continue with kanji, and end in hiragana. Some examples are given above.

The second column is the kana pronunciation of the word. It can be in hiragana or katakana, or a combination thereof. Kanji must not be present in this column.

The third column is the word's definition. It follows the format for EDICT entries, with dictionary codes contained in parentheses. If you want to know how to set up these definitions, I refer you to the WWWJDICT web site, at http://www.csse.monash.edu.au/~jwb/cgi- ... dic.cgi?1C. The annotations are important if you are dealing with verbs or adjectives that you want to see conjugated in the dictionary.

The fourth column contains the word "true" or "false". If you specify "true" there, then the entry will be conjugated and the different conjugations will be inserted into the dictionary along with the original entry.
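To make the column layout concrete, here is a minimal Python sketch that reads lines in this format with the standard csv module. It is not the program's own reader. Two things are assumptions: that a missing fourth column means "false", and that the header row may or may not be present in the real file. For the actual file you would probably open it with encoding="utf-16", assuming that matches the Unicode TXT format.

```python
import csv

def parse_added_words(lines):
    """Parse AddedWords-style CSV lines into a list of entry dicts."""
    entries = []
    for row in csv.reader(lines):
        if len(row) < 3 or row[0] == "Word":
            continue  # skip blank/short lines and the header row, if present
        entries.append({
            "word": row[0],
            "sound": row[1],
            "description": row[2],
            # Assumption: a missing fourth column means "false".
            "conjugate": len(row) > 3 and row[3].strip().lower() == "true",
        })
    return entries

sample = '''Word,Sound,Description,Conjugate(true/false)
"竹宮","たけみや","(surname) Takemiya"
"嚙む","かむ","(v5m,vt) (1) to bite/to chew/to gnaw/(2) to fumble or falter with one's words/(P)/",true
'''.splitlines()

for entry in parse_added_words(sample):
    print(entry["word"], entry["sound"], entry["conjugate"])
```

The quoting matters: the description column contains commas, as in "(v5m,vt)", so a plain split on commas would break it.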

If you conjugate an entry, for instance the "嚙む" entry in the example above, then the dictionary will gain a list of conjugations like this:

Code:

(v5m,vt) (1) to bite/to chew/to gnaw/(2) to fumble or falter with one's words/(P)/
  嚙まず (かまず) [Negative Conjunctive] 
  嚙ませる (かませる) [Causative] 
  嚙まな (かまな) [Plain Negative (1), Stem] 
  嚙まない (かまない) [Plain Negative (1)] 
  嚙まなかった (かまなかった) [Plain Negative Past] 
  嚙まなかったら (かまなかったら) [Plain Negative Conditional] 
  嚙まなかったり (かまなかったり) [Plain Negative (1), Plain Positive Alternative] 
  嚙まなかろう (かまなかろう) [Plain Negative (1), Plain Presumptive (1)] 
  嚙まなきゃ (かまなきゃ) [Plain Negative Provisional (2)] 
  嚙まなく (かまなく) [Plain Negative (1), Adverb] 
  嚙まなくさせる (かまなくさせる) [Plain Negative (1), Causative] 
  嚙まなくて (かまなくて) [Plain Negative (1), Plain Conjunctive] 
  嚙まなくていた (かまなくていた) [Plain Negative (1), Plain Progressive Past (1)] 
  嚙まなくていて (かまなくていて) [Plain Negative (1), Plain Progressive Conjunctive (1)] 
  嚙まなくていました (かまなくていました) [Plain Negative (1), Polite Progressive Past] 
  嚙まなくています (かまなくています) [Plain Negative (1), Polite Progressive] 
  嚙まなくている (かまなくている) [Plain Negative (1), Plain Progressive] 
  嚙まなくてた (かまなくてた) [Plain Negative (1), Plain Progressive Past (2)] 
  嚙まなくてて (かまなくてて) [Plain Negative (1), Plain Progressive Conjunctive (2)] 
  嚙まなくてる (かまなくてる) [Plain Negative (1), Plain Progressive Casual] 
  嚙まなければ (かまなければ) [Plain Negative Provisional (1)] 
  嚙まなさ (かまなさ) [Plain Negative (1), Noun (from adjective)] 
  嚙まなそう (かまなそう) [Plain Negative (1), Seems (from adjective)] 
  嚙まなまして (かまなまして) [Plain Negative (1), Polite Positive Conjunctive] 
  嚙まなませんで (かまなませんで) [Plain Negative (1), Polite Negative Conjunctive] 
  嚙まぬ (かまぬ) [Plain Negative (2)] 
  嚙まれ (かまれ) [Plain Passive, Stem] 
  嚙まれさせる (かまれさせる) [Plain Passive, Causative] 
  嚙まれず (かまれず) [Plain Passive, Negative Conjunctive] 
  嚙まれた (かまれた) [Plain Passive, Plain Positive Past] 
  嚙まれた (かまれた) [Plain Passive, Plain Wish, Stem] 
  嚙まれたい (かまれたい) [Plain Passive, Plain Wish] 
  嚙まれたかった (かまれたかった) [Plain Passive, Plain Wish, Plain Positive Past] 
  嚙まれたかったら (かまれたかったら) [Plain Passive, Plain Wish, Plain Positive Conditional] 
  嚙まれたかったり (かまれたかったり) [Plain Passive, Plain Wish, Plain Positive Alternative] 
  嚙まれたかろう (かまれたかろう) [Plain Passive, Plain Wish, Plain Presumptive (1)] 
  嚙まれたく (かまれたく) [Plain Passive, Plain Wish, Adverb] 
  嚙まれたくさせる (かまれたくさせる) [Plain Passive, Plain Wish, Causative] 
  嚙まれたくて (かまれたくて) [Plain Passive, Plain Wish, Plain Conjunctive] 
  嚙まれたくていた (かまれたくていた) [Plain Passive, Plain Wish, Plain Progressive Past (1)] 
  嚙まれたくていて (かまれたくていて) [Plain Passive, Plain Wish, Plain Progressive Conjunctive (1)] 
  嚙まれたくていました (かまれたくていました) [Plain Passive, Plain Wish, Polite Progressive Past] 
  嚙まれたくています (かまれたくています) [Plain Passive, Plain Wish, Polite Progressive] 
  嚙まれたくている (かまれたくている) [Plain Passive, Plain Wish, Plain Progressive] 
  嚙まれたくてた (かまれたくてた) [Plain Passive, Plain Wish, Plain Progressive Past (2)] 
  嚙まれたくてて (かまれたくてて) [Plain Passive, Plain Wish, Plain Progressive Conjunctive (2)] 
  嚙まれたくてる (かまれたくてる) [Plain Passive, Plain Wish, Plain Progressive Casual] 
  嚙まれたくない (かまれたくない) [Plain Passive, Plain Wish, Plain Negative (1)] 
  嚙まれたさ (かまれたさ) [Plain Passive, Plain Wish, Noun (from adjective)] 
  嚙まれたそう (かまれたそう) [Plain Passive, Plain Wish, Seems (from adjective)] 
  嚙まれたまして (かまれたまして) [Plain Passive, Plain Wish, Polite Positive Conjunctive] 
  嚙まれたませんで (かまれたませんで) [Plain Passive, Plain Wish, Polite Negative Conjunctive] 
  嚙まれたら (かまれたら) [Plain Passive, Plain Positive Conditional] 
  嚙まれたり (かまれたり) [Plain Passive, Plain Positive Alternative] 
  嚙まれて (かまれて) [Plain Passive, Plain Conjunctive] 
  嚙まれていた (かまれていた) [Plain Passive, Plain Progressive Past (1)] 
  嚙まれていて (かまれていて) [Plain Passive, Plain Progressive Conjunctive (1)] 
  嚙まれていました (かまれていました) [Plain Passive, Polite Progressive Past] 
  嚙まれています (かまれています) [Plain Passive, Polite Progressive] 
  嚙まれている (かまれている) [Plain Passive, Plain Progressive] 
  嚙まれてた (かまれてた) [Plain Passive, Plain Progressive Past (2)] 
  嚙まれてて (かまれてて) [Plain Passive, Plain Progressive Conjunctive (2)] 
  嚙まれてる (かまれてる) [Plain Passive, Plain Progressive Casual] 
  嚙まれな (かまれな) [Plain Passive, Plain Negative (1), Stem] 
  嚙まれない (かまれない) [Plain Passive, Plain Negative (1)] 
  嚙まれなかった (かまれなかった) [Plain Passive, Plain Negative Past] 
  嚙まれなかったら (かまれなかったら) [Plain Passive, Plain Negative Conditional] 
  嚙まれなかったり (かまれなかったり) [Plain Passive, Plain Negative (1), Plain Positive Alternative] 
  嚙まれなかろう (かまれなかろう) [Plain Passive, Plain Negative (1), Plain Presumptive (1)] 
  嚙まれなきゃ (かまれなきゃ) [Plain Passive, Plain Negative Provisional (2)] 
  嚙まれなく (かまれなく) [Plain Passive, Plain Negative (1), Adverb] 
  嚙まれなくさせる (かまれなくさせる) [Plain Passive, Plain Negative (1), Causative] 
  嚙まれなくて (かまれなくて) [Plain Passive, Plain Negative (1), Plain Conjunctive] 
  嚙まれなくていた (かまれなくていた) [Plain Passive, Plain Negative (1), Plain Progressive Past (1)] 
  嚙まれなくていて (かまれなくていて) [Plain Passive, Plain Negative (1), Plain Progressive Conjunctive (1)] 
  嚙まれなくていました (かまれなくていました) [Plain Passive, Plain Negative (1), Polite Progressive Past] 
  嚙まれなくています (かまれなくています) [Plain Passive, Plain Negative (1), Polite Progressive] 
  嚙まれなくている (かまれなくている) [Plain Passive, Plain Negative (1), Plain Progressive] 
  嚙まれなくてた (かまれなくてた) [Plain Passive, Plain Negative (1), Plain Progressive Past (2)] 
  嚙まれなくてて (かまれなくてて) [Plain Passive, Plain Negative (1), Plain Progressive Conjunctive (2)] 
  嚙まれなくてる (かまれなくてる) [Plain Passive, Plain Negative (1), Plain Progressive Casual] 
  嚙まれなければ (かまれなければ) [Plain Passive, Plain Negative Provisional (1)] 
  嚙まれなさ (かまれなさ) [Plain Passive, Plain Negative (1), Noun (from adjective)] 
  嚙まれなそう (かまれなそう) [Plain Passive, Plain Negative (1), Seems (from adjective)] 
  嚙まれなまして (かまれなまして) [Plain Passive, Plain Negative (1), Polite Positive Conjunctive] 
  嚙まれなませんで (かまれなませんで) [Plain Passive, Plain Negative (1), Polite Negative Conjunctive] 
  嚙まれぬ (かまれぬ) [Plain Passive, Plain Negative (2)] 
  嚙まれまい (かまれまい) [Plain Passive, Plain Negative Presumptive] 
  嚙まれました (かまれました) [Plain Passive, Polite Positive Past] 
  嚙まれましたら (かまれましたら) [Plain Passive, Polite Positive Conditional] 
  嚙まれましたり (かまれましたり) [Plain Passive, Polite Positive Alternative] 
  嚙まれまして (かまれまして) [Plain Passive, Polite Positive Conjunctive] 
  嚙まれましょう (かまれましょう) [Plain Passive, Polite Presumptive] 
  嚙まれます (かまれます) [Plain Passive, Polite Positive] 
  嚙まれません (かまれません) [Plain Passive, Polite Negative] 
  嚙まれませんで (かまれませんで) [Plain Passive, Polite Negative Conjunctive] 
  嚙まれませんでした (かまれませんでした) [Plain Passive, Polite Negative Past] 
  嚙まれませんでしたら (かまれませんでしたら) [Plain Passive, Polite Negative Conditional] 
  嚙まれませんでしたり (かまれませんでしたり) [Plain Passive, Polite Negative Alternative] 
  嚙まれよ (かまれよ) [Plain Passive, Plain Imperative (2)] 
  嚙まれよう (かまれよう) [Plain Passive, Plain Presumptive (1)] 
  嚙まれられる (かまれられる) [Plain Passive, Potential (1)] 
  嚙まれる (かまれる) [Plain Passive] 
  嚙まれれば (かまれれば) [Plain Passive, Plain Positive Provisional] 
  嚙まれれる (かまれれる) [Plain Passive, Potential (2)] 
  嚙まれろ (かまれろ) [Plain Passive, Plain Imperative (1)] 
  嚙み (かみ) [Stem] 
  嚙みた (かみた) [Plain Wish, Stem] 
  嚙みたい (かみたい) [Plain Wish] 
  嚙みたかった (かみたかった) [Plain Wish, Plain Positive Past] 
  嚙みたかったら (かみたかったら) [Plain Wish, Plain Positive Conditional] 
  嚙みたかったり (かみたかったり) [Plain Wish, Plain Positive Alternative] 
  嚙みたかろう (かみたかろう) [Plain Wish, Plain Presumptive (1)] 
  嚙みたく (かみたく) [Plain Wish, Adverb] 
  嚙みたくさせる (かみたくさせる) [Plain Wish, Causative] 
  嚙みたくて (かみたくて) [Plain Wish, Plain Conjunctive] 
  嚙みたくていた (かみたくていた) [Plain Wish, Plain Progressive Past (1)] 
  嚙みたくていて (かみたくていて) [Plain Wish, Plain Progressive Conjunctive (1)] 
  嚙みたくていました (かみたくていました) [Plain Wish, Polite Progressive Past] 
  嚙みたくています (かみたくています) [Plain Wish, Polite Progressive] 
  嚙みたくている (かみたくている) [Plain Wish, Plain Progressive] 
  嚙みたくてた (かみたくてた) [Plain Wish, Plain Progressive Past (2)] 
  嚙みたくてて (かみたくてて) [Plain Wish, Plain Progressive Conjunctive (2)] 
  嚙みたくてる (かみたくてる) [Plain Wish, Plain Progressive Casual] 
  嚙みたくな (かみたくな) [Plain Wish, Plain Negative (1), Stem] 
  嚙みたくない (かみたくない) [Plain Wish, Plain Negative (1)] 
  嚙みたくなかった (かみたくなかった) [Plain Wish, Plain Negative Past] 
  嚙みたくなかったら (かみたくなかったら) [Plain Wish, Plain Negative Conditional] 
  嚙みたくなかったり (かみたくなかったり) [Plain Wish, Plain Negative (1), Plain Positive Alternative] 
  嚙みたくなかろう (かみたくなかろう) [Plain Wish, Plain Negative (1), Plain Presumptive (1)] 
  嚙みたくなく (かみたくなく) [Plain Wish, Plain Negative (1), Adverb] 
  嚙みたくなくさせる (かみたくなくさせる) [Plain Wish, Plain Negative (1), Causative] 
  嚙みたくなくて (かみたくなくて) [Plain Wish, Plain Negative (1), Plain Conjunctive] 
  嚙みたくなくていた (かみたくなくていた) [Plain Wish, Plain Negative (1), Plain Progressive Past (1)] 
  嚙みたくなくていて (かみたくなくていて) [Plain Wish, Plain Negative (1), Plain Progressive Conjunctive (1)] 
  嚙みたくなくていました (かみたくなくていました) [Plain Wish, Plain Negative (1), Polite Progressive Past] 
  嚙みたくなくています (かみたくなくています) [Plain Wish, Plain Negative (1), Polite Progressive] 
  嚙みたくなくている (かみたくなくている) [Plain Wish, Plain Negative (1), Plain Progressive] 
  嚙みたくなくてた (かみたくなくてた) [Plain Wish, Plain Negative (1), Plain Progressive Past (2)] 
  嚙みたくなくてて (かみたくなくてて) [Plain Wish, Plain Negative (1), Plain Progressive Conjunctive (2)] 
  嚙みたくなくてる (かみたくなくてる) [Plain Wish, Plain Negative (1), Plain Progressive Casual] 
  嚙みたくなさ (かみたくなさ) [Plain Wish, Plain Negative (1), Noun (from adjective)] 
  嚙みたくなそう (かみたくなそう) [Plain Wish, Plain Negative (1), Seems (from adjective)] 
  嚙みたくなまして (かみたくなまして) [Plain Wish, Plain Negative (1), Polite Positive Conjunctive] 
  嚙みたくなませんで (かみたくなませんで) [Plain Wish, Plain Negative (1), Polite Negative Conjunctive] 
  嚙みたさ (かみたさ) [Plain Wish, Noun (from adjective)] 
  嚙みたそう (かみたそう) [Plain Wish, Seems (from adjective)] 
  嚙みたまして (かみたまして) [Plain Wish, Polite Positive Conjunctive] 
  嚙みたませんで (かみたませんで) [Plain Wish, Polite Negative Conjunctive] 
  嚙みました (かみました) [Polite Positive Past] 
  嚙みましたら (かみましたら) [Polite Positive Conditional] 
  嚙みましたり (かみましたり) [Polite Positive Alternative] 
  嚙みまして (かみまして) [Polite Positive Conjunctive] 
  嚙みましょう (かみましょう) [Polite Presumptive] 
  嚙みます (かみます) [Polite Positive] 
  嚙みません (かみません) [Polite Negative] 
  嚙みませんで (かみませんで) [Polite Negative Conjunctive] 
  嚙みませんでした (かみませんでした) [Polite Negative Past] 
  嚙みませんでしたら (かみませんでしたら) [Polite Negative Conditional] 
  嚙みませんでしたり (かみませんでしたり) [Polite Negative Alternative] 
  嚙む (かむ) 
  嚙む (かむ) [Plain Positive] 
  嚙むまい (かむまい) [Plain Negative Presumptive] 
  嚙め (かめ) [Plain Imperative (1)] 
  嚙めば (かめば) [Plain Positive Provisional] 
  嚙める (かめる) [Potential (1)] 
  嚙もう (かもう) [Plain Presumptive (1)] 
  嚙んだ (かんだ) [Plain Positive Past] 
  嚙んだら (かんだら) [Plain Positive Conditional] 
  嚙んだり (かんだり) [Plain Positive Alternative] 
  嚙んで (かんで) [Plain Conjunctive] 
  嚙んでいた (かんでいた) [Plain Progressive Past (1)] 
  嚙んでいて (かんでいて) [Plain Progressive Conjunctive (1)] 
  嚙んでいました (かんでいました) [Polite Progressive Past] 
  嚙んでいます (かんでいます) [Polite Progressive] 
  嚙んでいる (かんでいる) [Plain Progressive] 
  嚙んでた (かんでた) [Plain Progressive Past (2)] 
  嚙んでて (かんでて) [Plain Progressive Conjunctive (2)] 
  嚙んでる (かんでる) [Plain Progressive Casual] 
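A small part of that list can be reproduced with a few lines of Python. This is a sketch of the general idea of expanding one entry into many, not the program's conjugation code: it covers only verbs ending in む, only thirteen of the forms, and the names V5M_ENDINGS and conjugate_v5m are invented. The labels are copied from the listing above.

```python
# Endings for a verb ending in む (EDICT class v5m), paired with the labels
# used in the listing above. Only a small subset is shown.
V5M_ENDINGS = [
    ("まない", "Plain Negative (1)"),
    ("まなかった", "Plain Negative Past"),
    ("まれる", "Plain Passive"),
    ("ませる", "Causative"),
    ("み", "Stem"),
    ("みます", "Polite Positive"),
    ("みたい", "Plain Wish"),
    ("め", "Plain Imperative (1)"),
    ("めば", "Plain Positive Provisional"),
    ("める", "Potential (1)"),
    ("もう", "Plain Presumptive (1)"),
    ("んだ", "Plain Positive Past"),
    ("んで", "Plain Conjunctive"),
]

def conjugate_v5m(word, sound):
    """Return (written form, reading, label) tuples for a verb ending in む."""
    if not (word.endswith("む") and sound.endswith("む")):
        raise ValueError("expected a verb ending in む")
    # Drop the final む from both columns, then attach each ending.
    word_root, sound_root = word[:-1], sound[:-1]
    return [(word_root + ending, sound_root + ending, label)
            for ending, label in V5M_ENDINGS]

for written, reading, label in conjugate_v5m("嚙む", "かむ"):
    print(f"  {written} ({reading}) [{label}]")
```

Chaining forms onto each other, for example passive followed by negative, is what makes the real list so much longer.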
The main usefulness of this file, however, is simply to add proper names to your working dictionary.

I haven't cleaned out "AddedWords.txt" in a while. Where possible, I have submitted new definitions to WWWJDICT, and so some of the entries currently in AddedWords.txt are now redundant.
KuroiHikari
Fish Miner
Posts: 822
Joined: Fri Apr 16, 2010 1:01 am
Favourite Light Novel:

Re: Rpapo's Translation Assistant.

Post by KuroiHikari »

Have you checked out how MeCab recognizes parts of speech? It seems to work quite well.

Post by Lery »

Hmm, very interesting!
There is huge potential, for sure.

And now I understand better the size of the dictionary file, given the number of possible conjugations it adds for every verb. :shock:

@KuroiHikari: MeCab seems to be an interesting project as well; I'll keep an eye on it too. :) I'm curious about the possible integration of the N-Best algorithm.

Post by rpapo »

Mecab looks like something worth mining for ideas. If you've looked at the code for my program, you will find that I included the JUMAN parser for generating alternative analyses for comparison. The problem with JUMAN was the relatively limited dictionary compared to using EDICT.

Anyway, I continue to try and concentrate on building up my mental parser rather than working a lot on my computer parser. I have translation work to do, and people awaiting the results, both here at Baka-Tsuki, and over in the scanlation world too.

Post by rpapo »

Lery wrote: And now I understand better the size of the dictionary file, given the number of possible conjugations it adds for every verb. :shock:

You did notice in the earlier messages that it went from 167,018 basic dictionary entries (EDICT + AddedWords.txt) to 6,863,153 total entries in the expanded dictionary . . .

Yes, that's why it is so big. Which is why I eventually want to go to dynamically evaluating the conjugations. Not only does the dictionary get smaller, but we actually get to check more extended conjugation possibilities as well.
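To put rough numbers on that, here is a back-of-the-envelope calculation in Python using the figures from the console output and directory listing earlier in the thread. Treating the index-entry count as the size of the expanded dictionary, and dividing the file size evenly by it, are simplifying assumptions.

```python
base_entries = 167_018        # "dictionary entries" reported while reading EDICT
expanded_entries = 6_863_153  # "index entries" reported after expansion
dat_bytes = 1_379_268_158     # size of Dictionary.dat in the directory listing

expansion_factor = expanded_entries / base_entries
bytes_per_entry = dat_bytes / expanded_entries

print(f"expansion factor: {expansion_factor:.1f}x")               # 41.1x
print(f"bytes per expanded entry: {bytes_per_entry:.0f}")         # 201
```

So on average each base entry turns into about 41 entries, at roughly 200 bytes apiece in the saved file.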

Post by Lery »

But I don't understand why your parser couldn't easily do the same thing the program does when it creates the conjugation list for every entry...

Why is it better to have the full list than to try it for each verb every time it encounters one? :o

Post by rpapo »

Lery wrote: But I don't understand why your parser couldn't easily do the same thing the program does when it creates the conjugation list for every entry...

Why is it better to have the full list than to try it for each verb every time it encounters one? :o

That was simply how I coded the program originally. Since then, I have made two attempts at a better parser, but only got a certain distance along on either of them before running into roadblocks. And the need to improve my own Japanese, and to make progress on my translations, has kept me from working on it too much.

Programmatically, it comes down to one fundamental problem: what do you do when the next character you find doesn't fit into your algorithm? You can write an infinite number of special cases, or you can try to design an elegant solution. I don't have the time to do the first, and I need to study some computational linguistics before tackling the second.

Programming has a fatal attraction to me: I can easily spend tons of time on a problem . . . and not get anything else done in the meantime. It has given me a good-paying job (obsession has its benefits...), but in the case of learning Japanese, I have consistently felt that I needed to learn more before playing around in code.

And for the time being, what I have is very useful as it is.

Post by Lery »

rpapo wrote: Programming has a fatal attraction to me: I can easily spend tons of time on a problem . . . and not get anything else done in the meantime. It has given me a good-paying job (obsession has its benefits...), but in the case of learning Japanese, I have consistently felt that I needed to learn more before playing around in code.

And for the time being, what I have is very useful as it is.

Yeah, I know that feeling. I guess it's because it's like a challenge against yourself. At least for me. :)

And that's true, it's really useful as is. Thanks for sharing it. :D

Post by Lery »

Hello! Long time no see ^^''

Say, is it normal for the window not to be resizable?
Did you release an update in the meantime? Should I download it again and "reinstall" it?
I've just rebuilt the dictionary and got 100,000 more entries, FYI ^^ Thanks again, I like it.