OCR version "almost" ready

Discuss topics concerning this volume

Smidge204
Astral Realm

Post by Smidge204 »

I have successfully scanned Volume 9 and converted it to SJIS-encoded text. The program I used (Kanji OCR) seems surprisingly accurate, but the output still needs to be proofread. Proofreading will take me about... three months at this rate. *grin*

Anyway, I now have 282 text files, which still need to be looked at and edited (to move the furigana back into position) but are otherwise ready. I'll give them to anyone interested on a per-chapter basis only, which of course would include the raw scans (1-bit bitmaps) used in the OCR process. Each chapter is about 5MB RAR'd.
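
For anyone who picks up a chapter and would rather edit the text as UTF-8 than as SJIS, here is a minimal Python sketch. It assumes the files are plain Shift_JIS text with a .txt extension; the function names and the directory layout are made up for illustration and are not part of what Kanji OCR produces.

```python
from pathlib import Path


def sjis_to_utf8(src: Path, dst: Path) -> str:
    """Decode one Shift_JIS text file and write it back out as UTF-8."""
    # "shift_jis" is Python's built-in codec name. If decoding fails on
    # vendor-specific characters, "cp932" (Microsoft's superset) may work.
    text = src.read_bytes().decode("shift_jis")
    # Write bytes rather than text so the original line endings are kept.
    dst.write_bytes(text.encode("utf-8"))
    return text


def convert_chapter(src_dir: Path, dst_dir: Path) -> int:
    """Convert every *.txt file in src_dir; returns the number converted."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(src_dir.glob("*.txt")):
        sjis_to_utf8(src, dst_dir / src.name)
        count += 1
    return count
```

Calling convert_chapter(Path("chapter1_sjis"), Path("chapter1_utf8")) would leave the originals untouched and write the converted copies to the second folder (both folder names are hypothetical).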

Let me know what chapter you want, if you're interested.
=Smidge=
onizuka-gto
Editor-in-Chief
Posts: 4840
Joined: Wed May 10, 2006 9:02 pm
Favourite Light Novel: Suzumiya Haruhi
Mahouka koukou no Rettousei
No Game No Life
Mushoku Tensei
Mother of Learning
Location: N.E.E.T Federation

Post by onizuka-gto »

Hey, does the software work for English too?

I'm quite interested in that software as well. Any pointers on where I can get it?
"Please note, we have added a consequence for failure. Any contact with the chamber floor will result in an unsatisfactory mark on your official test record, followed by death. Good luck."

@Onizukademongto
Kanzar
Astral Realm

Post by Kanzar »

There are plenty of OCRs for English - many scanners come with OCR software.
Smidge204
Astral Realm

Post by Smidge204 »

I believe it does work for English. The catch is I'm using a demo version that can only be used 20 times / 15 days and there's no crack for it that I could find... I work around the problem by running it in a virtual machine and using a snapshot taken just before I ran it for the first time :D Otherwise it's $99 - $395.

As Kanzar said, there's LOTS of software out there to handle English, and nearly all scanners come with something you can use. Asian languages... not so much, it seems. Has anyone here living in eastern Asia happened to buy a scanner recently?
=Smidge=
onizuka-gto
Editor-in-Chief

Post by onizuka-gto »

I need a scanner; I might buy one.

The question is whether it will come with OCR software, and whether I can fool the seller into thinking I'm a local so they don't try to rip me off with tourist prices... :roll:
Smidge204
Astral Realm

Post by Smidge204 »

There's always mail-order :P

=Smidge=
LQY
Astral Realm

Post by LQY »

onizuka-gto wrote:I need a scanner; I might buy one.

The question is whether it will come with OCR software, and whether I can fool the seller into thinking I'm a local so they don't try to rip me off with tourist prices... :roll:
How about my scan file? :P
Dan
Square Mage
Posts: 2361
Joined: Sat Nov 18, 2006 6:53 pm
Favourite Light Novel: Ahouka!
Location: Atlanta, Georgia

Post by Dan »

An OCR puts the scanned file into a text format that can be copy/pasted just like the text on this post. Is your scan in a text document?
LQY
Astral Realm

Post by LQY »

Dan wrote:An OCR puts the scanned file into a text format that can be copy/pasted just like the text on this post. Is your scan in a text document?
Personally, I do not trust OCR because it requires a lot of proofreading, and I do not think it saves time.

I prefer to read the paper copy and translate it at the same time :roll:

BTW, if I were good at Japanese, I would join this team^^
Kanzar
Astral Realm

Post by Kanzar »

My HP printer comes with OCR software, and even if your scanner doesn't come with one, it isn't TOO hard to find one.

How bad was the lead I gave you, Smidge?
Smidge204
Astral Realm

Post by Smidge204 »

Not so great. There was no Asian language support at all that I could find. :/

Thanks anyway, though!
=Smidge=
onizuka-gto
Editor-in-Chief

Post by onizuka-gto »

LQY wrote:
Dan wrote:An OCR puts the scanned file into a text format that can be copy/pasted just like the text on this post. Is your scan in a text document?
Personally, I do not trust OCR because it requires a lot of proofreading, and I do not think it saves time.

I prefer to read the paper copy and translate it at the same time :roll:

BTW, if I were good at Japanese, I would join this team^^
If you can read Chinese, real or screwed, you can still join this team. :)
Kanzar
Astral Realm

Post by Kanzar »

Smidge204 wrote:Not so great. There was no Asian language support at all that I could find. :/

Thanks anyway, though!
=Smidge=
Eh? It said it did have the support on their main site... =.=
LQY
Astral Realm

Post by LQY »

onizuka-gto wrote:
LQY wrote: Personally, I do not trust OCR because it requires a lot of proofreading, and I do not think it saves time.
I prefer to read the paper copy and translate it at the same time :roll:
BTW, if I were good at Japanese, I would join this team^^
If you can read Chinese, real or screwed, you can still join this team. :)
Of course I can read Chinese, but I would not use it as translation material, because Taiwan Kadokawa's version is not an entirely accurate translation of the original Japanese^^||

I would rather give some advice while you are translating.
HolyCow
I.D.S.E Humanoid Interface [LSB]
Posts: 2538
Joined: Sat Nov 25, 2006 6:31 pm
Favourite Light Novel: Ahouka!
Location: Hinamizawa

Post by HolyCow »

Unfortunately, some of us translators (like me) can't do Japanese --> English, so the only choice would be to do Chinese --> English :(
/me claws out throat and dies