Chrome Plug-in to Convert Web Page to EPUB

Forum for volunteer developers working on Baka-Tsuki related applications (Baka-Reader, BTprince, etc).


Dragoonity
Reader
Posts: 3
Joined: Sun Jun 19, 2016 2:42 pm
Favourite Light Novel:

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by Dragoonity »

dteviot wrote:
Dragoonity wrote:When I use the extension on the LNs at BT, the cover image does not show up (that is, before I click on or open the book). I'm reading the LNs on my MacBook using the iBooks app. Could this be fixed? It is the only problem I've had so far, and I'd love to see how far you take this extension for those who need it.
You're the second (third?) person to report this. There's a bug in the OPF metadata for the cover image. An updated release will be on the Chrome store in a day or two.
Or you can grab the sonako branch and install from source: https://github.com/dteviot/WebToEpub/tree/sonako
Thanks for the reply. I did not know I was the second or third person to report this issue. Thanks also for the update on how soon it will be fixed; I hope to see the extension keep improving as time passes. Keep up the good work. :D
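
For readers unfamiliar with the OPF side: a cover is normally declared in two places that have to agree, a `<meta name="cover">` entry in the metadata whose content is a manifest item's id, and the manifest item itself (EPUB 3 readers look for `properties="cover-image"` on it). The thread does not say what exactly was wrong in WebToEpub, so the sketch below only illustrates that pairing. `buildCoverOpf`, the id, and the image path are made-up names, not the extension's actual code.

```javascript
// Hypothetical helper (not from WebToEpub): builds the two OPF fragments
// that together mark an image as the book cover.
function buildCoverOpf(coverId, href, mediaType) {
  return {
    // EPUB 2 convention: metadata entry whose content is the manifest item's id.
    meta: `<meta name="cover" content="${coverId}"/>`,
    // Manifest entry with the same id; "cover-image" is the EPUB 3 marker.
    item: `<item id="${coverId}" href="${href}" media-type="${mediaType}" properties="cover-image"/>`,
  };
}

// Example values are assumptions chosen for illustration only.
const coverOpf = buildCoverOpf("cover-image", "Images/cover.jpg", "image/jpeg");
console.log(coverOpf.meta);
console.log(coverOpf.item);
```

If the id in the `meta` entry and the id on the manifest item differ, a reader may well end up with no cover to show on the bookshelf.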
dteviot
Literature Club Member
Posts: 31
Joined: Fri Sep 19, 2014 10:02 pm
Favourite Light Novel:

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by dteviot »

Guest wrote:Could I make a couple of suggestions that would introduce more parity with the design EPUBs created by BTE-GEN?

Firstly, as far as I can see, the extension creates an EPUB in which all the images are renamed to image(number).jpg. Would it be at all possible to preserve the Baka-Tsuki filenames instead?

Secondly, the way the old generator was set up, the color illustrations section would have each illustration marked up like so:

Code: Select all

<div class="svg_outer svg_inner">
    <svg xmlns="http://www.w3.org/2000/svg" height="100%" preserveAspectRatio="xMidYMid meet" version="1.1" viewBox="0 0 x y" width="100%" xmlns:xlink="http://www.w3.org/1999/xlink">
      <image height="y" width="x" xlink:href="../Images/filename.jpg"></image>
    </svg>
</div>
This creates a nice clean page with just an illustration.

The extension on the other hand packs in the whole gallery's HTML for the Illustrations section, wherein all the illustrations are part of an unordered list:

Code: Select all

<ul class="gallery mw-gallery-traditional">
		<li class="gallerybox"><div>
			<div class="svg_outer svg_inner"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" height="100%" width="100%" version="1.1" preserveAspectRatio="xMidYMid meet" viewBox="0 0 x y"><image xlink:href="../Images/filename.jpg" height="y" width="x"></image></svg>
</div>
			<div class="gallerytext">
<p><b>(caption)</b>
</p>
			</div>
		</div></li>
<!--And so on and so forth with the rest of the illustrations-->
</ul>
This results in an unsightly bullet point attached to every image when viewed in an EPUB viewer. For images that lack captions, the bullet has nothing attached to it, which makes it look even more out of place.

If the extension could strip out the list markup and leave just the images, that would go a long way toward a more aesthetically pleasing result.
See https://github.com/dteviot/WebToEpub/issues/4
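
A rough sketch of the "strip the list, keep the images" idea from the quoted request, assuming the markup looks like the gallery snippet above. `extractIllustrations` is a hypothetical name, and this is not how the issue was actually resolved in WebToEpub; inside the extension, DOM methods such as `querySelectorAll` would probably be more robust than a regular expression.

```javascript
// Hypothetical helper: pulls every <div class="svg_outer svg_inner">...</div>
// wrapper out of gallery markup, leaving the <ul>, <li>, and captions behind.
// Assumes each wrapper holds exactly one <svg> and no nested <div>.
function extractIllustrations(galleryHtml) {
  const wrapper = /<div class="svg_outer svg_inner">\s*<svg[\s\S]*?<\/svg>\s*<\/div>/g;
  return galleryHtml.match(wrapper) || [];
}

// Sample input modelled on the snippet in this thread (file names are made up).
const gallery = `
<ul class="gallery mw-gallery-traditional">
  <li class="gallerybox"><div>
    <div class="svg_outer svg_inner"><svg viewBox="0 0 10 20"><image xlink:href="../Images/a.jpg" height="20" width="10"></image></svg>
    </div>
    <div class="gallerytext"><p><b>(caption)</b></p></div>
  </div></li>
  <li class="gallerybox"><div>
    <div class="svg_outer svg_inner"><svg viewBox="0 0 10 20"><image xlink:href="../Images/b.jpg" height="20" width="10"></image></svg>
    </div>
  </div></li>
</ul>`;

const pages = extractIllustrations(gallery);
console.log(pages.length); // one entry per illustration
```

Each returned string could then be written out as its own page, which matches the one-illustration-per-page layout the old generator produced.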
Guest
Astral Realm

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by Guest »

Ah, brilliant, glad to see I've been beaten to the punch.

Another suggestion was made on the topic of covers, any possibilities?
Guest wrote:On the topic of cover issues with the extension, could I make a suggestion, to allow the setting of covers using an image from any URL instead of just those available on the page?

For example, take this page

Code: Select all

https://www.baka-tsuki.org/project/index.php?title=The_Zashiki_Warashi_of_Intellectual_Village:Volume9
As visible on that page itself (and therefore available for packing by the extension), the closest one can get to a cover would be

Code: Select all

https://www.baka-tsuki.org/project/images/3/31/Zashiki_v09_000.jpg
However, this is too wide as it includes the front cover, the spine, and the back cover as well.

If on the other hand one were to check the main series page,

Code: Select all

https://www.baka-tsuki.org/project/index.php?title=The_Zashiki_Warashi_of_Intellectual_Village
there is a much better option available to act as a cover, not present on the volume's full text page.

Code: Select all

https://www.baka-tsuki.org/project/images/3/30/Zashiki_Volume_9_Cover.jpg
As-is, the extension does not allow for setting this as the cover, and therefore the epub needs to be manually tweaked after the fact to replace the cover.

I'm not sure there is any practical one-size-fits-all way to detect a cover image on a page other than the one being viewed, so one solution would be to let the user enter an image URL to fetch as the cover.
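
To make the suggestion concrete, here is a minimal sketch of the selection logic, assuming the extension already has a list of image URLs found on the page. `chooseCoverUrl` and its parameters are hypothetical names, not anything from WebToEpub.

```javascript
// Hypothetical helper: prefer a user-supplied cover URL when it parses as
// http(s); otherwise fall back to the first image found on the page.
function chooseCoverUrl(userUrl, pageImageUrls) {
  try {
    const parsed = new URL(userUrl);
    if (parsed.protocol === "http:" || parsed.protocol === "https:") {
      return parsed.href;
    }
  } catch (e) {
    // Not a valid absolute URL; fall through to the page images.
  }
  return pageImageUrls.length > 0 ? pageImageUrls[0] : null;
}

// The two URLs are the ones given in this post.
const pageImages = ["https://www.baka-tsuki.org/project/images/3/31/Zashiki_v09_000.jpg"];
const chosen = chooseCoverUrl(
  "https://www.baka-tsuki.org/project/images/3/30/Zashiki_Volume_9_Cover.jpg",
  pageImages
);
const fallback = chooseCoverUrl("", pageImages);
console.log(chosen);
console.log(fallback);
```

Leaving the field empty keeps today's behaviour, so the extra input would only matter to people who want a different cover.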
dteviot
Literature Club Member
Posts: 31
Joined: Fri Sep 19, 2014 10:02 pm
Favourite Light Novel:

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by dteviot »

Guest wrote:Another suggestion was made on the topic of covers, any possibilities?
See: https://github.com/dteviot/WebToEpub/issues/9
If you have any additional requests, please go to https://github.com/dteviot/WebToEpub/issues, click the "New issue" button and create the entry there.
Index
Mikuru's Master
Posts: 28
Joined: Fri May 08, 2015 7:00 pm
Favourite Light Novel: Toaru Majutsu No Index

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by Index »

Just so everyone knows: I've covered some ground as far as pull requests and issues go, and I will continue to do so.
I plan to stay very active in assisting with development of what is now a Chrome + Firefox (Nightly) extension.

Keep the issues and suggestions coming and I will try to get each one knocked out in a timely manner.

As for the illustration stuff, I covered that, along with a bunch of other things I wanted to get done, in this pull request: https://github.com/dteviot/WebToEpub/pull/10
It covers https://github.com/dteviot/WebToEpub/issues/8 and https://github.com/dteviot/WebToEpub/issues/4

The next pull request will take care of https://github.com/dteviot/WebToEpub/issues/9

After that it will be https://github.com/dteviot/WebToEpub/issues/12 and then https://github.com/dteviot/WebToEpub/issues/11

P.S. Who do I talk to about getting my username changed to belldandu on the forums?
dteviot
Literature Club Member
Posts: 31
Joined: Fri Sep 19, 2014 10:02 pm
Favourite Light Novel:

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by dteviot »

Guest wrote:Firstly, as far as I can see, the extension creates an EPUB in which all the images are renamed to image(number).jpg. Would it be at all possible to preserve the Baka-Tsuki filenames instead?
That looks kind of tricky. Please see https://github.com/dteviot/WebToEpub/issues/11 and let me know if either of the alternatives will work for you.
dteviot
Literature Club Member
Posts: 31
Joined: Fri Sep 19, 2014 10:02 pm
Favourite Light Novel:

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by dteviot »

The plugin has been submitted to Mozilla for review.
When it finishes review (in maybe three days), it should be available from https://addons.mozilla.org/en-US/firefo ... aka-tsuki/
Last edited by dteviot on Fri Jun 24, 2016 2:03 pm, edited 1 time in total.
Guest
Astral Realm

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by Guest »

dteviot wrote:
Guest wrote:Firstly, as far as I can see, the extension creates an EPUB in which all the images are renamed to image(number).jpg. Would it be at all possible to preserve the Baka-Tsuki filenames instead?
That looks kind of tricky. Please see https://github.com/dteviot/WebToEpub/issues/11 and let me know if either of the alternatives will work for you.
The Baka-Tsuki image filenames to the best of my knowledge invariably only use standard alphanumeric characters, numbers, and underscores, nothing that would confuse a filesystem or path.

If I'm not wrong, unlike chapter titles which appear in the body of a text and can contain mysterious symbols and slashes, these are filenames which a browser has to be able to point to and which can be saved on a PC after all.
Index
Mikuru's Master
Posts: 28
Joined: Fri May 08, 2015 7:00 pm
Favourite Light Novel: Toaru Majutsu No Index

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by Index »

dteviot wrote:
Guest wrote:Firstly, as far as I can see, the extension creates an EPUB in which all the images are renamed to image(number).jpg. Would it be at all possible to preserve the Baka-Tsuki filenames instead?
That looks kind of tricky. Please see https://github.com/dteviot/WebToEpub/issues/11 and let me know if either of the alternatives will work for you.
In line with what dteviot said, I will attempt to preserve as much of the original filename as possible, minus any special characters. For sanity's sake, it will most likely be done the same way I did chapter titles.
dteviot
Literature Club Member
Posts: 31
Joined: Fri Sep 19, 2014 10:02 pm
Favourite Light Novel:

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by dteviot »

Guest wrote:The Baka-Tsuki image filenames to the best of my knowledge invariably only use standard alphanumeric characters, numbers, and underscores, nothing that would confuse a filesystem or path.
I'll point out:
  • This generator is going to work with sites other than Baka-Tsuki.
  • I'm pretty sure I've seen '?' characters in the file's URL. It's the query-string ("searchpart") delimiter, used here to specify the resolution to deliver, and it ISN'T a legal filename character on Windows.
Also, we're not quite talking about the same thing. You're referring to the names of image files. I'm also talking about the XHTML files holding chapter text (where it's been suggested they get the chapter name as their file name).
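
To illustrate the Windows point: a name taken from a URL or a chapter title has to lose characters such as '?' before it can be used as a file name. A minimal sketch, assuming that replacing each offending character with an underscore is acceptable; `sanitizeFileName` is a hypothetical name, not the extension's actual function.

```javascript
// Hypothetical helper: replaces characters Windows rejects in file names
// (< > : " / \ | ? *) and ASCII control characters with an underscore.
function sanitizeFileName(name) {
  return name.replace(/[<>:"\/\\|?*\x00-\x1f]/g, "_");
}

console.log(sanitizeFileName("Chapter 1: Who/What?.xhtml"));
// prints "Chapter 1_ Who_What_.xhtml"
```

This does not cover every Windows rule (reserved names such as CON, or trailing dots and spaces), and two different titles can collapse to the same name, so a real implementation would still need a uniqueness check.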
Index
Mikuru's Master
Posts: 28
Joined: Fri May 08, 2015 7:00 pm
Favourite Light Novel: Toaru Majutsu No Index

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by Index »

dteviot wrote:
Guest wrote:The Baka-Tsuki image filenames to the best of my knowledge invariably only use standard alphanumeric characters, numbers, and underscores, nothing that would confuse a filesystem or path.
I'll point out:
  • This generator is going to work with sites other than Baka-Tsuki.
  • I'm pretty sure I've seen '?' characters in the file's URL. It's the query-string ("searchpart") delimiter, used here to specify the resolution to deliver, and it ISN'T a legal filename character on Windows.
Also, we're not quite talking about the same thing. You're referring to the names of image files. I'm also talking about the XHTML files holding chapter text (where it's been suggested they get the chapter name as their file name).
Ignore the searchpart delimiter; that's not even going to be a problem. I plan to just split the page URL at "file:" and use what comes after it as the file name.

An example of this would be:

Image Page Url: https://baka-tsuki.org/project/index.ph ... 1_000a.jpg

Code: Select all

// Page URL of the image's "File:" page on the wiki.
let page = "https://baka-tsuki.org/project/index.php?title=File:BTS_vol_01_000a.jpg";
// Split case-insensitively at "file:" and keep what follows it.
let image = page.split(/file:/i)[1];
Here image would be "BTS_vol_01_000a.jpg".
I plan to use the external function call logic here in util, like I did with the chapter name, which should help keep the issue of other sites at bay.

I will also do this regardless of the resolution setting, so there's no need to worry about extra parameters getting in the way.

Edit: There, done.
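
For comparison, a variant that reads the `title` query parameter instead of splitting the whole string, so any parameter that follows it cannot end up in the file name. This is only a sketch under the assumption that the file page URL always carries `title=File:...`; `fileNameFromFilePage` is a hypothetical name and the `&action=view` suffix in the example is invented for the demonstration.

```javascript
// Hypothetical helper: reads the "title" query parameter of a wiki
// file-page URL and strips a leading "File:" prefix (case-insensitive).
function fileNameFromFilePage(pageUrl) {
  const title = new URL(pageUrl).searchParams.get("title");
  if (title === null) {
    return null; // no title parameter, so nothing to extract
  }
  const match = /^file:(.+)$/i.exec(title);
  return match ? match[1] : null;
}

const imageName = fileNameFromFilePage(
  "https://baka-tsuki.org/project/index.php?title=File:BTS_vol_01_000a.jpg&action=view"
);
console.log(imageName); // prints "BTS_vol_01_000a.jpg"
```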
R~S
Project Editor
Posts: 131
Joined: Sun Oct 01, 2006 9:57 am
Favourite Light Novel: Index!
Location: France

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by R~S »

dteviot wrote:The plugin has been submitted to Mozilla for review.
When it finishes review (in maybe three days), it should be available from https://addons.mozilla.org/en-US/firefo ... aka-tuski/
You got Baka-Tsuki misspelled there :P
Index
Mikuru's Master
Posts: 28
Joined: Fri May 08, 2015 7:00 pm
Favourite Light Novel: Toaru Majutsu No Index

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by Index »

R~S wrote:
dteviot wrote:The plugin has been submitted to Mozilla for review.
When it finishes review (in maybe three days), it should be available from https://addons.mozilla.org/en-US/firefo ... aka-tuski/
You got Baka-Tsuki misspelled there :P
AHAHAHAHA omg I lost my shit for a second when I realized that. I can't believe I didn't notice that when I made the review.

Rip baka-tuski xD

Also, whoever put the feature request for optional removal of duplicate images in the review for the plugin: I will report your review, as it does not follow Firefox's review guidelines.

You DO NOT ask for features on that page. You do it on our GitHub.

Also, learn to be patient, as that feature is already being worked on.

Aside from that, your review is neither informative nor helpful, so I have reported it as a misplaced bug report / feature request.
Guest
Astral Realm

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by Guest »

Index wrote: Ignore the searchpart delimiter; that's not even going to be a problem. I plan to just split the page URL at "file:" and use what comes after it as the file name.

An example of this would be:

Image Page Url: https://baka-tsuki.org/project/index.ph ... 1_000a.jpg

Code: Select all

// Page URL of the image's "File:" page on the wiki.
let page = "https://baka-tsuki.org/project/index.php?title=File:BTS_vol_01_000a.jpg";
// Split case-insensitively at "file:" and keep what follows it.
let image = page.split(/file:/i)[1];
Here image would be "BTS_vol_01_000a.jpg".
I plan to use the external function call logic here in util, like I did with the chapter name, which should help keep the issue of other sites at bay.

I will also do this regardless of the resolution setting, so there's no need to worry about extra parameters getting in the way.

Edit: There, done.
Why not just parse through to the original file at https://baka-tsuki.org/project/images/b ... 1_000a.jpg and save that directly?
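
Something like the following is what I have in mind, as a sketch only. The URL in my question is cut off, so the example uses the Zashiki image URL posted earlier in this thread; `fileNameFromImageUrl` is a hypothetical name and the `?width=800` query is invented to show that a resolution parameter would not leak into the name.

```javascript
// Hypothetical helper: takes a direct image URL and returns the last
// segment of its path, which is the file name as stored on the server.
function fileNameFromImageUrl(imageUrl) {
  // pathname excludes any "?..." query string and "#..." fragment.
  const path = new URL(imageUrl).pathname;
  return path.substring(path.lastIndexOf("/") + 1);
}

const savedName = fileNameFromImageUrl(
  "https://www.baka-tsuki.org/project/images/3/31/Zashiki_v09_000.jpg?width=800"
);
console.log(savedName); // prints "Zashiki_v09_000.jpg"
```

Percent-encoded characters in the path are left as they are here, so names containing spaces or non-ASCII characters would still need decoding and checking before being written to disk.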
dteviot
Literature Club Member
Posts: 31
Joined: Fri Sep 19, 2014 10:02 pm
Favourite Light Novel:

Re: Chrome Plug-in to Convert Web Page to EPUB

Post by dteviot »

R~S wrote:
dteviot wrote:The plugin has been submitted to Mozilla for review.
When it finishes review (in maybe three days), it should be available from https://addons.mozilla.org/en-US/firefo ... aka-tuski/
You got Baka-Tsuki misspelled there :P
D'oh!

Fixed.
Now https://addons.mozilla.org/en-US/firefo ... aka-tsuki/