Wiki Categorization Survey and Ideas

Do you have a fully fleshed idea that you think Baka-Tsuki should adopt? Post it here.


Wiki Categorization Survey and Ideas

Post by Cthaeh » Sat Sep 26, 2015 8:00 am

There's a little bit of discussion going on about adding or removing some categories. I'm interested in hearing from both normal/casual readers as well as more involved community members.

Main Questions:

Which categories are (or would be) most useful? Which ones do you use now?



More specific topics:

Genre tags* (I'm most interested in the answers to this one)
Genres are difficult to maintain and somewhat arbitrary (they also tend to work poorly with Alt languages, in that English project and Alt languages end up mixed together), so there's discussion of removing them. Are there many people that use these on BT as they are now?

I know they are useful in general, but I don't think they're too useful as they are now since you can't do a search based on multiple genres, only look at a single genre at a time (actually, you can look at combinations of genres, but it's a pain to do, so I don't think it can really count as a feature). I thought there are some external sites (not entirely sure) that have better genre systems in place, so I'm not as sure they should be on BT given their problems.

Status tags
Some ideas brought up for discussion were removing status tags (Active, Idle, Stalled) since the person who used to keep them updated no longer has the time. So in particular, how many people think those are useful?

There was an idea to make it automatically switch to inactive if no one updates it, but it wouldn't have the "Active" banner at the top, just be in the Active category, so it'd just help by making a list of active projects on the category page. So is that still useful, or would it not be as useful without the banner on top?

Origin tags
Is it useful to be able to search for the original language of the work (Japanese, Korean, Chinese)? Most projects are Japanese light novels, but I think there's a few Korean and Chinese at least. On one hand there's nothing wrong with more features, I'm just curious if enough people would want this to justify the extra category.

Linked projects separated from BT projects
The current "Light novel" (more than 1 volume done) and "Teaser" (less than 1 volume done) categories mix together projects that are hosted on BT and "projects" that are just links to external sites. Should those two things be separated instead? Are you ever looking for only BT projects (or only linked)? I personally think it's fine as is, but there are at least a few people, maybe more, who feel they should be separated, so maybe I'm in the minority here.


Also, feel free to comment on other category related topics that I didn't list.
Cthaeh
Yuki-Nagator
 
Posts: 647
Joined: Sun Nov 11, 2012 6:54 pm

Re: Wiki Categorization Survey and Ideas

Post by TheCatWalk » Sat Sep 26, 2015 9:52 am

I don't think you should separate the linked projects.
It might be a bit selfish of me, but I think
if you go out of your way to separate linked projects,
BT will automatically turn into a light novel 'Directory'
for those who don't read too much here.
And frankly that pisses me off... BUT if there is a way to do that without BT turning into some directory,
then I'm all for it.

As for the tags..
I haven't thought of anything yet.
MAIN HEROINES FTW!!!!!!!!!!!!!NYAAAAAAAAAAAAAAAAAAAAAAAAAA!
TheCatWalk
Project Translator
 
Posts: 339
Joined: Wed Jan 09, 2013 8:17 am
Location: Catville in the middle of nowhere

Re: Wiki Categorization Survey and Ideas

Post by arczyx » Sat Sep 26, 2015 6:22 pm

Cthaeh wrote:Genre tags

I think we should just get rid of this altogether. Like you said, they are not really useful right now and there are much better alternatives anyway (MangaUpdates or even MAL).

Cthaeh wrote:Status tags

I personally think this is useful; for example, it stops people from asking whether a project is dead or not. So please keep this if possible.

Cthaeh wrote:Origin tags

This is basically just another genre tag, really. Get rid of it to make things simpler. On the other hand, maybe we should add links to MU and MAL telling people to go there instead to explore the world of LNs.

Cthaeh wrote:Linked projects separated from BT projects

I think it's fine as it is.


If I may add something, we should just get rid of author tags as well (as they are yet another kind of genre tag). Simpler is better.
arczyx
Project Editor
 
Posts: 810
Joined: Wed Jun 29, 2011 3:52 am

Re: Wiki Categorization Survey and Ideas

Post by Cthaeh » Sat Sep 26, 2015 6:45 pm

arczyx wrote:If I may add something, we should just get rid of author tags as well (as they are yet another kind of genre tag). Simpler is better.

Agreed. (Given the category thread in the admin forum, I was considering author tag removal a done deal and didn't bother mentioning it here.)

Re: Wiki Categorization Survey and Ideas

Post by Shadowys » Thu Oct 01, 2015 7:58 am

I'm thinking that the purpose of this categorization survey is mainly for translators. Correct me if I'm wrong.

I won't be commenting on this thread as a translator; instead I'll comment as a developer and reader. I'm developing an API that pulls data out of the various pages we have now and outputs it in a sane format that programs can consume and deliver to end users.

Example output here: https://baka-tsuki-api.herokuapp.com/ap ... ujo|comedy

Genre tags
---------------------
My genre API pulls data out of the category pages (I'm planning to include authors once the support is complete), and while MediaWiki does not make this easy, it is still doable. The main concern is for developers making apps that let readers explore light novels with similar genres and tags directly from their Baka-Tsuki app.

EDIT: On second thought, I went on to implement the category search. This enables developers to develop applications that can let readers fine tune their exploration, which, from personal experience of reading 入間人間's novels, which span across different genres but all contain his personal writing style, is something I'd like to have.

You can view it in here: baka-tsuki-api.herokuapp.com/api/category?list=MF Bunko J&genres=comedy|harem&type=Light Novel&language=english
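
As a rough illustration of how a client might assemble such a query, here is a minimal Python sketch. The host, path, and parameter names (list, genres, type, language) are taken from the example URL above; the https scheme, the function name, and the choice to percent-encode spaces are assumptions, and the API itself may behave differently.

```python
from urllib.parse import quote, urlencode

# Host and path as quoted in the post; the https scheme is an assumption.
API_BASE = "https://baka-tsuki-api.herokuapp.com/api/category"


def build_category_query(list_name, genres, work_type, language):
    """Build a category-search URL; multiple genres are joined with '|'."""
    params = {
        "list": list_name,
        "genres": "|".join(genres),
        "type": work_type,
        "language": language,
    }
    # quote_via=quote encodes spaces as %20; safe="|" keeps the genre separator readable.
    return API_BASE + "?" + urlencode(params, quote_via=quote, safe="|")


url = build_category_query("MF Bunko J", ["comedy", "harem"], "Light Novel", "english")
print(url)
```

Joining the genres with "|" mirrors the example URL, so a reader-facing app could let users tick several genres and send them in one request.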



I'm not pulling data out of MAL or MangaUpdates because
1. there is too much noise, so each search takes time to download. (MAL is anime first, manga second, with light novels barely mentioned. MangaUpdates includes everything.)
2. neither API is openly documented or complete. (The MAL API is nearly non-existent and pretty useless. Neither has good open docs or a changelog, so it is hard to tell whether the API is stable or how to do anything at all.)

I'm also not pulling data directly out of their pages because both sites base their page URLs on IDs, so any workaround would be hard to build and would cost network bandwidth and time.

Adding genre tags, however, is time-consuming for translators, because the wiki is really a compromise between the needs of translators, readers, and developers, and it offers no autocomplete or tag search of any kind.

Nevertheless, genre tags are invaluable to readers and to developers using the API, because Baka-Tsuki is LN-first.

For genre tags I would propose **all of the languages use the same genre tag regardless of language, meaning English.**

Status tags
-------------------------

The status of a project is somewhat important for readers deciding whether to keep reading. Developers can use an abandoned status to warn readers in their applications.

Origin tags
-------------------------

IMHO, Useless to the reader, unless they want to buy it, and that could be directed to google.

Linked projects
------------------------

Frankly I don't see any justification in separating them from BT projects.

Further comment
-----------------------

Tags are wonderful metadata for connecting data, and I'm thinking of using tags to index databases. All of this leads to an ease into developing a better system for translators and readers, which IMO will lead to a further increase in the growth of the community.

However, adding tags is a pain. There should be an automated protocol that applies the tags straightaway. For example, the active project tag would be applied when the project has been updated within the last three months, and the author and publisher would be added as tags automatically.

The wiki system might not make this easy to do. I'm planning to include a user posting API in version 2 to make it easy for developers to build an application for translators, whose output would be automatically uploaded to the wiki as wikitext, or simply loaded into a database hosted on our servers.

Either that or I'll work on creating an application that allows real-time collaboration between translator-translator and reader-translator groups (sort of like this: https://github.com/cloudiirain/onigiri).

Overall, I'm for making Baka-Tsuki a Light-Novel exclusive website for translators and readers.
Winter's the time of the year,
when the cold chill the skin,
from the very within,
but you grasped my hand,
your eyes shedding a frozen tear.
Our eyes met,
and warmth filled the air.
Shadowys
Project Translator
 
Posts: 246
Joined: Sun Dec 30, 2012 5:15 am
Location: Somewhere in Malaysia

Re: Wiki Categorization Survey and Ideas

Post by Cthaeh » Thu Oct 01, 2015 5:02 pm

Shadowys wrote:I'm thinking that the purpose of this categorization survey is mainly for translators. Correct me if I'm wrong.

I was looking for all opinions; translators, readers, anyone who would use them (or have to deal with them).

Status tags
Shadowys wrote:However, adding tags is a pain. There should be an automated protocol that implements the tags straightaway. For example, the active project tag will be activated when the project has been updated within three months,

Great idea, how would you do that though? Because they're not up to date right now.

I've thought about it a bit, and the best thing I could come up with was extracting the date from the recent updates section of a page; it's possible to make it completely automatic within the wiki if the recent updates section is kept up to date. The problem there is that the recent updates section for a fair number of projects isn't kept up to date, or even used at all in some cases. So for all the projects whose recent updates sections are seldom used it'd end up categorizing them as inactive even though there are updates.

The problem is there's no in-wiki, automated way to tell if a page edit or related change is a chapter update or just another edit. Even outside the wiki, while you could code up something that does most of the work, it won't be perfect without human involvement to double-check the results (I don't think you can come up with a parser/logic robust enough to work for everything). The issue with an external system is that you're depending on a single person to keep the status tags updated. That's the issue that brought it to the current point: the person who was keeping them updated isn't anymore. I've thought about taking over updating the status tags myself, but I'm hesitant to do so because it doesn't solve the underlying issue, just delays it until I leave or stop doing it.

The question I'm undecided on is if any of those imperfect systems would work well enough to make it worth it. (I'm currently thinking most about the inwiki, auto-parsing of the recent updates section of the project page. That way "fails" in the more conservative direction of labeling projects as inactive if no one marks the updates on the page.)


Genre tags
While I agree that they are something people would use, right now I don't think there is any practical tool that makes use of them (since you can't really get a list of category intersections on the wiki); your api does put them to use, but doesn't have an interface as of yet; I suppose someone probably could make a simple page/interface to use your api.

Genre tags also suffer from a similar issue to status tags in that there's not an easy way to maintain them (mainly add them to new projects), but it's probably true the consequences of projects lacking genres are less than project status not being updated. And just as a note, there may be some translators who prefer not to have genre tags (or status tags). But the main reason I might support removing them is because they don't serve much purpose without a tool to use them (my official vote right now is to keep them, but I'm wavering towards removal, which is one of the reasons I started this thread).


Last topic
Shadowys wrote:Tags are wonderful metadata for connecting data, and I'm thinking of using tags to index databases.
[...]
However, adding tags is a pain. There should be an automated protocol that implements the tags straightaway. For example, [...] and the author and publisher as a tag automatically.

One comment here is that I can't really picture an "automated protocol" for implementing any tag, such as author or publisher. It seems like someone is always going to just have to type that in.

A second comment: I do want to support external tools for the wiki, as I think that's a good way to take advantage of BT's large collection of novels, however I'm not sure I would want to use categories for every piece of data. I was never a fan of the author and publisher tags, as I don't think they are of much use on the wiki. Publisher categories at least have multiple entries, but most author categories aren't going to have more than one or two, which isn't much use as a category from a media wiki perspective.

Just thinking, if you want some way to tag authors, is there any other convenient way to do it? For example, could your API make use of something like the following: page source text = "This LN is written by {{Author|some author's name}}" / page display = "This LN is written by some author's name".
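
To make the {{Author|...}} idea concrete, here is a minimal Python sketch of how an external tool might read such a template out of page source. The template name comes from the example above; the regular expression and the function names are hypothetical, and the sketch only handles the simple one-parameter form.

```python
import re

# Matches {{Author|name}}; nested templates and extra parameters are not handled.
AUTHOR_TEMPLATE = re.compile(r"\{\{Author\|([^{}|]+)\}\}")


def extract_authors(wikitext):
    """Return every author name wrapped in an {{Author|...}} template."""
    return [name.strip() for name in AUTHOR_TEMPLATE.findall(wikitext)]


def render(wikitext):
    """Mimic the page display: replace each template with the bare name."""
    return AUTHOR_TEMPLATE.sub(lambda m: m.group(1).strip(), wikitext)


source = "This LN is written by {{Author|Honobu Yonezawa}}"
print(extract_authors(source))
print(render(source))
```

The point of the design is that the author lives in the page text as structured data, so no category page is needed for an API to find it.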

Re: Wiki Categorization Survey and Ideas

Post by Shadowys » Thu Oct 01, 2015 7:17 pm

Status tags
In-wiki, I've done the api to check the stats of a page here : https://baka-tsuki-api.herokuapp.com/ap ... o_Tsukaima
The wiki treats each chapter as its own page, reached from the {project_name} page, so the application has to go through these steps:
1. Get all chapters with baka-tsuki-api.herokuapp.com/api?title={project_name}
2. Pass all of them through baka-tsuki-api.herokuapp.com/api/time?titles={chapter_page1|chapterpage2|...}
3. Sort and find the most recent change.
4. Compare it to the current date to see if it has been updated.
5. Add tag
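
Steps 3 to 5 above can be sketched in a few lines of Python. This is only an illustration: steps 1 and 2 (the network calls) are replaced by a hand-written dictionary, the 90-day window is an approximation of the "three months" mentioned earlier in the thread, and the function names and the "Inactive" fallback label are assumptions.

```python
from datetime import datetime, timedelta

# "Three months" approximated as 90 days (an assumption).
ACTIVE_WINDOW = timedelta(days=90)


def latest_change(chapter_times):
    """Step 3: return the most recent timestamp among the chapter pages."""
    return max(chapter_times.values())


def status_tag(chapter_times, now):
    """Steps 4-5: compare the latest change with 'now' and pick a tag."""
    if not chapter_times:
        return "Inactive"
    age = now - latest_change(chapter_times)
    return "Active" if age <= ACTIVE_WINDOW else "Inactive"


# Stand-in for the results of steps 1 and 2 (chapter page -> last change).
chapter_times = {
    "Volume 1 Chapter 1": datetime(2015, 1, 2),
    "Volume 1 Chapter 2": datetime(2015, 9, 14),
}
print(status_tag(chapter_times, now=datetime(2015, 10, 1)))
```

Note that this treats every page change as an update; it does not distinguish new translated content from other edits.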

For searching for all recent updates: https://baka-tsuki-api.herokuapp.com/ap ... updates=20. Note that the maximum available is 500 for now, as Heroku's free package does not provide a lot of memory, so it would be better to search for all light novels under each language and do this recursively.

If it is an application outside the wiki (not using the wiki system at all), we could have just added a "chapter" tag and done a few queries to the db which would have been faster and could be done trivially.

Genre tags
The reason we haven't used them is that we don't really have an application that could act as a reader. Adding genre tags is a pain because there is no autocomplete, or a system where you just have to click on existing tags. My API is going to aim at cleaning up the user API next, so it will be trivial to add a page to multiple categories. That would make it easier for people to create an interface where project managers just click on the available tags to add them to the page, or type a new one (kind of like how Stack Overflow does it).

However, if it is crucial for people to see how the genre tags can be utilised, I could spend tomorrow hacking out a sample web reader that uses the API to explore genres :D

Last topic
My current api is able to pull author and illustrator data (if any) through the title, the volume, or table of content. That could be utilised to add tags automatically.

Baka-Tsuki remains quite small (only around 200+ entries), so the author categories haven't gained many entries yet; Eiji Mikage, for example, has at least three entries on Chinese novel sites. Still, the main purpose of the tags is searching and sorting light novels.

Special mention
I'm thinking of creating an input parser for new projects, where data will be context specific.

Example input:
:manager Somebody
:translator
_:- name1
:title Hyouka
:genres genre1,genre2
:series The Hyouka / Classics Club Series (〈古典部〉シリーズ)
_:author Honobu Yonezawa
_:publisher Kadokawa Shoten
_:synopsis Oreki Houtarou is a self-proclaimed "energy-saver"—that is, he refuses to actively waste energy doing things that aren't necessary.
_:volumes
__:1 Hyouka 氷菓 - You can't escape / The niece of time
__:1 Letter from Benares
___:chapter Afterword
_:sections
__:- The Hyouka Anime Drama CDs
__:section Drama CD 1 - TVアニメ 氷菓 ドラマCD1
___:1 The Four Famous Chuunibyou Families

Note: "_" means two spaces indentation.
which would be parsed into wikitext with the respective tags, sections, and all relevant data inserted, such as guidelines and an introduction statement.
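
A first pass at reading this input format could look like the Python sketch below. It only builds a nested tree from the indentation; turning the tree into wikitext is left out. The function name, the node layout, and the rule that either one "_" or two spaces counts as one indentation level are assumptions made for illustration.

```python
def parse_project(text):
    """Parse ':key value' lines into a nested tree based on indentation."""
    root = {"key": "root", "value": "", "children": []}
    stack = [(-1, root)]  # (depth, node) pairs; the root sits above depth 0
    for raw in text.splitlines():
        body = raw.lstrip("_ ")
        if not body.startswith(":"):
            continue  # skip blank lines and anything that is not a ':key' line
        prefix = raw[: len(raw) - len(body)]
        # One "_" or two spaces count as one indentation level.
        depth = prefix.count("_") + prefix.count(" ") // 2
        key, _, value = body[1:].partition(" ")
        node = {"key": key, "value": value.strip(), "children": []}
        # Climb back up until the top of the stack is a shallower node.
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((depth, node))
    return root


SAMPLE = """
:title Hyouka
:genres genre1,genre2
:series The Hyouka / Classics Club Series
_:author Honobu Yonezawa
_:volumes
__:1 Hyouka
___:chapter Afterword
"""

tree = parse_project(SAMPLE)
series = tree["children"][2]
print([child["key"] for child in series["children"]])
```

Keeping the parse step separate from the wikitext output means the same tree could also feed a database, as mentioned earlier in the thread.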

Re: Wiki Categorization Survey and Ideas

Post by Cthaeh » Thu Oct 01, 2015 9:33 pm

Shadowys wrote:Status tags
In-wiki, I've done the api to check the stats of a page here : https://baka-tsuki-api.herokuapp.com/ap ... o_Tsukaima
Each chapter is treated by the wiki as a page, from the {project_name} so the application has to go through
1. Get all chapters with baka-tsuki-api.herokuapp.com/api?title={project_name}
2. Pass all of them through baka-tsuki-api.herokuapp.com/api/time?titles={chapter_page1|chapterpage2|...}
3. Sort and find the most recent change.
4. Compare it to the current date to see if it has been updated.
5. Add tag

You could do that, but I don't think we'd want that automated process to run without human checking. Everything other than 3+4 should be pretty reliable, but you need to be able to check if an update is new translated content to determine if the status tag needs to be updated (not every edit should count).

For example, this first edit should not count towards a status update, but this second edit should count for a status update. The problem is there's no easy way for an automated process to tell the difference between those two edits (they both result in a similar number of bytes changed).


Genre tags
Shadowys wrote:However, if it is crucial for people to see how the genre tags can be utilised, I could spend tomorrow hacking out a sample web reader that uses the API to explore genres :D

I can envision an application/gui that would make them useful, so I don't personally think you need to rush to make a demonstration. But I'm not the only one who would need to be convinced, so it may be a good idea to rope the others into this particular discussion.

Re: Wiki Categorization Survey and Ideas

Post by Shadowys » Thu Oct 01, 2015 11:20 pm

So I went ahead to create one to show the usage of genre tags:

https://baka-tsuki-api.herokuapp.com/webindex.html

Currently it only takes in genre tags as this is only an example of what the data can be used for.

For reviewing status tags, the wiki does not provide an easy way to check whether an edit is a content update or not. Three proposed ways:
1. Add an "[UPDATE]" tag in the edit comment if the translator is adding new content.
2. Check whether the revision and the one before it differ by a large margin, where the text deleted is very small compared to what is added. Example using the MediaWiki API: https://www.baka-tsuki.org/project/api. ... o_Tsukaima
3. Forgo the wiki altogether and use an external application that lets the translator click a button to mark the text as a content update.
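
Options 1 and 2 can be combined into a small heuristic, sketched below in Python. It compares two revision texts line by line with the standard difflib module. The thresholds (min_added, max_removed_ratio) and the function names are hypothetical, and as noted elsewhere in the thread, a size-based check like this would still need a person to double-check its results.

```python
import difflib


def change_sizes(old_text, new_text):
    """Return (added, removed) character counts between two revisions."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    added = removed = 0
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op in ("replace", "delete"):
            removed += sum(len(line) for line in old_lines[i1:i2])
        if op in ("replace", "insert"):
            added += sum(len(line) for line in new_lines[j1:j2])
    return added, removed


def looks_like_new_content(comment, old_text, new_text,
                           min_added=500, max_removed_ratio=0.1):
    """Option 1: trust an explicit [UPDATE] marker. Option 2: size heuristic."""
    if "[UPDATE]" in comment:
        return True
    added, removed = change_sizes(old_text, new_text)
    # Much more text added than deleted suggests a new chapter or section.
    return added >= min_added and removed <= added * max_removed_ratio


old_page = "Chapter 1\nHello.\n"
new_page = old_page + "A long new paragraph. " * 50 + "\n"
print(looks_like_new_content("", old_page, new_page))
```

A bot could run this over a page's revision history and produce a list of candidate updates for a human to confirm.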

Re: Wiki Categorization Survey and Ideas

Post by zzhk » Fri Oct 02, 2015 2:09 am

I really don't see why so much effort needs to be devoted to maintaining project status tags.

Why can't readers simply read the "Updates" section with their own eyes to see when a project was last updated?

It makes sense in principle to identify the cultural background of the original work. You wouldn't read Korean, Chinese and Japanese stories with the same cultural assumptions and perspective. In practice, this is no longer much of an issue after distinguishing web novels from light novels--people aren't exactly translating Chinese and Korean print novels in droves, instead, they rely on the ready availability of web novels--and to a lesser extent, rejecting links to blogs/sites engaging in commercial activity.
zzhk
Senior Project Translator
 
Posts: 535
Joined: Tue Mar 20, 2012 2:52 pm

Re: Wiki Categorization Survey and Ideas

Post by Cthaeh » Fri Oct 02, 2015 5:24 am

Shadowys wrote:So I went ahead to create one to show the usage of genre tags:

https://baka-tsuki-api.herokuapp.com/webindex.html

Currently it only takes in genre tags as this is only an example of what the data can be used for.

Looks cool. I suppose it probably can't handle heavy traffic if we put it up on the site right now? Though even if it could, we should probably push to come to an official decision about what we're going to do with genre tags before doing that or putting much more effort into it.

Shadowys wrote:For reviewing status tags, the wiki does not provide an easy way to check whether an edit is a content update or not. Three proposed ways:
1. Add an "[UPDATE]" tag in the edit comment if the translator is adding new content.
2. Check whether the revision and the one before it differ by a large margin, where the text deleted is very small compared to what is added. Example using the MediaWiki API: https://www.baka-tsuki.org/project/api. ... o_Tsukaima
3. Forgo the wiki altogether and use an external application that lets the translator click a button to mark the text as a content update.

In my opinion, 1 and 3 won't work well because they require translators to do something extra that they wouldn't normally know about or think to do. 2 is how I would go about it for a manual input method: I'd have some code generate a list of those revisions and have a person go through the list to double-check it before it gets applied to the wiki.

If we choose to keep status tags, what I think would probably be the most effective is doing the in-wiki parsing of the recent updates section, so any series with a translator (or anyone else) who does that normally would be covered. Then combine that with the #2 + manual checking to flag and update any series for which the translator doesn't use a recent updates section to cover the rest. That way even if the person running the updates dropped out, it'd still at least be semi-functional (and because it runs off an input date, it would automatically move things to inactive with time).

For clarity, this is how I would implement it on a project page:
===Recent Updates===
{{Auto-status|caught_up=0|manual_override=September 1, 2015|recent_updates=
:*October 2, 2015 - Volume 1 Chapter 5
:*9/26/2015: Volume 1 Chapter 4
:14-September-2015
:::Volume 1 Chapter 3
}}

caught_up is a flag to suppress the status for caught up projects

manual_override is what you'd have a bot update if you were using a bot. The template compares the first date in recent updates to the manual override and uses whichever is more recent to determine the status.

I used various formats in this example to demonstrate the fact it would be fairly robust with different formats for the recent updates section; it works with all of those formats for the update section.
But again, after we flesh out these ideas, we should push for an official ruling before putting too much effort into it (technically there is an official ruling to remove them... I'd better switch my vote to undecided to stall it).
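
The date handling described for the Auto-status template can be approximated outside the wiki with the Python sketch below. It is not the template itself: the three patterns simply cover the three formats in the example above, slash dates are assumed to be month/day/year, month names are assumed to be English, and all function names are hypothetical.

```python
import re
from datetime import datetime

# One (pattern, strptime format) pair per date style used in the example.
DATE_PATTERNS = [
    (re.compile(r"[A-Z][a-z]+ \d{1,2}, \d{4}"), "%B %d, %Y"),  # October 2, 2015
    (re.compile(r"\d{1,2}/\d{1,2}/\d{4}"), "%m/%d/%Y"),        # 9/26/2015
    (re.compile(r"\d{1,2}-[A-Z][a-z]+-\d{4}"), "%d-%B-%Y"),    # 14-September-2015
]


def first_date(recent_updates):
    """Return the first date found, scanning line by line from the top."""
    for line in recent_updates.splitlines():
        for pattern, fmt in DATE_PATTERNS:
            match = pattern.search(line)
            if match:
                try:
                    return datetime.strptime(match.group(0), fmt)
                except ValueError:
                    continue  # looked like a date but was not a valid one
    return None


def effective_date(recent_updates, manual_override):
    """Use whichever is more recent: the first listed update or the override."""
    override = datetime.strptime(manual_override, "%B %d, %Y")
    found = first_date(recent_updates)
    return max(found, override) if found else override


updates = """:*October 2, 2015 - Volume 1 Chapter 5
:*9/26/2015: Volume 1 Chapter 4
:14-September-2015
:::Volume 1 Chapter 3"""
print(effective_date(updates, "September 1, 2015"))
```

Because the result is just a date, the status falls back to inactive on its own as time passes, which matches the behavior described above.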

zzhk wrote:Why can't readers simply read the "Updates" section with their own eyes to see when a project was last updated?

I consider it debatable (I'm not decided) whether the effort to maintain them is worth it, but I think the value would be allowing readers to search out the recently updated projects without having to go to each project to check. Also, the "Updates" section isn't always used, so that doesn't always work for checking (it's possible if the reader knows enough to do the extra work to check the wiki history); but then again, that fact is at least a partial problem even with my proposed updating method.

Re: Wiki Categorization Survey and Ideas

Post by Shadowys » Fri Oct 02, 2015 8:23 am

Nope, it can't. It's not production ready and is more of a toy rather than a serious implementation.

Can you link to the voting thread? I can't seem to find it.
The fact is that this data can be used to reduce the amount of work readers need to do while browsing the site, but it has to be used in conjunction with upgrading the technology we have now (i.e. we need more developers) to minimise the manual work the project manager and translator have to do.
Nevertheless, this has to wait until the result of the discussion is out before any spec can be drawn.

Re: Wiki Categorization Survey and Ideas

Post by Shadowys » Fri Oct 02, 2015 7:03 pm

Edit: I'm changing my stance on origin tags from useless to useful. It would be another thing to maintain, but a novel's origin dictates the background knowledge the author assumes the audience has.

