From Baka-Tsuki

Proof of Concept of Category Intersections

Brought to you by Extension:DynamicPageList (TLG already installed it!)

Sample: Intersection of Light Novel (English) and Stalled Categories

Testing if subcats are included in intersections

No pages meet these criteria.

Cloud's Proposal

Currently, we have a category called Light Novel (English), and twenty variations of it, one for each alternative light novel language.

Rather than having these 21+ redundant categories, it is more logical to place project pages in their primary categories and take intersections of those categories.

For example, if all light novel projects of all languages were listed in Category:Light novel, and all English projects were listed in Category:English, taking the intersection of these two categories is effectively the same as getting: Light Novel (English).
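The idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how the extension works internally; the page titles and the `intersect` helper are made up for the example.

```python
# Hypothetical category membership; the page titles are invented for illustration.
categories = {
    "Light novel": {"Project A", "Project B", "Project C"},
    "English": {"Project A", "Project C", "Project D"},
}


def intersect(categories, *names):
    """Return the sorted pages that belong to every named category."""
    member_sets = [categories[name] for name in names]
    return sorted(set.intersection(*member_sets))


# The two primary categories together stand in for "Light Novel (English)".
light_novel_english = intersect(categories, "Light novel", "English")
print(light_novel_english)  # ['Project A', 'Project C']
```

Adding a third primary category (say, a status category) would narrow the result further without any new combined category having to exist.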

Why is this better?

Fewer categories overall. Simpler organization. Readers get to see what they want and only what they want.

For example, right now I'd like to be able to see all English projects that are stalled. However, the current Category:Inactive Projects lists the inactive projects of every language, all 40 of them, which is not easy to scan. It's the same problem the old Teaser Projects category page had.

I could request that we split that category up into Inactive Projects (English), but do we really need that many more categories? It's unnecessary, and certainly not ideal. Fully populated primary categories are better and more powerful, if intersections between categories can be easily performed.

What are the cons?

Calculating intersections of categories can be a memory-intensive process if the categories are large (say, 1000+ members). Memory-wise, it's an O(n^2) process. Luckily, this extension doesn't search subcats, so it's not as bad as it could be. As such, TLG (by way of the extension's default setting) has limited the number of results that can be shown from an intersection of two categories to somewhere around 200.

However, as Baka-Tsuki is small, this doesn't actually pose a problem until we have more than a couple hundred projects. This memory issue primarily plagues big wikis like Wikipedia, where there can be 10,000+ pages in a category. Fortunately, we're not at that size. We can afford a process that loops through 100 or so pages. If you're skeptical, the intersection you're looking at above is an intersection of a 121-member category and a 40-member category.
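To put a rough number on that, here is a worst-case sketch in Python that compares every page in one category against every page in the other and counts the comparisons. It assumes a naive pairwise approach, which is not necessarily what the extension does; the titles and the size of the overlap are invented, and only the 121 and 40 category sizes come from the figures above.

```python
def naive_intersection(a, b):
    """Compare every page in a against every page in b, counting comparisons."""
    comparisons = 0
    common = []
    for page_a in a:
        for page_b in b:
            comparisons += 1
            if page_a == page_b:
                common.append(page_a)
    return common, comparisons


# Made-up titles, sized like the two categories mentioned above (121 and 40).
light_novels = [f"Project {i}" for i in range(121)]
stalled = [f"Project {i}" for i in range(100, 140)]

common, comparisons = naive_intersection(light_novels, stalled)
print(len(common), comparisons)  # 21 4840
```

Even in this worst case, 121 x 40 is only 4,840 comparisons, which is trivial. The same approach on two 10,000-page categories would need 100,000,000 comparisons, which is why the concern applies to wikis of Wikipedia's size rather than ours.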

This extension is installed and currently in use on several Wikimedia projects: Wikinews, Meta, Wikibooks, and Wikiversity. Does that make you feel better about scalability and usage?

The other con is that it isn't a real category page. We could link users to it and display it like a category page (as I did above), but the default output is a plain list of results.