The proposed mechanism is as follows:
1. Initial seeding:
--1. All uploads are stopped for one day, and a MySQL DB is created to store all the data.
--2. Data is pulled through the parser API (for example, http://btapi-shadowys.rhcloud.com/; hopefully this API can be merged into the baka-tsuki domain before this starts, so that it can handle the load).
--3. The data is then passed to another application that downloads and sorts all the text and images for each volume and chapter.
--4. Once sorting is done, the same application updates the DB and records the time at which the update completed.
2. Continuous integration:
--1. The app will check MediaWiki every minute (for example, using the above API with the /update route). If there are changes, the app will start a new process that reads from the MediaWiki site and updates the DB again.
--2. Once a month the app will run a full-site check and update the DB where necessary.
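
To make the two phases above a bit more concrete, here is a rough sketch in Python. It is only an illustration under several assumptions: sqlite3 stands in for MySQL so the sketch runs without a server, the table and function names (`pages`, `sync_log`, `store_pages`, `poll_once`) are made up for this example, and the calls to the parser API are passed in as plain functions (`fetch_page`, `changed_since`) because the API's response format is not specified here.

```python
import sqlite3
import time

# ASSUMPTION: sqlite3 is used instead of MySQL so this sketch is self-contained.
# The table layout below is hypothetical.
SCHEMA = """
CREATE TABLE IF NOT EXISTS pages (
    title      TEXT PRIMARY KEY,
    content    TEXT NOT NULL,
    updated_at REAL NOT NULL
);
CREATE TABLE IF NOT EXISTS sync_log (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    kind         TEXT NOT NULL,
    completed_at REAL NOT NULL
);
"""


def open_db(path=":memory:"):
    """Create the DB (step 1.1) and return a connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def store_pages(conn, titles, fetch_page, kind):
    """Fetch and store each page, then record the completion time (step 1.4).

    fetch_page is a hypothetical callable wrapping the parser API:
    it takes a title and returns the page content as a string.
    """
    count = 0
    for title in titles:
        conn.execute(
            "INSERT OR REPLACE INTO pages (title, content, updated_at) "
            "VALUES (?, ?, ?)",
            (title, fetch_page(title), time.time()),
        )
        count += 1
    conn.execute(
        "INSERT INTO sync_log (kind, completed_at) VALUES (?, ?)",
        (kind, time.time()),
    )
    conn.commit()
    return count


def last_sync(conn):
    """Return the timestamp of the most recent completed update, or None."""
    return conn.execute("SELECT MAX(completed_at) FROM sync_log").fetchone()[0]


def poll_once(conn, changed_since, fetch_page):
    """One monitoring pass (step 2.1): update only the pages that changed.

    changed_since is a hypothetical callable wrapping the /update route:
    it takes the last sync timestamp and returns the changed titles.
    """
    titles = list(changed_since(last_sync(conn)))
    if not titles:
        return 0
    return store_pages(conn, titles, fetch_page, "incremental")
```

The initial seeding would then be a single `store_pages(conn, all_titles, fetch_page, "seed")` call, a scheduler (cron or a simple loop with `time.sleep(60)`) would call `poll_once` every minute, and the monthly full-site check could reuse `store_pages` with the full title list and `kind="full"`.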
Once this is done we could probably move to using markdown (or orgmode
