MediaWiki/reverse engineering

Ok, enough of trying to find the right queries after the upgrade to 1.5 mediawiki.

Lets just log the sql queries from the server.

get page_id
SELECT page_id FROM `page`  WHERE page_namespace = '0' AND page_title = 'Ogg'  LIMIT 1;

get page_latest
SELECT page_id,page_namespace,page_title,page_restrictions,page_counter,page_is_redirect,page_is_new,page_random,page_touched,page_latest,page_len FROM `page`  WHERE page_id = '17369'  LIMIT 1;

after having page_latest (32808) ,which is = rev_id in revision we can:

get revision_text_id
SELECT rev_text_id from revision where rev_id=32808;

after having rev_text_id (32770), which is = old_id, we can finally:

get the actual content
SELECT old_text,old_flags FROM `text`  WHERE old_id = '32770';

brion: old_id=rev_text_id brion: rev_id=page_latest brion: rev_page=page_id

Searching for content
To seach for content in pages and return page titles that match, we would have to revert the process:

get old_id of matching page
select old_id from text where old_text LIKE "%vollmilch%" limit 1; ++ ++ ++
 * old_id |
 * 32778 |

get rev_page from old_id
select rev_page from revision where rev_text_id=32778; +--+ +--+ +--+
 * rev_page |
 * 10425 |

result / page_title
select page_title from page where page_id=10425; ++ ++ ++
 * page_title |
 * Sandbox   |

Multiple search results
select old_id from text where old_text LIKE "%finn%" LIMIT 5; ++ ++ ++
 * old_id |
 * 19561 |
 * 12621 |
 * 12622 |
 * 12682 |
 * 12683 |

select distinct rev_page from revision where rev_text_id IN (19561,12621,12622,12682,12683); +--+ +--+ +--+
 * rev_page |
 * 9559 |
 * 9609 |
 * 13622 |

select page_title from page where page_id IN (9559,9609,13622); ++ ++ ++
 * page_title    |
 * Debian/Posters |
 * Donations     |
 * Forum         |

...OR, one could as well just use the searchindex table:

select si_page,si_title from searchindex where si_text LIKE "%Vorbis%"; +-+---+ +-+---+ +-+---+
 * si_page | si_title |
 * 9746 | gnump3d  |
 * 17369 | ogg      |
 * 17368 | zinf     |
 * 17367 | icecast  |
 * 15735 | converter |

but we want the nice page titles, not this lowercase version, so we join the table with page, since si_page = page_id

select page_title from searchindex join page on page_id=si_page where si_text LIKE "%Vorbis%"; ++ ++ ++
 * page_title |
 * Converter |
 * GnuMp3d   |
 * Icecast   |
 * Ogg       |
 * Zinf      |

dddooh, so much easier. ;)

ok, got it
ok,with newer pages i can use rev_id=page_latest and old_id=rev_text_id, but with older pages i have equal values in rev_id and rev_text_id and i get an empty set if i then try to select page_title from page where page_latest=

mutante: rev_page=page_id

page_latest points to the latest rev_id in the page for convenience