How To Run A Query To Find A String In Blob Files?
MediaWiki has a table named 'text' in its database that contains the page content. The content is stored as a BLOB. I would like to run a query that searches through all the text on the site.
Solution 1:
The MediaWiki markup text is stored in the old_text field, which is of type mediumblob. You can query it like any other text-based field; MySQL will cast your search string to binary for the comparison. Note that this makes it a case-sensitive search!
SELECT old_id FROM text WHERE old_text LIKE '%string%';
If you need a case-insensitive search, convert the column to a character set that has a case-insensitive collation:
SELECT old_id FROM text WHERE CONVERT(old_text USING latin1) LIKE '%STRing%';
Be aware that a LIKE pattern with a leading wildcard cannot use an index, so both queries scan the whole table and will take a long time unless the table is small.
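The difference between the two queries can be reproduced outside the database. The following Python sketch is only an analogy, not how MySQL implements the comparison: the helper names binary_contains and collated_contains and the sample page are made up for illustration. The first helper compares raw bytes, as a LIKE on a BLOB column does; the second decodes the bytes with a character set first and then ignores case, roughly what CONVERT(... USING latin1) achieves with a case-insensitive collation.

```python
def binary_contains(blob: bytes, needle: str) -> bool:
    """Byte-wise match, analogous to LIKE on a BLOB column: case-sensitive."""
    return needle.encode("latin-1") in blob


def collated_contains(blob: bytes, needle: str, charset: str = "latin-1") -> bool:
    """Decode with a character set first, then compare while ignoring case."""
    return needle.casefold() in blob.decode(charset).casefold()


# Hypothetical page content, stored as raw bytes like a BLOB value.
page = b"A page that mentions a String once."

print(binary_contains(page, "string"))    # False: the bytes differ in case
print(binary_contains(page, "String"))    # True: exact byte match
print(collated_contains(page, "STRing"))  # True: case is ignored after decoding
```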
Solution 2:
According to the MediaWiki documentation, the text table stores only the text of a revision. Hence, to get at the complete text, all revisions corresponding to a page need to be processed. It is better to call the MediaWiki search API and process its results than to search with a SQL query.
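A minimal sketch of such an API call, using only the Python standard library, might look like this. It relies on the Action API's list=search module; the helper names build_search_url, extract_titles and search_wiki are made up for illustration, and the api.php address in the usage note is a placeholder you would replace with your own wiki's endpoint.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(api_url, term, limit=50):
    """Build an Action API full-text search URL (action=query, list=search)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srlimit": limit,
        "format": "json",
    }
    return api_url + "?" + urllib.parse.urlencode(params)


def extract_titles(response):
    """Return the page titles from a decoded list=search response."""
    return [hit["title"] for hit in response.get("query", {}).get("search", [])]


def search_wiki(api_url, term, limit=50):
    """Run the search over HTTP and return the matching page titles."""
    with urllib.request.urlopen(build_search_url(api_url, term, limit)) as resp:
        return extract_titles(json.load(resp))
```

Calling search_wiki("https://wiki.example.org/w/api.php", "string") would then return the titles of matching pages, assuming the wiki exposes its API at that (hypothetical) address. The search goes through the wiki's own search index, so it covers the current page text rather than every stored revision.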