Tuesday, June 18, 2013

Where Did Wiki Scanner Go?

In 2007, a guy named Virgil Griffith created a scandalous project called WikiScanner. All it did was look at anonymous Wikipedia edits coming from certain IP ranges to figure out which companies were editing their own Wikipedia pages (the IP addresses were presumably cross-referenced against whois records or something similar).

It doesn't seem to exist anymore; I think it got taken down. All I found is a low-quality copycat that spits out raw, possibly injectable SQL debugging statements:

Table 'wikiscanneres_wiki.org3' doesn't exist
SELECT name, ip_from from org3 where '2433878598' between ip_from and ip_to order by ip_from DESC limit 1

Yikes, TMI, error code...
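
For what it's worth, the leaked query hints at how the lookup works: the quoted number looks like an IPv4 address packed into a single integer, compared against a table of ip_from/ip_to ranges. Here is a minimal sketch of that idea in Python. The ORG_RANGES table and the function names are made up for illustration (the ranges are reserved documentation addresses), and this is a guess at the approach, not the copycat's actual code.

```python
import ipaddress

# Hypothetical stand-in for the org3 table: (ip_from, ip_to, name).
# The ranges below are reserved documentation blocks, not real organizations.
ORG_RANGES = [
    (int(ipaddress.IPv4Address("192.0.2.0")),
     int(ipaddress.IPv4Address("192.0.2.255")), "Example Org A"),
    (int(ipaddress.IPv4Address("198.51.100.0")),
     int(ipaddress.IPv4Address("198.51.100.255")), "Example Org B"),
]


def ip_to_int(ip):
    """Pack a dotted-quad IPv4 string into a single integer."""
    return int(ipaddress.IPv4Address(ip))


def int_to_ip(n):
    """Unpack an integer back into dotted-quad form."""
    return str(ipaddress.IPv4Address(n))


def lookup_org(ip):
    """Mimic: WHERE n BETWEEN ip_from AND ip_to ORDER BY ip_from DESC LIMIT 1."""
    n = ip_to_int(ip)
    matches = [(lo, name) for lo, hi, name in ORG_RANGES if lo <= n <= hi]
    if not matches:
        return None
    # Highest ip_from wins, like ORDER BY ip_from DESC LIMIT 1.
    return max(matches, key=lambda m: m[0])[1]
```

Calling int_to_ip(2433878598) unpacks the number from the error message into 145.18.10.70, and lookup_org("192.0.2.17") returns the made-up "Example Org A".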

Anyway, this is easy to replicate yourself using the Requests library in Python. Grab a sample IP address from the network you'd like to know more about, and take a walk around the "block":

(where xxx.xx.xx.xxx is the IP)

import requests

# Walk the last octet of the block, 0 through 255
for i in range(256):
    r = requests.get('http://en.wikipedia.org/wiki/Special:Contributions/xxx.xx.xx.' + str(i))

Refine the crawler to look for related articles, using keywords that are relevant to your interests.
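
One way that refinement might look, as a rough sketch: instead of scraping the HTML of Special:Contributions, ask the MediaWiki API for each address's contributions (list=usercontribs) and keep only the edits whose article title or edit summary mentions one of your keywords. The function names, the User-Agent string, and the sample keyword are all my own placeholders. I've used the standard library's urllib here so the sketch runs without extra installs; requests.get with a params dict would do the same job.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def block_addresses(prefix):
    """Return the 256 addresses sharing the first three octets, e.g. '192.0.2'."""
    return [f"{prefix}.{i}" for i in range(256)]


def fetch_contribs(ip, limit=50):
    """Fetch recent anonymous edits for one IP via list=usercontribs."""
    params = urllib.parse.urlencode({
        "action": "query",
        "list": "usercontribs",
        "ucuser": ip,
        "uclimit": limit,
        "ucprop": "title|timestamp|comment",
        "format": "json",
    })
    req = urllib.request.Request(
        API_URL + "?" + params,
        # Placeholder User-Agent; identify your own script here.
        headers={"User-Agent": "block-walker-sketch/0.1"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        data = json.load(resp)
    return data.get("query", {}).get("usercontribs", [])


def filter_by_keywords(contribs, keywords):
    """Keep edits whose title or edit summary contains any keyword."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for edit in contribs:
        text = (edit.get("title", "") + " " + edit.get("comment", "")).lower()
        if any(k in text for k in wanted):
            hits.append(edit)
    return hits


def walk_block(prefix, keywords):
    """Crawl the whole block and yield (ip, edit) pairs that match."""
    for ip in block_addresses(prefix):
        for edit in filter_by_keywords(fetch_contribs(ip), keywords):
            yield ip, edit
```

Something like list(walk_block("xxx.xx.xx", ["acme"])) would then collect the matching edits for the block. Add a short pause between requests if you crawl more than a block or two.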
