How to Find Subdomains & Sensitive Files: Pentest Guide

How to Discover Subdomains and Sensitive Files in an Ethical Hacking Lab

In the world of ethical hacking and penetration testing, the “Information Gathering” phase often determines the success or failure of an engagement. Two of the most rewarding techniques in this phase are Subdomain Enumeration and Directory Discovery. By identifying hidden subdomains and sensitive files, a researcher can expand the target’s attack surface, finding entry points that the web administrator might have forgotten to secure.

This guide will walk you through using Knockpy for subdomain discovery and Dirb for uncovering hidden files and directories within your hacking lab.

The Importance of Subdomain Discovery

A subdomain is a name formed by adding a prefix to a main domain name (e.g., dev.google.com or mail.google.com). While the main domain might be heavily fortified, subdomains often lead to completely different web applications or legacy systems that are less secure.

Why Target Subdomains?

Subdomains are often “gold mines” for ethical hackers for several reasons:

  1. Management Pages: Admins frequently host control panels or database management interfaces (like phpMyAdmin) on subdomains.
  2. Development Environments: Developers use subdomains to test new features. These “experimental” versions often contain unpatched bugs or lack the security headers of the production site.
  3. Privilege Escalation: Gaining access to a weak subdomain can provide a foothold on the server. Once inside, a hacker can attempt to escalate privileges to control the entire web server.
  4. Sensitive Data Leakage: Subdomains may host staging sites containing real user data or configuration files that shouldn’t be public.

How to Use Knockpy for Subdomain Enumeration

Knockpy is a Python-based tool designed to enumerate subdomains on a target domain through a wordlist. It is highly effective because it can perform both passive reconnaissance and active brute-forcing.

How to Download and Install Knockpy

To ensure you are downloading a safe and updated version, go to Google.com and search for “knockpy github.”

Follow these terminal commands to set it up:

  1. Navigate to your Desktop:

    cd Desktop/

  2. Clone the repository:

    git clone https://github.com/santiko/KnockPy.git

  3. Navigate into the folder (git names it KnockPy, after the repository) and check the help menu:

    cd KnockPy
    knockpy --help

Running the Scan (Recon vs. Brute Force)

Knockpy offers two primary ways to find subdomains:

  • --recon: This flag tells Knockpy to perform passive discovery. It is fast, but it relies on external sources, so it might miss hidden internal subdomains.
  • --brute: This option uses a wordlist to actively “guess” subdomain names. It takes longer and makes more requests to the target, but it is much more thorough.

Example Command:
To run a fast reconnaissance scan on a domain like Google:

    knockpy --domain google.com --recon

Within seconds, you will receive a list of discovered subdomains and their corresponding IP addresses.
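
To make the brute-force idea concrete, here is a minimal Python sketch of wordlist-based subdomain guessing. It is not Knockpy's actual code. The function names (candidate_hosts, resolve, brute_force) are our own hypothetical helpers, and the sketch only assumes that a guessed name "exists" when it resolves in DNS.

```python
import socket
from typing import Callable, Dict, Iterable, List, Optional


def candidate_hosts(domain: str, words: Iterable[str]) -> List[str]:
    """Build candidate hostnames such as dev.example.com from a wordlist."""
    hosts: List[str] = []
    for word in words:
        label = word.strip().lower()
        # Skip blank lines and comment lines commonly found in wordlists.
        if not label or label.startswith("#"):
            continue
        host = f"{label}.{domain}"
        if host not in hosts:
            hosts.append(host)
    return hosts


def resolve(host: str) -> Optional[str]:
    """Return the IPv4 address for host, or None if it does not resolve."""
    try:
        return socket.gethostbyname(host)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def brute_force(
    domain: str,
    words: Iterable[str],
    resolver: Callable[[str], Optional[str]] = resolve,
) -> Dict[str, str]:
    """Map every candidate hostname that resolves to its IP address."""
    found: Dict[str, str] = {}
    for host in candidate_hosts(domain, words):
        ip = resolver(host)
        if ip is not None:
            found[host] = ip
    return found
```

Calling brute_force("example.com", ["dev", "mail", "staging"]) would send one DNS lookup per word, which is why the brute-force mode is slower and noisier than passive recon. The resolver parameter is there so you can swap in a fake lookup while experimenting offline. Only run it against domains you are authorized to test.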


Discovering Sensitive Files and Directories

Once you have identified your target domains, the next step is to look “under the hood.” Web applications often contain files and directories that are not linked anywhere on the homepage but are still accessible if you know the URL.

The Lab Environment: Metasploitable & Mutillidae

In our #cybersecurenation lab, we use Metasploitable (see our guide on installing Metasploitable in your ethical hacking lab). If you look at the server structure (located in /var/www), you will find a directory called Mutillidae. This is a deliberately vulnerable web application designed specifically for practicing penetration testing.

Using DIRB for Directory Brute-Forcing

Dirb is a URL content scanner. It works by launching a brute-force attack against a web server using a wordlist and analyzing the HTTP responses.

To learn the full syntax, run:

    man dirb

To scan the Mutillidae application on your Metasploitable machine (replace 192.168.0.85 with your own Metasploitable IP address), use:

    dirb http://192.168.0.85/mutillidae
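
If you are curious what a directory brute-forcer does under the hood, the following Python sketch shows the general idea: join each wordlist entry to the base URL, request it, and keep anything that does not come back as 404. This is a simplified illustration, not Dirb's real implementation. The helper names (build_urls, fetch_status, scan) are hypothetical, and treating "anything but 404" as a hit is our own simplifying assumption.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, List, Optional


def build_urls(base_url: str, words: Iterable[str]) -> List[str]:
    """Join each wordlist entry onto the base URL."""
    base = base_url.rstrip("/")
    urls: List[str] = []
    for word in words:
        entry = word.strip()
        # Skip blank lines and comment lines in the wordlist.
        if not entry or entry.startswith("#"):
            continue
        urls.append(f"{base}/{entry.lstrip('/')}")
    return urls


def fetch_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if the host is unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses still tell us something about the path.
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def scan(
    base_url: str,
    words: Iterable[str],
    status_of: Callable[[str], Optional[int]] = fetch_status,
) -> Dict[str, int]:
    """Return every URL that answered with something other than 404."""
    hits: Dict[str, int] = {}
    for url in build_urls(base_url, words):
        status = status_of(url)
        if status is not None and status != 404:
            hits[url] = status
    return hits
```

In the lab you could call scan("http://192.168.0.85/mutillidae", open("wordlist.txt")), where wordlist.txt is a placeholder for whatever wordlist you use. A 403 response is kept on purpose, since a forbidden path still confirms that something exists there.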

Analyzing Sensitive Files for Exploitation

When Dirb finishes its scan, it will provide a list of URLs. As an ethical hacker, you should look for the following high-value targets:

1. phpinfo.php

This file is an information disclosure vulnerability. It reveals the PHP version, server OS, and loaded modules, which helps you choose the specific exploit needed for the target’s exact environment. In our lab, the scan detected a phpinfo.php file.

2. phpMyAdmin

The Dirb scan found /mutillidae/phpMyAdmin, which is the database login portal. This is a prime target for credential testing.

3. The robots.txt File (The Treasure Map)

robots.txt is a file created by web admins to tell search engines (like Google or Yahoo) which parts of the site not to crawl. Ironically, this serves as a roadmap for hackers. If an admin is trying to hide a directory from Google, it’s probably because it contains something sensitive.

The scan detected robots.txt. Follow these steps to reach the usernames and passwords:

  1. Find robots.txt in the Dirb results.

  2. Copy the link and paste it into your browser. You will see a disallowed path: /passwords.

    http://192.168.0.85/mutillidae/robots.txt

  3. Navigate to the passwords directory, where you will find a file called accounts.txt.

    http://192.168.0.85/mutillidae/passwords/

  4. Open accounts.txt, which reveals usernames and passwords for the site.

    http://192.168.0.85/mutillidae/passwords/accounts.txt
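
Reading robots.txt by eye works for one site, but the same step is easy to script. Below is a small Python sketch that pulls the Disallow entries out of a robots.txt body. The function name disallowed_paths and the SAMPLE text are made up for illustration; the sample is not the actual file served by Mutillidae.

```python
from typing import List

# Hypothetical robots.txt content, used only to demonstrate the parser.
SAMPLE = """User-agent: *
Disallow: /passwords/
Disallow: /config.inc  # keep crawlers away from this
Disallow:
"""


def disallowed_paths(robots_txt: str) -> List[str]:
    """Return the unique, non-empty Disallow values in file order."""
    paths: List[str] = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == "disallow":
            path = value.strip()
            # An empty Disallow means "nothing is blocked", so skip it.
            if path and path not in paths:
                paths.append(path)
    return paths


if __name__ == "__main__":
    for path in disallowed_paths(SAMPLE):
        print(path)
```

Each path it returns is a candidate to open in the browser, exactly as in the manual steps above.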

4. config.inc and Configuration Files

Files like config.inc often contain the hardcoded credentials used to connect the web application to the backend database. Gaining these credentials can lead to a full database breach. Our scan was able to uncover the database host, database user, database password, and database name.

    http://192.168.0.85/mutillidae/config.inc

Pro-Tip: Always save the usernames and passwords you find in a text file. You can use this custom list later to perform a brute-force attack on other parts of the system.
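
One way to follow that tip is to keep separate, de-duplicated lists of usernames and passwords. The Python sketch below assumes you have already collected the credentials as (username, password) pairs. The function names and the file names in the usage note are hypothetical.

```python
from typing import Iterable, List, Tuple


def build_wordlists(pairs: Iterable[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Split credential pairs into unique username and password lists."""
    users: List[str] = []
    passwords: List[str] = []
    for user, password in pairs:
        user, password = user.strip(), password.strip()
        # Keep first-seen order and drop blanks and duplicates.
        if user and user not in users:
            users.append(user)
        if password and password not in passwords:
            passwords.append(password)
    return users, passwords


def save_wordlist(path: str, entries: Iterable[str]) -> None:
    """Write one entry per line to a text file."""
    with open(path, "w", encoding="utf-8") as handle:
        for entry in entries:
            handle.write(entry + "\n")
```

For example, users, passwords = build_wordlists(found_pairs) followed by save_wordlist("users.txt", users) and save_wordlist("passwords.txt", passwords) gives you two files that most brute-force tools can take as input.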


Conclusion

Mastering subdomain discovery with Knockpy and directory scanning with Dirb turns a single URL into a massive landscape of potential vulnerabilities. By identifying what an admin intended to hide, whether it’s an experimental subdomain or a hidden passwords folder, you gain the upper hand in the reconnaissance phase.

Remember, the goal of an ethical hacker is to find these weaknesses so they can be patched before a malicious actor finds them. Practice these techniques in your local lab and keep refining your wordlists for better results.

Did you find any “hidden gems” in your robots.txt scan? Which tool do you prefer for subdomain hunting? Let us know in the comments below!

If you enjoyed this guide, please like and share it with your community to help us grow the #cybersecurenation!
