Advanced Pentest & Offensive
OSINT — passive information gathering: Shodan, LinkedIn, WHOIS, Google dorks
OSINT (Open Source Intelligence) is the collection of information from publicly available sources. In pentesting, it is the lowest-risk phase: most of it relies on third-party data, so you can learn about the target without sending a single packet to its infrastructure.
WHOIS and DNS records
# Domain information
whois example.com
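# DNS record lookups
host -t mx example.com
dig example.com ANY # many servers refuse or minimize ANY (RFC 8482); query A, MX, NS, TXT individually if so
dig axfr @ns1.example.com example.com # attempt zone transfer (active: this contacts the target's name server)
# Subdomain enumeration from passive data sources
subfinder -d example.com
amass enum -d example.com # add -passive to avoid active resolution (flag support depends on the amass version)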
Relevant data: registrant, emails, name servers, associated IPs, expiration dates.
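Raw WHOIS output is free-form text, so it helps to normalize it before comparing domains. The following is a minimal sketch, assuming the common "Key: value" layout; real registries differ in field names and comment markers, and both the parse_whois helper and the SAMPLE text are hypothetical.

```python
from collections import defaultdict


def parse_whois(text: str) -> dict[str, list[str]]:
    """Collect 'Key: value' lines from raw WHOIS output into a dict of lists."""
    fields: dict[str, list[str]] = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and the comment / legal-notice markers many registries use.
        if not line or line.startswith(("%", "#", ">>>")):
            continue
        # Split on the first colon only, so timestamps and URLs stay intact.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip().lower()].append(value.strip())
    return dict(fields)


# Hypothetical sample; in practice this would be the output of `whois example.com`.
SAMPLE = """\
% Terms of use apply
Domain Name: EXAMPLE.COM
Registry Expiry Date: 2030-01-01T00:00:00Z
Name Server: NS1.EXAMPLE.COM
Name Server: NS2.EXAMPLE.COM
"""

record = parse_whois(SAMPLE)
print(record["name server"])
print(record["registry expiry date"])
```

Repeated keys such as Name Server are kept as lists, so no value is silently overwritten.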
TLS certificates — crt.sh
Issued TLS certificates are recorded in public Certificate Transparency logs, and the names they cover reveal subdomains:
# Via browser
https://crt.sh/?q=%25.example.com
# Via curl (JSON)
curl -s "https://crt.sh/?q=%25.example.com&output=json" \
| jq -r '.[].name_value' | sort -u
Typical output (-r prints raw strings, so entries that hold several names are split one per line):
api.example.com
admin.example.com
staging.example.com
vpn.example.com
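The crt.sh JSON often needs more cleanup than sort -u provides: a single name_value can hold several newline-separated names, wildcard entries, or even email addresses. The sketch below shows one way to normalize it, assuming the response has already been saved as a JSON string; extract_hostnames and the SAMPLE data are hypothetical.

```python
import json


def extract_hostnames(crtsh_json: str, domain: str) -> list[str]:
    """Return the unique, lowercased hostnames under `domain` from crt.sh JSON."""
    names = set()
    for entry in json.loads(crtsh_json):
        # One certificate may list several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # a wildcard still proves the parent name exists
            # Keep only names inside the target domain (this drops emails too).
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)


# Hypothetical sample mimicking the shape of a crt.sh response.
SAMPLE = json.dumps([
    {"name_value": "api.example.com\n*.example.com"},
    {"name_value": "API.example.com"},
    {"name_value": "staging.example.com"},
    {"name_value": "user@example.com"},
])

print(extract_hostnames(SAMPLE, "example.com"))
```

Filtering by suffix also discards unrelated names that share a certificate with the target, which is common on shared hosting and CDNs.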
Shodan — internet-exposed assets
Shodan continuously scans the internet and indexes the service banners of exposed hosts.
Useful search filters:
org:"Example Corp" → hosts on IP ranges registered to the organization
hostname:example.com → subdomains with running services
ssl.cert.subject.cn:example.com → filter by TLS certificate
port:22 org:"Example Corp" → exposed SSH
http.title:"GitLab" → public GitLab instances
product:"Apache httpd" version:"2.4.49" → specific vulnerable version (2.4.49 is affected by CVE-2021-41773)
CLI:
shodan search --fields ip_str,port,org "hostname:example.com"
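Search results are easier to triage once they are grouped by host. This is a minimal sketch, assuming the CLI prints the requested fields as tab-separated columns in the order ip_str, port, org; ports_by_host and the sample rows are hypothetical.

```python
from collections import defaultdict


def ports_by_host(tsv: str) -> dict[str, list[int]]:
    """Group tab-separated 'ip<TAB>port<TAB>org' rows into ip -> sorted ports."""
    hosts: dict[str, set[int]] = defaultdict(set)
    for line in tsv.splitlines():
        cols = line.split("\t")
        # Ignore blank or malformed rows instead of failing on them.
        if len(cols) < 2 or not cols[1].strip().isdigit():
            continue
        hosts[cols[0].strip()].add(int(cols[1].strip()))
    return {ip: sorted(ports) for ip, ports in hosts.items()}


# Hypothetical rows using documentation-only IP addresses (RFC 5737).
SAMPLE = (
    "203.0.113.10\t443\tExample Corp\n"
    "203.0.113.10\t22\tExample Corp\n"
    "203.0.113.20\t80\tExample Corp\n"
)

for ip, ports in ports_by_host(SAMPLE).items():
    print(ip, ports)
```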
Google Dorks
Advanced search operators to find sensitive files and pages:
Configuration files:
site:example.com filetype:env
site:example.com filetype:xml "password"
site:example.com filetype:sql
Admin panels:
site:example.com inurl:admin
site:example.com intitle:"phpMyAdmin"
site:example.com inurl:wp-admin
Exposed information:
site:example.com "Index of /"
site:example.com ext:log
site:example.com "DB_PASSWORD"
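Cached and older versions:
cache:example.com/admin # Google retired this operator in 2024; use the Wayback Machine instead
https://web.archive.org/web/*/example.com/admin*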
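The dorks above follow a fixed pattern, so they can be generated per target rather than retyped. A small sketch, assuming the queries are pasted into a search engine by hand; DORK_TEMPLATES and build_dorks are hypothetical names, and the templates are taken from the lists above.

```python
# Templates grouped by purpose; {d} is replaced with the target domain.
DORK_TEMPLATES = {
    "config": [
        "site:{d} filetype:env",
        'site:{d} filetype:xml "password"',
        "site:{d} filetype:sql",
    ],
    "admin": [
        "site:{d} inurl:admin",
        'site:{d} intitle:"phpMyAdmin"',
        "site:{d} inurl:wp-admin",
    ],
    "exposed": [
        'site:{d} "Index of /"',
        "site:{d} ext:log",
        'site:{d} "DB_PASSWORD"',
    ],
}


def build_dorks(domain: str) -> dict[str, list[str]]:
    """Fill every template with the given domain, keeping the categories."""
    return {
        category: [template.format(d=domain) for template in templates]
        for category, templates in DORK_TEMPLATES.items()
    }


for category, queries in build_dorks("example.com").items():
    print(category)
    for query in queries:
        print("  " + query)
```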
LinkedIn and human sources
What to look for:
- Technology stack (job postings reveal versions in use)
Example: a posting for a "Node.js 14 + Kubernetes 1.21 + AWS RDS" position
- IT staff names → social engineering targets
- Vendors and partners → supply chain attack surface
- Recently laid-off employees → potential insider risk
Tools:
theHarvester -d example.com -b linkedin
hunter.io → corporate emails by domain
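Version hints in job postings can be pulled out mechanically once the text has been collected. The sketch below is one possible approach, assuming a short list of product names to look for; the DEFAULT_PRODUCTS list and extract_versions are hypothetical and would need extending for a real engagement.

```python
import re

# Hypothetical starter list; extend it with whatever the target is likely to run.
DEFAULT_PRODUCTS = ("Node.js", "Kubernetes", "PostgreSQL", "Python", "Java")


def extract_versions(text: str, products=DEFAULT_PRODUCTS) -> dict[str, str]:
    """Find 'Product 1.2.3'-style mentions and return product -> version."""
    names = "|".join(re.escape(name) for name in products)
    # A product name, whitespace, an optional 'v', then a dotted version number.
    pattern = re.compile(r"\b(" + names + r")\s+v?(\d+(?:\.\d+)*)")
    return {match.group(1): match.group(2) for match in pattern.finditer(text)}


posting = "Node.js 14 + Kubernetes 1.21 + AWS RDS position"
print(extract_versions(posting))
```

Products mentioned without a version, such as AWS RDS here, are deliberately skipped: the goal is version numbers that can later be matched against known vulnerabilities.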
Pastebin and leak sites
Search for leaked credentials:
site:pastebin.com "example.com"
site:github.com "example.com" "password"
Specialized services:
haveibeenpwned.com → emails found in known breaches
dehashed.com → search across multiple dumps
intelx.io → pastes, dark web, emails
Metadata analysis
Public documents (PDF, DOCX, XLSX) contain metadata:
exiftool document.pdf
Reveals:
Author: john.smith
Creator: Microsoft Word 2016
Producer: GPL Ghostscript 9.18
ModifyDate: 2024:03:15 14:22:10
Company: Example Corp
→ internal usernames, software versions, working hours
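When many documents are involved, the interesting tags can be collected automatically. The following is a minimal sketch, assuming exiftool's default "Tag : value" text output; the tag groupings in USER_TAGS and SOFTWARE_TAGS are an assumption and summarize_metadata is a hypothetical helper.

```python
# Assumed groupings: which tags tend to hold people and which hold software.
USER_TAGS = {"author", "last modified by"}
SOFTWARE_TAGS = {"creator", "producer", "creator tool", "software"}


def summarize_metadata(exif_text: str) -> dict[str, list[str]]:
    """Pick usernames and software versions out of exiftool text output."""
    users, software = set(), set()
    for line in exif_text.splitlines():
        # Split on the first colon only: dates such as 2024:03:15 contain more.
        key, sep, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if not sep or not value:
            continue
        if key in USER_TAGS:
            users.add(value)
        elif key in SOFTWARE_TAGS:
            software.add(value)
    return {"users": sorted(users), "software": sorted(software)}


# Sample taken from the output shown above.
SAMPLE = """\
Author: john.smith
Creator: Microsoft Word 2016
Producer: GPL Ghostscript 9.18
ModifyDate: 2024:03:15 14:22:10
Company: Example Corp
"""

print(summarize_metadata(SAMPLE))
```

Running it over every downloaded document and merging the sets gives a quick list of candidate usernames and software versions.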
Automation with recon-ng and theHarvester
# Available -b sources differ between theHarvester releases; check theHarvester -h before running
theHarvester -d example.com -b google,bing,linkedin,shodan -l 200
recon-ng:
marketplace install all
modules load recon/domains-hosts/hackertarget
options set SOURCE example.com
run
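Different tools report overlapping hosts, so their results are usually merged before moving on. A small sketch of that consolidation step, assuming each tool's hostnames have already been exported to a list; merge_findings and the sample data are hypothetical.

```python
def merge_findings(findings: dict[str, list[str]]) -> dict[str, list[str]]:
    """Merge per-tool hostname lists into hostname -> tools that reported it."""
    merged: dict[str, set[str]] = {}
    for source, hosts in findings.items():
        for host in hosts:
            # Normalize case and the trailing dot of fully qualified names.
            host = host.strip().lower().rstrip(".")
            if host:
                merged.setdefault(host, set()).add(source)
    return {host: sorted(sources) for host, sources in sorted(merged.items())}


# Hypothetical results exported from two tools.
findings = {
    "theHarvester": ["api.example.com", "VPN.example.com."],
    "recon-ng": ["vpn.example.com"],
}

for host, sources in merge_findings(findings).items():
    print(host, sources)
```

Hosts confirmed by more than one source are generally the most reliable starting points for later phases.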
Solid OSINT saves hours of active scanning — and sometimes reveals the entry vector before any active tool does.