theHarvester
theHarvester is an open-source OSINT (Open Source Intelligence) tool that gathers information about a target domain from public sources. It is widely used during the reconnaissance phase of penetration testing to collect email addresses, subdomains, hosts, employee names, open ports, and banners from sources such as search engines, PGP key servers, and the Shodan database.
Here are the primary uses of theHarvester:
- Email Address Enumeration: theHarvester excels at discovering email addresses associated with a target domain by querying search engines, social media platforms, and other public sources. This information is crucial for social engineering attacks, phishing campaigns, and identifying potential targets within an organization.
- Subdomain Discovery: The tool identifies subdomains associated with the target domain by leveraging multiple data sources. This helps security professionals map the complete attack surface and discover hidden or forgotten web applications and services.
- Employee Name Gathering: theHarvester collects employee names and job titles from various public sources, including LinkedIn and other professional networking sites. This information is valuable for crafting targeted social engineering attacks and understanding organizational structure.
- Host and IP Discovery: The tool discovers hosts and IP addresses related to the target domain, providing insights into the organization's network infrastructure and potential entry points for further reconnaissance.
- Virtual Host Enumeration: theHarvester can identify virtual hosts associated with target IP addresses, revealing multiple websites or applications hosted on the same infrastructure.
- OSINT Aggregation: By querying multiple data sources simultaneously, theHarvester aggregates OSINT data from various platforms, providing a comprehensive view of publicly available information about the target organization.
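The aggregation idea can also be approximated outside the tool. The following is a minimal sketch, not theHarvester's internal implementation: it assumes each source's hosts were saved to a hypothetical plain-text file (one hostname per line) and merges them into a single de-duplicated list. The function name merge_hosts and the file names are illustrative only.

```shell
#!/usr/bin/env bash
# merge_hosts FILE...
# Merge per-source host lists (hypothetical files, one hostname per line)
# into one lowercase, de-duplicated, sorted list on stdout.
merge_hosts() {
  cat -- "$@" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[[:space:]]*$//' -e '/^$/d' \
    | sort -u
}

# Example (file names are placeholders):
#   merge_hosts bing_hosts.txt crtsh_hosts.txt > all_hosts.txt
```

Lowercasing before de-duplication matters because hostnames are case-insensitive, so the same host reported with different casing by two sources would otherwise be counted twice.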
Core Features
- Email Address Enumeration
- Subdomain Discovery
- Host and IP Discovery
- Employee Name Collection
- Virtual Host Detection
- Multiple Data Source Support
- DNS Brute-forcing
- DNS Reverse Lookup
- Port Scanning Integration
- Banner Grabbing
- Screenshot Capture
- Report Generation
- API Integration
Data sources
- Bing
- Yahoo
- Baidu
- DuckDuckGo
- Shodan
- Censys
- VirusTotal
- ThreatCrowd
- CrtSh (Certificate Transparency)
- DNSdumpster
- PGP Key Servers
- Hunter.io
- GitHub
- SecurityTrails
Common theHarvester Commands
1. Basic Domain Search
- This command performs a basic search for a target domain using a specified data source. It gathers emails, hosts, and subdomains from the selected source.
theharvester -d <domain> -b <source>
2. Search All Sources
- This command queries all available data sources to gather maximum information about the target domain. It provides comprehensive OSINT coverage.
theharvester -d <domain> -b all
3. Limit Results
- This command limits the number of search results requested from a source (the default is 500). It's useful for faster searches or avoiding rate limiting.
theharvester -d <domain> -b google -l 100
4. DNS Brute Force
- This command enables DNS brute-forcing to discover additional subdomains using a wordlist. It expands subdomain discovery beyond passive sources.
theharvester -d <domain> -b bing -c
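To illustrate what wordlist-based brute-forcing involves, the sketch below expands a wordlist into candidate hostnames for a domain. It is an illustration under stated assumptions rather than theHarvester's own code: dns_candidates and the wordlist path are hypothetical, and the function itself sends no DNS queries. Each candidate would then be handed to a resolver, and only the names that resolve are kept.

```shell
#!/usr/bin/env bash
# dns_candidates DOMAIN WORDLIST
# Print one candidate FQDN per wordlist entry (label.DOMAIN).
# Blank lines and lines starting with '#' are skipped.
dns_candidates() {
  local domain=$1 wordlist=$2 label
  while IFS= read -r label || [ -n "$label" ]; do
    case $label in ''|'#'*) continue ;; esac
    printf '%s.%s\n' "$label" "$domain"
  done < "$wordlist"
}

# Example (wordlist path is a placeholder; resolving is left to the caller):
#   dns_candidates example.com subdomains.txt | while read -r host; do
#     getent hosts "$host" >/dev/null && echo "$host"
#   done
```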
5. DNS Reverse Lookup
- This command performs reverse DNS lookups on discovered IP addresses to find additional hostnames and domains hosted on the same infrastructure.
theharvester -d <domain> -b google -n
6. Virtual Host Verification
- This command verifies discovered host names via DNS resolution and searches for virtual hosts on the resolved IP addresses, which helps filter out false positives.
theharvester -d <domain> -b all -v
7. Shodan Integration
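- This command queries Shodan for the hosts discovered by the selected source, gathering information about open ports and services. Shodan is enabled with the -s flag rather than selected as a -b source, and it requires a Shodan API key.
theharvester -d <domain> -b <source> -s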
8. Take Screenshots
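- This command captures screenshots of resolved hosts, providing visual confirmation of active web services. Recent releases use the --screenshot option, which takes an output directory; -s is the Shodan flag.
theharvester -d <domain> -b all --screenshot <output_directory>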
9. Port Scanning
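- This command applies to older 3.x releases, where -p scanned discovered hosts on a handful of common ports to identify running services. In 4.x releases -p instead enables the proxies listed in proxies.yaml, so confirm its meaning with theharvester -h before relying on it.
theharvester -d <domain> -b all -p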
10. Output to File
- This command saves all gathered information to files named after the given base name. Recent releases write XML and JSON; older releases wrote HTML and XML.
theharvester -d <domain> -b all -f output
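A saved report can be post-processed with standard tools. The sketch below assumes a release that writes output.json with top-level emails and hosts arrays (inspect your own file first, since the layout may differ between versions) and that jq is installed. report_field is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# report_field FILE KEY
# Print each entry of a top-level array (for example "emails" or "hosts")
# from a JSON report, sorted and de-duplicated. The key names are an
# assumption about the report layout; a missing key prints nothing.
report_field() {
  local file=$1 key=$2
  jq -r --arg k "$key" '.[$k][]?' "$file" | sort -u
}

# Example:
#   report_field output.json emails
#   report_field output.json hosts
```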
11. HTML Report
- Older releases generated an HTML report when -f was given, providing a formatted and easily shareable output; recent releases write XML and JSON instead. Do not append -h, which prints the help menu and exits.
theharvester -d <domain> -b all -f report
12. Use Proxy
- This command routes requests through the proxies listed in proxies.yaml, useful for bypassing geographic restrictions or maintaining anonymity. In recent releases the proxy flag is -p; -e selects a DNS server, not a proxy.
theharvester -d <domain> -b google -p
13. DNS Server
- This command specifies a custom DNS server for lookups instead of using the system default resolver. The flag is -e; -r controls DNS resolution of discovered subdomains.
theharvester -d <domain> -b bing -e <dns_server>
14. Start at Result Number
- This command starts gathering results from a specific result number, useful for resuming interrupted searches or paginating. The flag is an uppercase -S (default 0); lowercase -s has a different meaning.
theharvester -d <domain> -b google -l 500 -S 100
15. Timeout Configuration
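- Recent theHarvester releases do not appear to offer a request-timeout flag (in 4.x, -t enables subdomain takeover checks). To keep a run from hanging on slow connections, one option is to cap its total run time with the coreutils timeout command.
timeout 5m theharvester -d <domain> -b all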
16. API Key Configuration
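- Data sources that require authentication read their API keys from the api-keys.yaml configuration file rather than from a command-line flag. Once the Hunter.io key is in place, select the source as usual to get access to premium features and higher rate limits.
theharvester -d <domain> -b hunter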
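Recent releases read API keys from an api-keys.yaml file. The sketch below writes a minimal file containing only a Hunter.io key. The default directory (~/.theHarvester), the apikeys / hunter / key layout, and the write_hunter_key helper are assumptions modeled on the sample file shipped with the project; compare against the api-keys.yaml in your own installation before using it.

```shell
#!/usr/bin/env bash
# write_hunter_key API_KEY [CONFIG_DIR]
# Create CONFIG_DIR/api-keys.yaml containing only a Hunter.io key.
# Refuses to overwrite an existing file so other keys are not lost.
write_hunter_key() {
  local key=$1 dir=${2:-"$HOME/.theHarvester"}
  local file="$dir/api-keys.yaml"
  if [ -e "$file" ]; then
    echo "refusing to overwrite $file" >&2
    return 1
  fi
  mkdir -p "$dir" || return 1
  # Assumed layout: apikeys -> <source name> -> key
  printf 'apikeys:\n  hunter:\n    key: %s\n' "$key" > "$file" || return 1
  chmod 600 "$file"   # keep the key readable by the owner only
}

# Example (placeholder key):
#   write_hunter_key "<api_key>" && theharvester -d <domain> -b hunter
```

Keeping the key in a file with restrictive permissions also keeps it out of shell history and process listings, which a command-line argument would not.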
17. Filter Subdomains
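- This command resolves discovered subdomains via DNS so that names that don't resolve can be dropped, cleaning up the results list. Recent releases expose this as -r (--dns-resolve), optionally followed by a resolver list.
theharvester -d <domain> -b all -r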
18. Help and Usage Information
- This command displays the help menu and usage information for theHarvester, listing all available options and data sources.
theharvester -h
Alternative usage:
theharvester --help
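Several of these options can be combined in a small wrapper. The following sketch runs one source at a time and saves a separate report per source, using only the -d, -b, -l, and -f options. The harvest_sources function, the HARVESTER_BIN and DRY_RUN variables, and the report naming scheme are all hypothetical; with DRY_RUN=1 the commands are printed instead of executed so they can be reviewed first.

```shell
#!/usr/bin/env bash
# harvest_sources DOMAIN LIMIT SOURCE...
# Run theHarvester once per source, writing reports named report_<source>.
# Set DRY_RUN=1 to print the commands without running them.
harvest_sources() {
  local domain=$1 limit=$2 src
  local bin=${HARVESTER_BIN:-theharvester}
  shift 2
  for src in "$@"; do
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "$bin -d $domain -b $src -l $limit -f report_$src"
    else
      "$bin" -d "$domain" -b "$src" -l "$limit" -f "report_$src" \
        || echo "source $src failed, continuing" >&2
    fi
  done
}

# Example: preview the commands for two sources
#   DRY_RUN=1 harvest_sources example.com 100 bing crtsh
```

Running sources one by one makes it easier to see which source failed or was rate limited than a single run with -b all.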
Output Examples of theHarvester Commands
| Command | Example Usage | Function | Illustrative Output (format varies by version) |
|---|---|---|---|
| Basic Search | theharvester -d example.com -b google | Searches Google for domain information. | [*] Searching Google... [*] Emails found: 5 [*] Hosts found: 12 |
| Search All Sources | theharvester -d example.com -b all | Queries all available sources. | [*] Starting with Google... [*] Starting with Bing... [*] Total: 45 emails, 67 hosts |
| Limit Results | theharvester -d example.com -b google -l 100 | Limits to 100 results. | [*] Searching Google (limit: 100) [*] Results: 100/100 |
| Email Discovery | theharvester -d example.com -b all | Discovers email addresses. | john.doe@example.com admin@example.com support@example.com |
| Subdomain Discovery | theharvester -d example.com -b all | Finds subdomains. | www.example.com mail.example.com dev.example.com |
| Host Discovery | theharvester -d example.com -b all | Discovers hosts and IPs. | 192.168.1.1 - example.com 192.168.1.2 - mail.example.com |
| DNS Brute Force | theharvester -d example.com -b bing -c | Performs DNS brute-forcing. | [*] Performing DNS brute force... [+] Found: admin.example.com |
| DNS Reverse Lookup | theharvester -d example.com -b google -n | Performs reverse lookups. | [*] Reverse lookup on 192.168.1.1 [+] Found: server.example.com |
| Virtual Host Check | theharvester -d example.com -b all -v | Verifies virtual hosts. | [*] Checking virtual hosts... [+] Active: www.example.com (200 OK) |
| Shodan Integration | theharvester -d example.com -b bing -s | Queries Shodan for discovered hosts (requires a Shodan API key). | [*] Shodan search... [+] Open ports: 80, 443, 22 |
| Take Screenshots | theharvester -d example.com -b all --screenshot screenshots | Captures screenshots into the given directory. | [*] Taking screenshots... [+] Screenshot saved: www_example_com.png |
| Port Scanning | theharvester -d example.com -b all -p | Scans discovered hosts in 3.x releases; in 4.x, -p enables proxies instead. | [*] Scanning 192.168.1.1 [+] Open: 80/tcp, 443/tcp, 22/tcp |
| HTML Report | theharvester -d example.com -b all -f report | Generates an HTML report in older releases; recent releases write XML and JSON. | [*] Report saved to report.html |
| XML Output | theharvester -d example.com -b all -f output.xml | Saves in XML format. | [*] Results saved to output.xml |
| JSON Output | theharvester -d example.com -b all -f output.json | Exports to JSON format. | [*] Results saved to output.json |
| LinkedIn Search | theharvester -d example.com -b linkedin | Searches LinkedIn for employees. | [*] LinkedIn search... [+] John Doe - CEO [+] Jane Smith - CTO |
| Twitter Search | theharvester -d example.com -b twitter | Searches Twitter mentions. | [*] Twitter search... [+] @example_official |
| Certificate Search | theharvester -d example.com -b crtsh | Searches certificate transparency logs. | [*] Certificate Transparency search... [+] Found: *.example.com |
| Use Proxy | theharvester -d example.com -b google -p | Routes requests through the proxies listed in proxies.yaml. | [*] Using proxy: http://proxy:8080 |
| Custom DNS | theharvester -d example.com -b bing -e 8.8.8.8 | Uses Google DNS. | [*] Using DNS server: 8.8.8.8 |
| Start Position | theharvester -d example.com -b google -l 500 -S 100 | Starts from result 100. | [*] Starting from result: 100 |
| Timeout Setting | timeout 5m theharvester -d example.com -b all | Stops the run after 5 minutes (no built-in timeout flag). | (no additional output) |
| API Key Usage | theharvester -d example.com -b hunter | Uses the Hunter.io key from api-keys.yaml. | [*] Using Hunter.io API [+] API credits remaining: 50 |
| VirusTotal Search | theharvester -d example.com -b virustotal | Queries VirusTotal database. | [*] VirusTotal search... [+] Subdomains found: 15 |
| GitHub Search | theharvester -d example.com -b github-code | Searches code on GitHub (requires a GitHub token). | [*] GitHub search... [+] Repository: example/project |
| Summary Output | theharvester -d example.com -b all | Shows search summary. | [*] Summary: Emails: 25 Hosts: 50 IPs: 15 |