Apache Tika XXE Vulnerability CVE-2025-66516 Explained
CREATED ON January 09, 2026
A high-severity security flaw has been identified in Apache Tika, the open-source framework widely utilized for document parsing and metadata extraction. Tracked as CVE-2025-66516 with a CVSS score of 8.4, this vulnerability enables XML External Entity (XXE) attacks via malicious XFA files embedded within PDFs.
This disclosure supersedes the previously reported CVE-2025-54988; investigation revealed that the root cause resides within the tika-core library rather than the PDF module alone, effectively expanding the scope of affected packages. Because the vulnerability exists in the core logic, upgrading only the PDF parser module is insufficient to resolve the issue.
Systems processing untrusted files are susceptible to significant risks, including the exfiltration of sensitive local files, server-side request forgery (SSRF), and denial of service (DoS). To remediate this threat, it is essential to upgrade tika-core to version 3.2.2 or later, as earlier versions (1.13 through 3.2.1) remain vulnerable to exploitation.
What Is Apache Tika?
Apache Tika is an open-source framework designed to detect and extract metadata and text from a vast array of file formats. It is capable of processing over a thousand different file types, including common formats like PowerPoint (PPT), Excel (XLS), and Portable Document Format (PDF). Because of its versatility, Tika has become a fundamental component in document processing workflows, content analysis systems, and search indexing engines across various sectors, including finance, media, legal, and government.
What Is an XXE (XML External Entity) Injection Attack?
An XML External Entity (XXE) attack targets applications that parse XML input. The vulnerability arises when a weakly configured XML parser processes input containing a reference to an external entity. The XML 1.0 standard defines entities as storage units; an external entity is one whose content is stored outside the document. External entities are accessed via a declared system identifier, typically a URI, which the XML processor attempts to dereference.
If an attacker includes tainted data in the system identifier and the XML processor dereferences it, the system may disclose confidential information that the application should not expose. Beyond data disclosure, successful XXE attacks can lead to Denial of Service (DoS), Server-Side Request Forgery (SSRF), and port scanning from the perspective of the server. In scenarios where the parser has client-side memory corruption flaws, dereferencing a malicious URI could even result in arbitrary code execution.
Below is a simple example of an XXE payload that discloses the /etc/passwd file of the target server [1]:
<?xml version="1.0" encoding="ISO-8859-1"?>
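A common defense against this class of payload is to configure the XML parser to refuse DOCTYPE declarations entirely, so that no external entity can be declared in the first place. The following is a minimal sketch, assuming the JDK's built-in JAXP parser. The class name, method names, and sample payload string are hypothetical illustrations, and the sketch does not show how Apache Tika configures its own parsers.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Apache Tika.
final class HardenedXmlParsing {

    // Builds a DOM parser factory that rejects any DOCTYPE declaration.
    // Assumes the JDK's built-in (Xerces-derived) JAXP implementation.
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    // Returns true when the hardened parser refuses to parse the document.
    static boolean isRejected(String xml) {
        try {
            DocumentBuilder builder = hardenedFactory().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            // Raised for the forbidden DOCTYPE (and for any malformed XML).
            return true;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Illustrative payload in the style described above (an assumption,
        // not copied from the article).
        String payload = "<?xml version=\"1.0\"?>"
                + "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]>"
                + "<foo>&xxe;</foo>";
        System.out.println("payload rejected: " + isRejected(payload));
        System.out.println("plain XML rejected: " + isRejected("<foo>ok</foo>"));
    }
}
```

Rejecting the DOCTYPE outright is stricter than disabling only external entities, but it is simpler to reason about when the application has no legitimate need for DTDs.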
How Does the CVE-2025-66516 Exploit Work?
CVE-2025-66516 is a high-severity vulnerability that has been identified in Apache Tika. This flaw carries a CVSS score of 8.4 and enables attackers to execute XML External Entity (XXE) attacks by embedding a maliciously crafted XFA file inside a PDF [2]. If processed by a vulnerable version of Tika, this can lead to unauthorized information disclosure or denial of service.
Below is a generic example of a request that delivers the XXE payload to exploit this vulnerability:
PUT /<Tika Endpoint> HTTP/1.1
After the attacker sends this request, the Apache Tika server receives the PDF and initiates the parsing process. It identifies the embedded XFA form data and passes the XML stream to the underlying parser.
The XML parser processes the malicious <!DOCTYPE> declaration and resolves the external entity &xxe;, reading the contents of the targeted local file (e.g., /etc/passwd) directly from the server's filesystem.
The sensitive file content is then substituted into the XML structure wherever the entity is referenced.
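The substitution step described above can be observed safely with a plain JDK XML parser and a throwaway file. The sketch below is an illustration under stated assumptions: it does not involve Tika, PDFs, or XFA, it targets a temporary file that the program itself creates, and it deliberately enables permissive parser settings that should never be used on untrusted input. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of Apache Tika.
final class EntitySubstitutionDemo {

    // Writes `marker` to a temp file, parses XML whose external entity points
    // at that file, and returns the text the parser substituted for &xxe;.
    // `marker` must be plain text without XML markup characters.
    static String resolveExternalEntity(String marker) {
        Path target = null;
        try {
            target = Files.createTempFile("xxe-demo", ".txt");
            Files.write(target, marker.getBytes(StandardCharsets.UTF_8));

            String xml = "<?xml version=\"1.0\"?>"
                    + "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"" + target.toUri() + "\">]>"
                    + "<foo>&xxe;</foo>";

            // Deliberately permissive settings, for demonstration only.
            // Assumes the JDK's built-in JAXP implementation.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature("http://xml.org/sax/features/external-general-entities", true);
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "all");
            factory.setExpandEntityReferences(true);

            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // The entity reference has been replaced by the file's contents.
            return doc.getDocumentElement().getTextContent();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (SAXException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        } finally {
            if (target != null) {
                try {
                    Files.deleteIfExists(target);
                } catch (IOException ignored) {
                    // Best-effort cleanup of the temp file.
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveExternalEntity("demo-secret"));
    }
}
```

In a real attack the system identifier would point at a sensitive file rather than a temp file, and the substituted text would surface wherever the application returns or logs the parsed content.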
What Is the Remediation for CVE-2025-66516?
The following modules and version ranges are affected on all platforms [3]:
- Apache Tika core (org.apache.tika:tika-core) 1.13 through 3.2.1
- Apache Tika parsers (org.apache.tika:tika-parsers) 1.13 up to, but not including, 2.0.0
- Apache Tika PDF parser module (org.apache.tika:tika-parser-pdf-module) 2.0.0 through 3.2.1
To fully mitigate this issue, upgrade tika-core to version 3.2.2 or later along with the PDF parser module; upgrading tika-parser-pdf-module alone is insufficient.
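The affected range for tika-core listed above (1.13 through 3.2.1) can be checked programmatically, for example when auditing dependency reports. The following is a minimal sketch with hypothetical class and method names. It assumes purely numeric dotted version strings and throws a NumberFormatException on versions with qualifiers such as "-SNAPSHOT" or "-BETA".

```java
// Hypothetical audit helper, not part of Apache Tika.
final class TikaCoreVersionCheck {

    // Parses "major.minor.patch" into three integers; missing parts count as 0.
    static int[] parse(String version) {
        String[] parts = version.trim().split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < out.length && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // Numeric, component-wise comparison (so 1.9 sorts before 1.13).
    static int compare(String a, String b) {
        int[] left = parse(a);
        int[] right = parse(b);
        for (int i = 0; i < left.length; i++) {
            if (left[i] != right[i]) {
                return Integer.compare(left[i], right[i]);
            }
        }
        return 0;
    }

    // True when the tika-core version lies in the affected range 1.13 through 3.2.1.
    static boolean isVulnerableTikaCore(String version) {
        return compare(version, "1.13") >= 0 && compare(version, "3.2.1") <= 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.12", "1.13", "2.9.0", "3.2.1", "3.2.2"}) {
            System.out.println(v + " vulnerable: " + isVulnerableTikaCore(v));
        }
    }
}
```

The comparison is numeric rather than lexicographic because a plain string comparison would wrongly place 1.9 after 1.13.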
How Does Picus Help Simulate Apache Tika CVE-2025-66516 Attacks?
We also strongly suggest simulating the Apache Tika CVE-2025-66516 vulnerability to test the effectiveness of your security controls against sophisticated cyber attacks using the Picus Security Validation Platform. You can also test your defenses against other vulnerability exploitation attacks, such as regreSSHion, Citrix Bleed, and Follina, within minutes with a 14-day free trial of the Picus Platform.
Picus Threat Library includes the following threats for Apache Tika CVE-2025-66516 vulnerability exploitation attacks:
| Threat ID | Threat Name | Attack Module |
|---|---|---|
| 74403 | Apache Tika Web Attack Campaign | Web Application |
Key Takeaways
- CVE-2025-66516 is a high-severity vulnerability in the Apache Tika framework with a CVSS score of 8.4 that enables XML External Entity (XXE) attacks via malicious XFA files embedded within PDFs.
- The root cause of this flaw resides within the tika-core library rather than the PDF module alone, meaning that upgrading only the PDF parser module is insufficient to resolve the issue.
- Exploitation occurs when the XML parser processes a malicious <!DOCTYPE> declaration within an embedded XFA form, causing the system to resolve external entities and potentially disclose sensitive local files like /etc/passwd.
- Successful attacks can lead to severe consequences for systems processing untrusted files, including the exfiltration of sensitive data, Server-Side Request Forgery (SSRF), and Denial of Service (DoS).
- To fully remediate the threat, organizations must upgrade tika-core to version 3.2.2 or later, as versions 1.13 through 3.2.1 remain vulnerable on all platforms.
- The Picus Security Validation Platform allows organizations to simulate Apache Tika CVE-2025-66516 attacks using Threat ID 74403 to validate security controls against this specific vulnerability.
References
[1] “XML External Entity (XXE) Processing.” Accessed: Jan. 07, 2026. [Online]. Available: https://owasp.org/www-community/vulnerabilities/XML_External_Entity_(XXE)_Processing
[2] “CVE-2025-66516.” Accessed: Jan. 07, 2026. [Online]. Available: https://www.cve.org/CVERecord?id=CVE-2025-66516
[3] “CVE-2025-66516: Apache Tika core, Apache Tika parsers, Apache Tika PDF parser module: Update to CVE-2025-54988 to expand scope of artifacts affected.” Accessed: Jan. 07, 2026. [Online]. Available: https://lists.apache.org/thread/s5x3k93nhbkqzztp1olxotoyjpdlps9k
