Parsing untrusted XML files with a weakly configured XML parser may lead to attacks such as XML External Entity (XXE), Billion Laughs, Quadratic Blowup and DTD retrieval. These attacks abuse entity and DTD processing, most notably external entity references, to read arbitrary files on a system, cause denial of service, or perform server-side request forgery. Even when the result of parsing is not returned to the user, out-of-band data retrieval techniques may allow attackers to steal sensitive data, and denial of service remains possible.

Use defusedxml, a Python package designed to block potentially malicious XML operations, to parse untrusted XML.

The following example calls xml.etree.ElementTree.fromstring on untrusted data with a parser (lxml.etree.XMLParser) that is not safely configured, and is therefore inherently unsafe.
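A minimal sketch of such a call is shown below, assuming lxml is installed. The function name bad and the parameter xml_content mirror the names used in this description; presenting it as a plain function rather than a web route handler is an assumption made to keep the sketch self-contained. Whether external entities are actually expanded may depend on the lxml version and its defaults.

```python
import xml.etree.ElementTree


def bad(xml_content):
    """Unsafe pattern: untrusted input meets a default-configured lxml parser."""
    # Third-party import kept inside the function so the sketch can be
    # loaded even where lxml is not installed.
    import lxml.etree

    # No hardening options are passed, so the parser's defaults decide
    # how DTDs and entities in xml_content are handled.
    parser = lxml.etree.XMLParser()

    # The standard library entry point feeds the untrusted text to the
    # supplied lxml parser and returns the resulting root element.
    parsed_xml = xml.etree.ElementTree.fromstring(xml_content, parser=parser)

    # In a web application this value would be sent back to the client.
    return parsed_xml.text
```") == "hello"
</test>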

If an input (xml_content) like the following XML content is sent to /bad, the response would contain the contents of /etc/passwd.

  • Python 3 XML Vulnerabilities.
  • Python 2 XML Vulnerabilities.
  • Python XML Parsing.
  • OWASP vulnerability description: XML External Entity (XXE) Processing.
  • OWASP guidance on parsing XML files: XXE Prevention Cheat Sheet.
  • Paper by Timothy Morgan: XML Schema, DTD, and Entity Attacks.
  • Out-of-band data retrieval: Timur Yunusov & Alexey Osipov, Black Hat EU 2013: XML Out-Of-Band Data Retrieval.
  • Denial of service attack (Billion Laughs): Billion Laughs.