What vulnerability does this XML parsing code have and how do you fix it?

Question

Accepted Answer

This is XML External Entity (XXE) injection. The lxml parser is called with no security configuration, so it uses its defaults, which include resolving entities. An attacker uploads XML containing a DOCTYPE declaration that defines an entity pointing to file:///etc/passwd or an internal service URL. When lxml parses the document, it reads that target and splices its contents into the XML tree, which the application then returns or logs.

The fix is to pass a parser that sets three options explicitly: resolve_entities=False (don't substitute entity references), no_network=True (don't fetch remote URLs), and load_dtd=False (don't load and parse the DTD). Together, these options shut down the usual XXE vectors in lxml.
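Since the original snippet isn't shown here, the following is only a minimal sketch of what a hardened parse might look like. The helper names parse_untrusted and xxe_payload are made up for illustration, and the "secret" is a throwaway temp file standing in for something like /etc/passwd, so the check behaves the same on any machine.

```python
import tempfile
from pathlib import Path

from lxml import etree


def parse_untrusted(xml_bytes: bytes):
    """Parse attacker-controlled XML with a hardened parser (hypothetical helper)."""
    parser = etree.XMLParser(
        resolve_entities=False,  # leave &entity; references unexpanded
        no_network=True,         # never fetch remote resources
        load_dtd=False,          # do not load and parse the DTD
    )
    return etree.fromstring(xml_bytes, parser=parser)


def xxe_payload(target_uri: str) -> bytes:
    """Build a classic XXE probe whose entity points at target_uri (hypothetical helper)."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE data [<!ENTITY xxe SYSTEM "{target_uri}">]>\n'
        "<data>&xxe;</data>"
    ).encode("utf-8")


# Stand-in for a sensitive file: a temp file with known contents.
with tempfile.TemporaryDirectory() as tmp:
    secret_file = Path(tmp) / "secret.txt"
    secret_file.write_text("TOP-SECRET")

    root = parse_untrusted(xxe_payload(secret_file.as_uri()))
    serialized = etree.tostring(root)

# The file contents should not have been spliced into the tree.
print(b"TOP-SECRET" in serialized)
```

If you build the parser once at module level and reuse it for every request, there is less chance of a later code path quietly falling back to a default-configured parser.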