When parsing XML with external entity resolution enabled, a Ctrl+C (or any KeyboardInterrupt / SystemExit raised inside a content handler) that arrives while an external entity is being parsed is silently swallowed and converted into a generic SAXParseException.
The root cause is a bare except: in ExpatParser.external_entity_ref() at Lib/xml/sax/expatreader.py line 427:
    try:
        xmlreader.IncrementalParser.parse(self, source)
    except:
        return 0  # FIXME: save error info here?
The bare except: catches everything — KeyboardInterrupt, SystemExit, MemoryError — and returns 0 to expat, which then raises a generic "error in processing external entity reference". The inline FIXME notes the error info is lost, but the broader issue is that BaseException subclasses like KeyboardInterrupt should never be caught here at all.
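As a quick check of the hierarchy involved, the snippet below uses only Python's built-in exception classes (nothing specific to xml.sax) to show which of the three exceptions named above derive from Exception. The dictionary name is just an illustrative choice.

```python
# Which of the exceptions named above derive from Exception?
# Only those would still be caught by `except Exception:`.
caught_by_except_exception = {
    exc.__name__: issubclass(exc, Exception)
    for exc in (KeyboardInterrupt, SystemExit, MemoryError)
}

for name, caught in caught_by_except_exception.items():
    print(f"{name}: caught by 'except Exception' = {caught}")
```

KeyboardInterrupt and SystemExit derive directly from BaseException, so only a bare except: (or except BaseException:) intercepts them. MemoryError is an Exception subclass, so a narrower except Exception: would still catch it.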
There's also a secondary issue: the _entity_stack cleanup (lines 430–431) only runs on success, so the parser's internal state is corrupted after any error during entity parsing.
Reproduction
    import xml.sax
    from xml.sax.handler import feature_external_ges
    from xml.sax import ContentHandler
    from xml.sax.xmlreader import InputSource
    from io import BytesIO

    class KBHandler(ContentHandler):
        def startElement(self, name, attrs):
            if name == 'entity':
                raise KeyboardInterrupt('simulated Ctrl+C')

    class Resolver:
        def resolveEntity(self, pubId, sysId):
            src = InputSource()
            src.setByteStream(BytesIO(b'<entity/>'))
            return src

    parser = xml.sax.make_parser()
    parser.setFeature(feature_external_ges, True)
    parser.setEntityResolver(Resolver())
    parser.setContentHandler(KBHandler())
    try:
        parser.feed('<!DOCTYPE d [<!ENTITY e SYSTEM "x">]><d>&e;</d>')
        parser.close()
    except KeyboardInterrupt:
        print('GOOD: KeyboardInterrupt propagated')
    except xml.sax.SAXParseException as e:
        print(f'BUG: KeyboardInterrupt became SAXParseException: {e}')
Output: BUG: KeyboardInterrupt became SAXParseException: <unknown>:1:6: error in processing external entity reference
Suggested fix
Change except: to except Exception: and move the _entity_stack cleanup into a finally block. I checked pyexpat.c — the C layer handles Python exception propagation correctly through call_with_frame() / XML_StopParser() / get_parse_result(), so letting KeyboardInterrupt pass through is safe.
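The shape of that change can be illustrated with a small standalone model. This is only a sketch of the pattern, not the actual expatreader.py code or a proposed patch: ToyParser, its state and entity_stack attributes, and both parse_entity_* methods are hypothetical stand-ins for the real parser and its _entity_stack.

```python
class ToyParser:
    """Hypothetical stand-in for a parser that keeps a stack of saved states."""

    def __init__(self):
        self.state = "document"
        self.entity_stack = []

    def _enter_entity(self):
        # Save the current state before switching to the entity.
        self.entity_stack.append(self.state)
        self.state = "entity"

    def parse_entity_buggy(self, work):
        self._enter_entity()
        try:
            work()
        except:  # also swallows KeyboardInterrupt and SystemExit
            return 0  # the cleanup below is skipped
        self.state = self.entity_stack.pop()
        return 1

    def parse_entity_fixed(self, work):
        self._enter_entity()
        try:
            work()
        except Exception:  # KeyboardInterrupt / SystemExit pass through
            return 0
        finally:
            # Runs on success, on a caught error, and on propagation.
            self.state = self.entity_stack.pop()
        return 1


def interrupt():
    raise KeyboardInterrupt("simulated Ctrl+C")


def bad_input():
    raise ValueError("simulated parse error")


buggy = ToyParser()
buggy_result = buggy.parse_entity_buggy(interrupt)
print(buggy_result, buggy.state, buggy.entity_stack)  # 0 entity ['document']

fixed = ToyParser()
try:
    fixed.parse_entity_fixed(interrupt)
    propagated = False
except KeyboardInterrupt:
    propagated = True
print(propagated, fixed.state, fixed.entity_stack)  # True document []

fixed_result = fixed.parse_entity_fixed(bad_input)
print(fixed_result, fixed.state, fixed.entity_stack)  # 0 document []
```

In the buggy variant the interrupt is turned into a return value and the saved state stays on the stack. In the fixed variant the interrupt reaches the caller, and the stack is restored both for the interrupt and for the ordinary error.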
The other bare except: in the same file (line 104 in parse()) correctly re-raises after cleanup, so this is not a deliberate pattern.
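For contrast, a bare except: that cleans up and then re-raises does not hide anything from the caller. The sketch below shows that pattern in isolation; run_with_cleanup and the events list are hypothetical names for illustration, not code from expatreader.py.

```python
def run_with_cleanup(work, cleanup):
    """Hypothetical helper: run work(); on any exception, clean up and re-raise."""
    try:
        return work()
    except:  # a bare except is harmless here because of the re-raise below
        cleanup()
        raise


events = []


def interrupted():
    raise KeyboardInterrupt("simulated Ctrl+C")


try:
    run_with_cleanup(interrupted, lambda: events.append("cleaned up"))
except KeyboardInterrupt:
    events.append("KeyboardInterrupt propagated")

print(events)  # ['cleaned up', 'KeyboardInterrupt propagated']
```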
Linked PRs