python - lxml iterparse fills memory despite using clear()
I am trying to parse XML with lxml's iterparse. The first pass works correctly, but the second one fills up memory. If I remove the first iterparse, nothing changes. The XML is valid.
    from lxml import etree

    def clear_element(e):
        # drop the element's content, then delete all earlier siblings
        e.clear()
        while e.getprevious() is not None:
            del e.getparent()[0]

    def import_xml(request):
        f = 'file.xml'
        offers = etree.iterparse(f, events=('end',), tag='offer')
        for event, offer in offers:
            # processing
            # works correctly
            clear_element(offer)
        categories = etree.iterparse(f, events=('end',), tag='category')
        for event, category in categories:
            # fills memory
            clear_element(category)
XML:
    <shop>
      <categories>
        <category>name</category>
        <category>name</category>
        <category>name</category>
        ~ 1000 categories
      </categories>
      <offers>
        <offer>
          <inner_tag>data</inner_tag>
          <inner_tag>data</inner_tag>
        </offer>
        <offer>
          <inner_tag>data</inner_tag>
          <inner_tag>data</inner_tag>
        </offer>
        ~ 450000 offers
      </offers>
    </shop>
You are parsing the file twice. The first time, you handle the offer tags and drop them, which leaves only the category tags in the tree, and 1000 category tags do not take much memory. The second time, however, you clear only the category tags and leave all 450000 offer tags in place, so building the tree requires a lot of memory.
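The effect described above can be reproduced on a small in-memory document. This is only a sketch: the sample data and the helper name leftover_after_pass are made up for illustration, and clear_element is the helper from the question. It runs one tag-filtered pass and then reports how many children remain under each top-level element.

```python
import io
from lxml import etree

def clear_element(e):
    # helper from the question: clear, then delete earlier siblings
    e.clear()
    while e.getprevious() is not None:
        del e.getparent()[0]

# small stand-in for the real file: 3 categories, 5 offers
SAMPLE = (
    b"<shop><categories>"
    + b"<category>name</category>" * 3
    + b"</categories><offers>"
    + b"<offer><inner_tag>data</inner_tag></offer>" * 5
    + b"</offers></shop>"
)

def leftover_after_pass(data, tag):
    """Run one tag-filtered pass and count what is still in the tree."""
    context = etree.iterparse(io.BytesIO(data), events=("end",), tag=tag)
    for event, element in context:
        clear_element(element)
    # context.root is the root of the tree that iterparse built
    return {child.tag: len(child) for child in context.root}

print(leftover_after_pass(SAMPLE, "offer"))
print(leftover_after_pass(SAMPLE, "category"))
```

In the pass filtered on category, every offer is still attached to the tree at the end, which is what hurts when there are 450000 of them.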
In this case it is better not to pass the tag argument to iterparse. Instead, check the tag name yourself and release every element you no longer need:
    from lxml import etree

    def import_xml(request):
        f = 'file.xml'
        elements = etree.iterparse(f, events=('end',))
        for event, element in elements:
            if element.tag == 'offer':
                # handle offer
                ...
            elif element.tag == 'category':
                # handle category
                ...
            else:
                # skip everything else (the root element has no parent to remove it from)
                continue
            # release
            element.clear()
            element.getparent().remove(element)
Note: calling element.clear() on its own still leaves the cleared element in memory as part of the tree being built. Removing it from its parent is what actually takes it out of the tree. Perhaps clear() is not really necessary...
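As a rough check of the single-pass idea, here is a self-contained sketch that can be run without file.xml. The function names build_sample and count_in_one_pass are hypothetical, and the sample sizes are tiny. For the release step it reuses the question's clear_element helper rather than remove(), so the element the parser is currently on stays attached and only its earlier siblings are deleted.

```python
import io
from lxml import etree

def clear_element(e):
    # helper from the question: clear, then delete earlier siblings
    e.clear()
    while e.getprevious() is not None:
        del e.getparent()[0]

def build_sample(n_categories, n_offers):
    """Build a small document shaped like the XML in the question."""
    offer = "<offer><inner_tag>data</inner_tag><inner_tag>data</inner_tag></offer>"
    parts = ["<shop><categories>"]
    parts += ["<category>name</category>"] * n_categories
    parts.append("</categories><offers>")
    parts += [offer] * n_offers
    parts.append("</offers></shop>")
    return io.BytesIO("".join(parts).encode("utf-8"))

def count_in_one_pass(source):
    """Visit offers and categories in one pass, releasing each after use."""
    counts = {"offer": 0, "category": 0}
    context = etree.iterparse(source, events=("end",))
    for event, element in context:
        if element.tag not in counts:
            continue  # inner_tag, categories, offers, shop: leave alone
        counts[element.tag] += 1  # stand-in for the real handling
        clear_element(element)
    return counts, context.root

counts, root = count_in_one_pass(build_sample(3, 5))
print(counts)
print({child.tag: len(child) for child in root})
```

With this variant at most one cleared offer and one cleared category remain attached at any time, so the tree stays small regardless of how many elements the file contains.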