zoukankan      html  css  js  c++  java
  • python xml DOM 解析例子

    import xml.dom.minidom
    
    document = """\
    <slideshow>
    <title>Demo slideshow</title>
    <slide><title>Slide title</title>
    <point>This is a demo</point>
    <point>Of a program for processing slides</point>
    </slide>
    
    <slide><title>Another demo slide</title>
    <point>It is important</point>
    <point>To have more than</point>
    <point>one slide</point>
    </slide>
    </slideshow>
    """
    
    dom = xml.dom.minidom.parseString(document)
    
    def getText(nodelist):
        rc = []
        for node in nodelist:
            if node.nodeType == node.TEXT_NODE:
                rc.append(node.data)
        return ''.join(rc)
    
    def handleSlideshow(slideshow):
        print "<html>"
        handleSlideshowTitle(slideshow.getElementsByTagName("title")[0])
        slides = slideshow.getElementsByTagName("slide")
        handleToc(slides)
        handleSlides(slides)
        print "</html>"
    
    def handleSlides(slides):
        for slide in slides:
            handleSlide(slide)
    
    def handleSlide(slide):
        handleSlideTitle(slide.getElementsByTagName("title")[0])
        handlePoints(slide.getElementsByTagName("point"))
    
    def handleSlideshowTitle(title):
        print "<title>%s</title>" % getText(title.childNodes)
    
    def handleSlideTitle(title):
        print "<h2>%s</h2>" % getText(title.childNodes)
    
    def handlePoints(points):
        print "<ul>"
        for point in points:
            handlePoint(point)
        print "</ul>"
    
    def handlePoint(point):
        print "<li>%s</li>" % getText(point.childNodes)
    
    def handleToc(slides):
        for slide in slides:
            title = slide.getElementsByTagName("title")[0]
            print "<p>%s</p>" % getText(title.childNodes)
    
    handleSlideshow(dom)

    运行结果:

    <html>
    <title>Demo slideshow</title>
    <p>Slide title</p>
    <p>Another demo slide</p>
    <h2>Slide title</h2>
    <ul>
    <li>This is a demo</li>
    <li>Of a program for processing slides</li>
    </ul>
    <h2>Another demo slide</h2>
    <ul>
    <li>It is important</li>
    <li>To have more than</li>
    <li>one slide</li>
    </ul>
    </html>

  • 相关阅读:
    POJ3345 Bribing FIPA(树形DP)
    POJ3294 Life Forms(二分+后缀数组)
    ZOJ1027 Travelling Fee(DP+SPFA)
    POJ2955 Brackets(区间DP)
    POJ1655 Balancing Act(树的重心)
    POJ2774 Long Long Message(后缀数组)
    URAL1297 Palindrome(后缀数组)
    SPOJ705 SUBST1
    POJ3261 Milk Patterns(二分+后缀数组)
    POJ1743 Musical Theme(二分+后缀数组)
  • 原文地址:https://www.cnblogs.com/hzhida/p/2655768.html
Copyright © 2011-2022 走看看