  • Python XML DOM parsing example

    import xml.dom.minidom
    
    document = """\
    <slideshow>
    <title>Demo slideshow</title>
    <slide><title>Slide title</title>
    <point>This is a demo</point>
    <point>Of a program for processing slides</point>
    </slide>
    
    <slide><title>Another demo slide</title>
    <point>It is important</point>
    <point>To have more than</point>
    <point>one slide</point>
    </slide>
    </slideshow>
    """
    
    dom = xml.dom.minidom.parseString(document)
    
    def getText(nodelist):
        # Join the data of the text nodes found directly in nodelist.
        rc = []
        for node in nodelist:
            if node.nodeType == node.TEXT_NODE:
                rc.append(node.data)
        return ''.join(rc)
    
    def handleSlideshow(slideshow):
        print("<html>")
        # The first <title> in document order is the slideshow's own title.
        handleSlideshowTitle(slideshow.getElementsByTagName("title")[0])
        slides = slideshow.getElementsByTagName("slide")
        handleToc(slides)
        handleSlides(slides)
        print("</html>")
    
    def handleSlides(slides):
        for slide in slides:
            handleSlide(slide)
    
    def handleSlide(slide):
        handleSlideTitle(slide.getElementsByTagName("title")[0])
        handlePoints(slide.getElementsByTagName("point"))
    
    def handleSlideshowTitle(title):
        print("<title>%s</title>" % getText(title.childNodes))
    
    def handleSlideTitle(title):
        print("<h2>%s</h2>" % getText(title.childNodes))
    
    def handlePoints(points):
        print("<ul>")
        for point in points:
            handlePoint(point)
        print("</ul>")
    
    def handlePoint(point):
        print("<li>%s</li>" % getText(point.childNodes))
    
    def handleToc(slides):
        # Table of contents: one <p> per slide title.
        for slide in slides:
            title = slide.getElementsByTagName("title")[0]
            print("<p>%s</p>" % getText(title.childNodes))
    
    handleSlideshow(dom)

    Output:

    <html>
    <title>Demo slideshow</title>
    <p>Slide title</p>
    <p>Another demo slide</p>
    <h2>Slide title</h2>
    <ul>
    <li>This is a demo</li>
    <li>Of a program for processing slides</li>
    </ul>
    <h2>Another demo slide</h2>
    <ul>
    <li>It is important</li>
    <li>To have more than</li>
    <li>one slide</li>
    </ul>
    </html>
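
One limitation worth noting: getText only joins text nodes that are direct children of the element, so any text wrapped in a nested tag (for example a <b> inside a <point>) would be skipped. The sketch below shows one possible way to collect nested text as well. The helper name get_text_recursive and the sample <point> markup are assumptions made for illustration and are not part of the original example.

```python
import xml.dom.minidom

def get_text_recursive(node):
    """Concatenate all text beneath node, descending into child elements."""
    parts = []
    for child in node.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            # Plain text or CDATA: take the character data as-is.
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            # Nested element: recurse so its text is included too.
            parts.append(get_text_recursive(child))
    return "".join(parts)

# Hypothetical input with inline markup inside a <point>.
doc = xml.dom.minidom.parseString("<point>This is <b>bold</b> text</point>")
nested_text = get_text_recursive(doc.documentElement)
print(nested_text)
```

With the flat getText from the example, the same <point> would yield only the text outside the <b> element. Whether the recursive form is preferable depends on whether inline markup is expected in the input.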

  • Original article: https://www.cnblogs.com/hzhida/p/2655768.html