  • spider

    from lxml import etree
    import requests
    def getHtml(html):
        # Download the page at the URL `html` and parse it into an lxml element tree.
        novelcontent = requests.get(html).content
        return etree.HTML(novelcontent)

    source = getHtml("http://www.cabintu.com")

    listclassify = source.xpath('//ul[@class="sg_menu"]/li/a')
    listtype = source.xpath('//div[@class="mainleft"]/ul[@class="sg_menu"]/li[@class="section"]//ul[@class="subnav_a"]/li[@class="airline"]/a')

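    # Print the text of every top-level section link.
    # Iterating over the XPath result directly avoids the off-by-one in
    # range(0, len(...)-1), which skipped the last entry.
    for fname in source.xpath('//div[@class="mainleft"]/ul[@class="sg_menu"]/li[@class="section"]/a/text()'):
        print(fname)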



    # Print the text of every airline link nested under the sections.
    for typelist in source.xpath('//div[@class="mainleft"]/ul[@class="sg_menu"]/li[@class="section"]//ul[@class="subnav_a"]/li[@class="airline"]/a/text()'):
        print(typelist)



    # for n in range(0,)


    # ftypelist = source.xpath('//div[@class="mainleft"]/ul[@class="sg_menu"]/li[@class="section"]/ul[@class="subnav_a"]/li[@class="airline"]/a/text()')[i]
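
The unfinished loop above seems to be heading toward listing the airlines that belong to each section. A minimal sketch of one way to do that is to select each `li[@class="section"]` element first and then run relative XPath queries (starting with `.`) on it. The `SAMPLE_HTML` markup and the `group_by_section` helper are hypothetical and only mimic the class names used in the queries above; the real page structure may differ.

```python
from lxml import etree

# Hypothetical sample markup; it only mimics the class names used in the
# XPath queries of the spider and is not taken from the real site.
SAMPLE_HTML = """
<div class="mainleft">
  <ul class="sg_menu">
    <li class="section"><a>Asia</a>
      <ul class="subnav_a">
        <li class="airline"><a>Air A</a></li>
        <li class="airline"><a>Air B</a></li>
      </ul>
    </li>
    <li class="section"><a>Europe</a>
      <ul class="subnav_a">
        <li class="airline"><a>Air C</a></li>
      </ul>
    </li>
  </ul>
</div>
"""


def group_by_section(tree):
    """Return a list of (section name, [airline names]) pairs."""
    groups = []
    sections = tree.xpath(
        '//div[@class="mainleft"]/ul[@class="sg_menu"]/li[@class="section"]')
    for section in sections:
        # A path starting with "." is evaluated relative to this <li>,
        # so only the links inside the current section are matched.
        names = section.xpath('./a/text()')
        airlines = section.xpath(
            './/ul[@class="subnav_a"]/li[@class="airline"]/a/text()')
        name = names[0].strip() if names else ""
        groups.append((name, [a.strip() for a in airlines]))
    return groups


groups = group_by_section(etree.HTML(SAMPLE_HTML))
for name, airlines in groups:
    print("%s: %s" % (name, ", ".join(airlines)))
```

To try this on the live page, pass the tree returned by `getHtml(...)` to `group_by_section` instead of the sample markup.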
  • Original article: https://www.cnblogs.com/cutepython/p/6102824.html