  • Quanshuwang (全书网)

    import re

    import requests

    start_url = 'http://www.quanshuwang.com/list/5_{}.html'

    # Regex patterns for the listing page, a book's page, and its table of contents.
    book_html = ' <li><a target="_blank" href="(.*?)" class="l mr10">'
    # book_title = 'alt="(.*?)"'
    book_title = '<a target="_blank" title="(.*?)" '
    read_start = ' <a href="(.*?)" class="reader" title="(.*?)">开始阅读</a>'
    book_mulu = '<DIV class="dirtitone"><H2>(.*?)</H2></div>'

    # The listing has pages 1 to 128; widen to range(1, 129) to crawl them all.
    for book_page in range(1, 2):
        url = start_url.format(book_page)
        # The site serves GBK-encoded pages; skip bytes that fail to decode.
        response = requests.get(url).content.decode('gbk', errors='ignore')
        # print(response)
        re_html = re.findall(book_html, response)
        re_title = re.findall(book_title, response)
        # for title, html in zip(re_title, re_html):
        #     print(title, html)

        # Visit the first 20 books on the page.
        for book_link in re_html[:20]:
            response_book = requests.get(book_link).content.decode('gbk', errors='ignore')
            read_menu = re.findall(read_start, response_book)
            if not read_menu:
                continue  # no "开始阅读" (start reading) link on this book page
            menu_link, menu_title = read_menu[0]
            print(menu_link)
            print(menu_title)
            menu_url = requests.get(menu_link).content.decode('gbk', errors='ignore')
            print(menu_url)
            # Headings from the table of contents (目录) page.
            print(re.findall(book_mulu, menu_url))
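
The listing-page patterns in the script can be tried offline before any request is sent. The sketch below is only an illustration: `SAMPLE_LISTING` is a made-up HTML fragment shaped to fit the two patterns, the `example.com` URLs are placeholders, and `extract_books` is a hypothetical helper that does not appear in the original script. The real site's markup may differ.

```python
import re

# Patterns copied from the script's book_html and book_title.
BOOK_HTML = ' <li><a target="_blank" href="(.*?)" class="l mr10">'
BOOK_TITLE = '<a target="_blank" title="(.*?)" '

# Hypothetical listing fragment (an assumption, not real site output).
SAMPLE_LISTING = (
    '<ul>\n'
    ' <li><a target="_blank" href="http://example.com/book/1" class="l mr10">'
    '<img alt="Book One"></a>'
    '<a target="_blank" title="Book One" href="http://example.com/book/1">'
    'Book One</a></li>\n'
    ' <li><a target="_blank" href="http://example.com/book/2" class="l mr10">'
    '<img alt="Book Two"></a>'
    '<a target="_blank" title="Book Two" href="http://example.com/book/2">'
    'Book Two</a></li>\n'
    '</ul>\n'
)


def extract_books(listing_html):
    """Return (title, url) pairs found in a listing page's HTML."""
    urls = re.findall(BOOK_HTML, listing_html)
    titles = re.findall(BOOK_TITLE, listing_html)
    # zip stops at the shorter list, so an unmatched title or URL is dropped.
    return list(zip(titles, urls))


for title, url in extract_books(SAMPLE_LISTING):
    print(title, url)
```

Pairing titles with URLs by position works only while both patterns match once per book; if the page layout changes, the two lists can drift out of step, so comparing their lengths is a cheap sanity check.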
  • Original article: https://www.cnblogs.com/LQ970811/p/10508680.html