  • Scraping Douban book short reviews (the first ten pages of reviews for one book)

    The original article is on my CSDN blog: https://blog.csdn.net/Thefreelittle/article/details/117574096

    ```python
    import time

    import requests
    from bs4 import BeautifulSoup

    headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36'}

    # Rating titles as they appear on the page, mapped to star counts.
    # "很差" is assumed to be the 1-star title; the original script treated
    # any title outside the other four as 1 star.
    STARS = {"力荐": "5 stars", "推荐": "4 stars", "还行": "3 stars", "较差": "2 stars", "很差": "1 star"}

    print("Douban book scraper --- The Wandering Earth.")
    num = 1  # current page number, starting at 1
    for i in range(0, 199, 20):  # start offsets 0, 20, ..., 180: the first ten pages
        time.sleep(3)  # pause between requests
        if i == 0:
            url = 'https://book.douban.com/subject/3266609/comments/?limit=20&status=P&sort=new_score'
        else:
            url = 'https://book.douban.com/subject/3266609/comments/?start=' + str(i) + '&limit=20&status=P&sort=new_score'
        resp = requests.get(url, headers=headers)
        bs = BeautifulSoup(resp.text, 'html.parser')
        grid_view = bs.find_all('li', class_="comment-item")  # each li holds one comment
        print(f"------------------Page {num} comments. Output format (likes, user name, comment time, comment text)------------------")
        cishu = 1  # index of the comment within the page
        for item in grid_view:
            piaoshu = item.find('span', class_="vote-count").text
            tzuozhe = item.find('span', class_="comment-info")
            zuozhe = tzuozhe.find('a').text
            shijian = item.find('span', class_="comment-time").text
            comment = item.find('span', class_="short").text

            # The first span inside comment-info is the rating span when the
            # user left a rating. The original detected it with
            # len(str(ping)) == 60 and reported unrated comments as 5 stars;
            # looking the title up directly avoids both problems.
            ping = tzuozhe.find('span')
            title = ping.get('title') if ping is not None else None
            pingfen = STARS.get(title, "not rated")

            print(f"Page {num}, comment {cishu} --- likes: {piaoshu} user: {zuozhe} time: {shijian} rating: {pingfen} comment: {comment}\n")
            cishu += 1
        num += 1
    ```
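The paging and rating logic in the script can be pulled out into two small helpers and checked without touching the network. This is only a sketch: `page_url` and `stars_for` are hypothetical names that are not in the original, and passing `start=0` for the first page is assumed to behave the same as omitting `start`, which is what the script does.

```python
from urllib.parse import urlencode

BASE_URL = "https://book.douban.com/subject/3266609/comments/"

# Rating titles compared against in the script, mapped to star counts.
STARS_BY_TITLE = {"力荐": 5, "推荐": 4, "还行": 3, "较差": 2}


def page_url(page, per_page=20):
    """Build the comments URL for a 0-based page index (hypothetical helper)."""
    params = {
        "start": page * per_page,  # assumption: start=0 is accepted for page 0
        "limit": per_page,
        "status": "P",
        "sort": "new_score",
    }
    return BASE_URL + "?" + urlencode(params)


def stars_for(title):
    """Map a rating title to a star count; None means the user left no rating."""
    if title is None:
        return None
    # Like the script's final else branch, any other title counts as 1 star.
    return STARS_BY_TITLE.get(title, 1)


if __name__ == "__main__":
    for page in range(10):  # the first ten pages
        print(page_url(page))
```

Building the query string with `urlencode` keeps the parameters in one place, so changing the page size or sort order does not require editing two hand-written URLs.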
  • Original URL: https://www.cnblogs.com/dazhi151/p/14911220.html