  • Scraping Douban book short reviews (the first ten pages of reviews for one book)

    The original article is on my CSDN blog: https://blog.csdn.net/Thefreelittle/article/details/117574096

    ```python
    import time

    import requests
    from bs4 import BeautifulSoup

    headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36'}
    # Rating labels read from the span's title attribute, mapped to display text.
    star_labels = {"力荐": "5 stars", "推荐": "4 stars", "还行": "3 stars", "较差": "2 stars", "很差": "1 star"}

    print("Douban book scraper: 流浪地球 (The Wandering Earth)")
    num = 1  # current page number
    for i in range(0, 199, 20):  # start offsets 0, 20, ..., 180: ten pages
        time.sleep(3)  # pause between requests to go easy on the server
        if i == 0:
            url = 'https://book.douban.com/subject/3266609/comments/?limit=20&status=P&sort=new_score'
        else:
            url = 'https://book.douban.com/subject/3266609/comments/?start=' + str(i) + '&limit=20&status=P&sort=new_score'
        resp = requests.get(url, headers=headers)
        bs = BeautifulSoup(resp.text, 'html.parser')
        grid_view = bs.find_all('li', class_="comment-item")  # each li holds one review
        print("------------------ Scraping page " + str(num) + ". Output fields: votes, user name, review time, rating, review text ------------------")
        cishu = 1  # index of the review within the page
        for item in grid_view:
            piaoshu = item.find('span', class_="vote-count").text  # vote count
            tzuozhe = item.find('span', class_="comment-info")  # holds user name, rating, time
            zuozhe = tzuozhe.find('a').text  # user name
            shijian = item.find('span', class_="comment-time").text  # review time
            comment = item.find('span', class_="short").text  # review text

            # The first span inside comment-info is the rating span only when the
            # user left a rating; otherwise find() returns some other span (or None)
            # whose title is not a rating label, so the review counts as unrated.
            ping = tzuozhe.find('span')
            title = ping.get('title') if ping is not None else None
            pingfen = star_labels.get(title, "not rated")

            print("Page " + str(num) + ", review " + str(cishu) + " --- votes: " + str(piaoshu) + " user: " + str(zuozhe) + " time: " + str(shijian) + " rating: " + pingfen + " review: " + str(comment) + "\n")
            cishu += 1
        num += 1
    ```
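
The two parts of the script that need no network access, building the page URLs and turning a rating label into a star count, can be pulled out into small functions and checked offline. The following is only a sketch under a few assumptions: the names `page_url`, `stars_from_title`, `BASE_URL`, `PAGE_SIZE` and `STARS_BY_TITLE` are hypothetical and not part of the original script; the first page is requested with `start=0` instead of omitting `start`, which is assumed to be equivalent; and 很差 is assumed to be the one-star label, which the original script only handles as its fallback case.

```python
from urllib.parse import urlencode

# Hypothetical constants; the subject id and query parameters come from the script above.
BASE_URL = "https://book.douban.com/subject/3266609/comments/"
PAGE_SIZE = 20

# Rating labels as they appear in the span's title attribute, mapped to stars.
# "很差" for one star is an assumption; the original script treats it as the fallback.
STARS_BY_TITLE = {"力荐": 5, "推荐": 4, "还行": 3, "较差": 2, "很差": 1}


def page_url(page, page_size=PAGE_SIZE):
    """Return the comments URL for a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    params = {
        "start": (page - 1) * page_size,  # offset of the first review on the page
        "limit": page_size,
        "status": "P",
        "sort": "new_score",
    }
    return BASE_URL + "?" + urlencode(params)


def stars_from_title(title):
    """Return the star count for a rating label, or None if unrated or unknown."""
    return STARS_BY_TITLE.get(title)


if __name__ == "__main__":
    # Print the ten page URLs the scraper would visit.
    for page in range(1, 11):
        print(page_url(page))
```

Returning `None` for an unrated review, rather than a default star count, keeps missing ratings distinguishable from real ones when the results are later aggregated.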
  • Original article: https://www.cnblogs.com/dazhi151/p/14911220.html