  • Python Web Scraping Basics (3): Writing Scraped Data to a CSV File

    A scraper needs to persist the results it fetches from web pages; this post starts with saving data to CSV.

    Final implementation:

    import requests
    from bs4 import BeautifulSoup
    import csv
    
    # Fetch the Douban group homepage and return the raw HTML (bytes)
    def db():
        url = "https://www.douban.com/group/"
        headers = {
            "User-Agent": "Mozilla/5.0",
            "Cookie": ''
        }
        ret = requests.get(url, headers=headers)
        return ret.content
    
    # Parse the page and extract each post's URL and title
    def get_data(lst, html_data):
        soup = BeautifulSoup(html_data, "html.parser")
        # each <a class="title"> element carries the post URL in href and the post title in the title attribute
        for i in soup.find_all("a", class_="title"):
            lst.append([i.attrs["href"], i.attrs["title"]])
    
    # Save the URLs and titles to a CSV file
    def save_to_csv(lst):
        # newline='' avoids blank rows on Windows; utf-8 keeps Chinese titles intact
        with open('test.csv', 'w', newline='', encoding='utf-8') as f:
            f_csv = csv.writer(f)
            for data in lst:
                f_csv.writerow(data)
    
    def main():
        html = db()
        lst = []
        get_data(lst, html)
        save_to_csv(lst)
    
    if __name__ == "__main__":
        main()
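
    The script above writes bare rows with no column header. As an optional alternative (not part of the original post), csv.DictWriter can write a header row first so the file is self-describing; the sketch below also reads the file back as a quick sanity check. The field names url and title, the helper names, and the test.csv path are assumptions made for illustration.

    import csv
    
    # Hypothetical variant of save_to_csv: writes a header row, then one row per scraped post.
    # The field names "url" and "title" are assumptions, not from the original script.
    def save_to_csv_with_header(lst, path='test.csv'):
        with open(path, 'w', newline='', encoding='utf-8') as f:
            writer = csv.DictWriter(f, fieldnames=['url', 'title'])
            writer.writeheader()
            for href, title in lst:
                writer.writerow({'url': href, 'title': title})
    
    # Quick sanity check: read the CSV back and print each row
    def read_back(path='test.csv'):
        with open(path, newline='', encoding='utf-8') as f:
            for row in csv.DictReader(f):
                print(row['url'], row['title'])

    A header row costs nothing here and makes the file easier to open in Excel or load with other tools later.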
  • Original post: https://www.cnblogs.com/james-danni/p/11848494.html