  • Scrape website content and save it to Excel

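    import re
    import time

    import requests
    import xlwt

    # Captures three fields per entry: the title that follows a bracketed
    # number, the file size (the page labels it '文件大小'), and a bracketed value.
    obj = re.compile(
        r'<span style="font-weight:400;color:#333333;font-size:14px;cursor:pointer;" >\[\d*\](.*?)</span>.*?文件大小:(.*?);.*?<span style="font-weight:400;cursor:pointer;" >\[(.*?)\]</span></a>',
        re.S)

    workbook = xlwt.Workbook(encoding='utf-8')
    worksheet = workbook.add_sheet('update_resource')
    k = 0  # next free row in the worksheet
    for i in range(320):
        time.sleep(3)  # pause between requests to go easy on the server
        url = f'http://gaoqing.3zitie.cn/prolist_new.asp?bid=0&lid=0&page={i}&searchkey='
        res = requests.get(url)
        ret = res.content.decode('utf-8')
        # Flags belong in re.compile(); a second argument to a compiled
        # pattern's findall() is a start position, not a flag.
        rem = obj.findall(ret)
        for s in rem:
            worksheet.write(k, 0, label=s[0])
            worksheet.write(k, 1, label=s[1])
            worksheet.write(k, 2, label=s[2])
            k += 1
    # xlwt writes the legacy .xls format, so give the file that extension.
    workbook.save('new_excel_814.xls')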
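
The extraction step can be checked offline before running the full crawl. The following is a minimal sketch, assuming the listing page's markup looks like the hypothetical `SAMPLE` string below; `SAMPLE` and the helper name `parse_rows` are made up for illustration and are not taken from the real site.

```python
import re

# Same pattern idea as in the scraper: title after a bracketed number,
# the file size after the page's '文件大小' label, then a bracketed value.
PATTERN = re.compile(
    r'<span style="font-weight:400;color:#333333;font-size:14px;cursor:pointer;" >'
    r'\[\d*\](.*?)</span>'
    r'.*?文件大小:(.*?);'
    r'.*?<span style="font-weight:400;cursor:pointer;" >\[(.*?)\]</span></a>',
    re.S,
)

# Hypothetical markup, assumed for this sketch only.
SAMPLE = (
    '<a><span style="font-weight:400;color:#333333;font-size:14px;cursor:pointer;" >'
    '[123]Example title</span>\n'
    '<p>文件大小:12.5MB; other text</p>\n'
    '<span style="font-weight:400;cursor:pointer;" >[2020-08-14]</span></a>'
)


def parse_rows(html):
    """Return a list of (title, size, bracketed_value) tuples found in html."""
    return PATTERN.findall(html)


if __name__ == "__main__":
    for row in parse_rows(SAMPLE):
        print(row)
```

If the real page differs from this assumed markup, only the pattern needs adjusting; the worksheet-writing loop stays the same because it only depends on getting three-element tuples back.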
  • Original post: https://www.cnblogs.com/diracy/p/13502429.html