  • Scraping campus news

    import requests
    from bs4 import BeautifulSoup

    # Fetch the campus news index page and decode it as UTF-8.
    res = requests.get('http://news.gzcc.cn/html/xiaoyuanxinwen/')
    res.encoding = 'utf-8'
    soup = BeautifulSoup(res.text, 'html.parser')

    # A news entry is an <li> that contains a .news-list-title element.
    for news in soup.select('li'):
        titles = news.select('.news-list-title')
        if titles:
            title = titles[0].text
            print(title)
            a = news.select('a')[0].attrs
            print(a['href'])

            # Follow the link to the article's detail page.
            resd = requests.get(a['href'])
            resd.encoding = 'utf-8'
            soupd = BeautifulSoup(resd.text, 'html.parser')
            cont = soupd.select('#content')
            info = soupd.select('.show-info')[0].text
            # Slice the info line at fixed character offsets; these
            # depend on the exact layout of the page's .show-info text.
            print(info[0:25])
            print(info[30:38])
            print(info[38:45])
            print(info[46:56])
            print(info[62:])
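
The fixed character offsets used to slice the `.show-info` text will break as soon as an author name or source is a different length. A possible alternative is to split the line on its field labels instead. The sketch below is only an illustration: the label list, the `parse_show_info` helper and the sample string are assumptions, not taken from the actual page, so check them against the real `.show-info` text before relying on them.

```python
import re

# Assumed field labels on the .show-info line (hypothetical; adjust to the real page):
# publish time, author, reviewer, source, photographer, click count.
LABELS = ["发布时间", "作者", "审核", "来源", "摄影", "点击"]

def parse_show_info(text):
    """Split an info line into {label: value}, using the labels as delimiters."""
    # A label followed by an ASCII or full-width colon marks the start of a field.
    pattern = re.compile("(" + "|".join(map(re.escape, LABELS)) + r")\s*[::]")
    matches = list(pattern.finditer(text))
    fields = {}
    for i, m in enumerate(matches):
        # A field's value runs up to the next label, or to the end of the line.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        fields[m.group(1)] = text[m.end():end].strip()
    return fields

# Made-up sample line, only to show the shape of the result.
sample = "发布时间:2018-04-01 11:57:00 作者:张三 审核:李四 来源:学校办公室 点击:12次"
info = parse_show_info(sample)
print(info["发布时间"])
print(info["来源"])
```

In the scraping loop this would take the place of the five slice-and-print lines, for example `parse_show_info(soupd.select('.show-info')[0].text)`.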
  • Original post: https://www.cnblogs.com/Molemole/p/8709169.html